Re: Orphaned Inode Problem

2024-02-22 Thread Jörg-Volker Peetz

Henning Follmann wrote on 22/02/2024 08:43:


You didn't answer where you read that. I would be interested in that. I do
not claim to be an expert on this and I would like to understand it better.

-H


Admittedly, I didn't note that down. It was a discussion like the one in this thread:

https://forums.linuxmint.com/viewtopic.php?t=349099

Regards,
Jörg.



Re: Orphaned Inode Problem

2024-02-21 Thread Henning Follmann
On Wed, Feb 21, 2024 at 05:15:55PM +0100, Jörg-Volker Peetz wrote:
> Henning Follmann wrote on 21/02/2024 14:16:
> > On Wed, Feb 21, 2024 at 12:00:17PM +0100, Jörg-Volker Peetz wrote:
> 
> > > Somewhere I read that, for maintenance of an SSD, all its cells should be read
> > > from time to time like this
> > > 
> > > sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress
> > 
> > Where did you read that? That seems like a huge waste of time.
> > 
> As far as I remember, the idea behind this suggestion is to help the SSD
> firmware detect bad blocks or cells early on and to mask them out. Of
> course, a good firmware with its wear leveling algorithm
> (https://en.wikipedia.org/wiki/Wear_leveling) should do this by itself.
> 
You didn't answer where you read that. I would be interested in that. I do
not claim to be an expert on this and I would like to understand it better.

-H


-- 
Henning Follmann   | hfollm...@itcfollmann.com



Re: Orphaned Inode Problem

2024-02-21 Thread tomas
On Wed, Feb 21, 2024 at 05:15:55PM +0100, Jörg-Volker Peetz wrote:
> Henning Follmann wrote on 21/02/2024 14:16:
> > On Wed, Feb 21, 2024 at 12:00:17PM +0100, Jörg-Volker Peetz wrote:
> 
> > > Somewhere I read that, for maintenance of an SSD, all its cells should be read
> > > from time to time like this
> > > 
> > > sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress
> > 
> > Where did you read that? That seems like a huge waste of time.
> > 
> As far as I remember, the idea behind this suggestion is to help the SSD
> firmware detect bad blocks or cells early on and to mask them out. Of
> course, a good firmware with its wear leveling algorithm
> (https://en.wikipedia.org/wiki/Wear_leveling) should do this by itself.

Actually... you only have to read regularly those blocks which are
known to have stuff in them. The file system should know which those
are, that's its job.

And then, this is a backup, at least in my book, and yes, you should
do that regularly, even on spinning rust ;-)

Cheers
-- 
t




Re: Orphaned Inode Problem

2024-02-21 Thread gene heskett

On 2/21/24 08:17, Henning Follmann wrote:

On Wed, Feb 21, 2024 at 12:00:17PM +0100, Jörg-Volker Peetz wrote:

Hi,

did you take a look at the smartctl output?

Somewhere I read that, for maintenance of an SSD, all its cells should be read
from time to time like this

sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress


Where did you read that? That seems like a huge waste of time.



where device is something like sda or nvme0n1, especially if it was switched
off for a longer period. At least, it shows the current read performance of
the device.
An SSD should regularly be trimmed, if in use. This is to assist its wear
leveling process.


Whether you should manually kick off trim is a hotly debated issue.
It mainly depends on how the drive is used.
In most cases, however, do not alter how the system was set up by
your friendly installer.


That actually might be a good idea, as it will force a read of 
everything, which will trigger a fix-up for any cell that does not read 
right on the first try.


OTOH, my Pis only get powered down for maintenance, so they've got lots 
of spare time to do their thing when you are not looking, and I've not 
lost a Pi µSD card in quite a few years. So even though the system, with 
all the trash collected over a decade, might amount to 10 GB, they have 
64 GB to play with. I must be doing something right.

What's your opinion?

How much time do you have :)


-H





Cheers, Gene Heskett, CET.
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis



Re: Orphaned Inode Problem

2024-02-21 Thread Gremlin

On 2/21/24 13:14, David Christensen wrote:

On 2/21/24 03:00, Jörg-Volker Peetz wrote:

Hi,

did you take a look at the smartctl output?

Somewhere I read that, for maintenance of an SSD, all its cells should be 
read from time to time like this


sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress

where device is something like sda or nvme0n1, especially if it was 
switched off for a longer period. At least, it shows the current read 
performance of the device.
An SSD should regularly be trimmed, if in use. This is to assist its 
wear leveling process.


What's your opinion?

Regards,
Jörg.



I prefer to run a SMART long test periodically.  This should read every 
cell, including those that are reserved and not visible to the OS.



AIUI, so long as the SSD can maintain a supply of erased cells via 
manufacturer over-provisioning, trim is not required to maintain 
performance.  If you have a workload that does a lot of writes in a short 
period of time and exhausts the manufacturer over-provisioning, leaving 
free space on the SSD and trimming can be a work-around.



If you are using strong encryption, not trimming will leave ciphertext on 
disk that creates more work for an attacker.  If you are using weak 
encryption, not trimming will leave ciphertext on disk that an attacker 
can recover.



For imaging/cloning, blocks freed by the OS and then trimmed read back 
as zeros on most SSDs, which facilitates compression of the image file.



I have a SOHO network with about two dozen disks.  Running smartctl by 
hand is a PITA.  Running fstrim(8) by hand is easy enough.  I try to do 
both once a month.  I need to figure out smartd(8).



David




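#!/usr/bin/dash -
# Print SMART data for every whole SATA/SCSI disk (sda, sdb, ...).
# lsblk: -d skips partitions, -n drops the header, -o NAME prints names only.
for i in $(lsblk -dno NAME);do
case $i in
sd*) sudo smartctl -a /dev/"$i" ;;
*)   : ;;   # ignore everything else (nvme, loop, sr, ...)
esac
done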




Re: Orphaned Inode Problem

2024-02-21 Thread David Christensen

On 2/21/24 03:00, Jörg-Volker Peetz wrote:

Hi,

did you take a look at the smartctl output?

Somewhere I read that, for maintenance of an SSD, all its cells should be 
read from time to time like this


sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress

where device is something like sda or nvme0n1, especially if it was 
switched off for a longer period. At least, it shows the current read 
performance of the device.
An SSD should regularly be trimmed, if in use. This is to assist its 
wear leveling process.


What's your opinion?

Regards,
Jörg.



I prefer to run a SMART long test periodically.  This should read every 
cell, including those that are reserved and not visible to the OS.



AIUI, so long as the SSD can maintain a supply of erased cells via 
manufacturer over-provisioning, trim is not required to maintain 
performance.  If you have a workload that does a lot of writes in a short 
period of time and exhausts the manufacturer over-provisioning, leaving 
free space on the SSD and trimming can be a work-around.



If you are using strong encryption, not trimming will leave ciphertext on 
disk that creates more work for an attacker.  If you are using weak 
encryption, not trimming will leave ciphertext on disk that an attacker 
can recover.



For imaging/cloning, blocks freed by the OS and then trimmed read back 
as zeros on most SSDs, which facilitates compression of the image file.



I have a SOHO network with about two dozen disks.  Running smartctl by 
hand is a PITA.  Running fstrim(8) by hand is easy enough.  I try to do 
both once a month.  I need to figure out smartd(8).
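
A minimal sketch of what such a monthly routine might look like, assuming 
smartctl (smartmontools) and fstrim (util-linux) are installed and the 
script runs as root.  The function names and the RUN and DISKS variables 
are made up for illustration; they are not part of either tool.

```shell
#!/bin/sh
# Sketch only: queue a SMART extended self-test on each whole disk, then
# trim every mounted filesystem that supports discard.

# Set RUN=echo to print the commands instead of executing them.
RUN=${RUN:-}

# list_disks: print the names of whole disks, one per line.
# DISKS (hypothetical override) replaces autodetection when set.
list_disks() {
    if [ -n "${DISKS:-}" ]; then
        printf '%s\n' $DISKS
    else
        lsblk -dno NAME,TYPE | awk '$2 == "disk" { print $1 }'
    fi
}

monthly_maintenance() {
    for d in $(list_disks); do
        # -t long only queues the test; the drive runs it in the background.
        $RUN smartctl -t long "/dev/$d"
    done
    # -a: all mounted filesystems that support discard, -v: report bytes trimmed.
    $RUN fstrim -av
}
```

Calling monthly_maintenance with RUN=echo set first only prints the 
commands.  The self-test results have to be read later, for example with 
smartctl -l selftest /dev/sda, because the test runs inside the drive 
after the command returns.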



David



Re: Orphaned Inode Problem

2024-02-21 Thread Jörg-Volker Peetz

Henning Follmann wrote on 21/02/2024 14:16:

On Wed, Feb 21, 2024 at 12:00:17PM +0100, Jörg-Volker Peetz wrote:



Somewhere I read that, for maintenance of an SSD, all its cells should be read
from time to time like this

sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress


Where did you read that? That seems like a huge waste of time.

As far as I remember, the idea behind this suggestion is to help the SSD 
firmware detect bad blocks or cells early on and to mask them out. Of course, a 
good firmware with its wear leveling algorithm 
(https://en.wikipedia.org/wiki/Wear_leveling) should do this by itself.


Regards,
Jörg.




Re: Orphaned Inode Problem

2024-02-21 Thread Henning Follmann
On Wed, Feb 21, 2024 at 12:00:17PM +0100, Jörg-Volker Peetz wrote:
> Hi,
> 
> did you take a look at the smartctl output?
> 
> Somewhere I read that, for maintenance of an SSD, all its cells should be read
> from time to time like this
> 
> sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress

Where did you read that? That seems like a huge waste of time.

> 
> where device is something like sda or nvme0n1, especially if it was switched
> off for a longer period. At least, it shows the current read performance of
> the device.
> An SSD should regularly be trimmed, if in use. This is to assist its wear
> leveling process.
> 
Whether you should manually kick off trim is a hotly debated issue.
It mainly depends on how the drive is used.
In most cases, however, do not alter how the system was set up by
your friendly installer.


> What's your opinion?
How much time do you have :)


-H



-- 
Henning Follmann   | hfollm...@itcfollmann.com



Re: Orphaned Inode Problem

2024-02-21 Thread Jörg-Volker Peetz

Hi,

did you take a look at the smartctl output?

Somewhere I read that, for maintenance of an SSD, all its cells should be read from 
time to time like this


sudo dd if=/dev/DEVICE of=/dev/null bs=8M status=progress

where device is something like sda or nvme0n1, especially if it was switched off 
for a longer period. At least, it shows the current read performance of the device.
An SSD should regularly be trimmed, if in use. This is to assist its wear 
leveling process.


What's your opinion?

Regards,
Jörg.




Re: red SATA cables "notoriously bad"? (Was Re: Orphaned Inode Problem)

2024-02-20 Thread Eike Lantzsch ZP5CGE / KY4PZ
On Tuesday, 20 February 2024 06:58:31 -03 Eike Lantzsch ZP5CGE / KY4PZ
wrote:
> On Monday, 19 February 2024 21:48:52 -03 Andy Smith wrote:
> > Hi,
> >
> > On Mon, Feb 19, 2024 at 04:12:44PM -0300, Eike Lantzsch ZP5CGE / KY4PZ wrote:
> > > The notorious red SATA cables - I threw them out long ago. The red
> > > pigment eats up the fine copper threads, changing the impedance of
> > > the cable and eventually making false contact before failing
> > > completely.
> >
> > I've never heard of this. I did a bit of searching around and all I
> > can find is assertions that cable colour doesn't matter for SATA. I
> > can't seem to find anything about red pigment damaging the copper.
> > Have you got a reference so I can learn more?
> >
> > Thanks,
> > Andy
>
> Experience ...
> "notoriously bad" on my work bench.
> Audio cables, SATA cables, even red cables of 1.5 mm² upwards. The
> corrosion can be seen, although it takes decades for the thicker cables
> to deteriorate.
> It very much depends on where the cables have been manufactured.
> I never had problems with European-made or US-made telephone cables with
> wires with red sheaths. But copper cable manufacturing was
> outsourced to Asia (and Argentina - Pirelli, but those are good) many
> decades ago.

If you open the sheath of a red SATA cable, you will see that at least
three wires have no extra sheath but are embedded directly in the red
plastic: one bare wire in the middle and one each to the right and left.
Four wires are shielded, so I guess that those are not affected.

--
Eike Lantzsch KY4PZ / ZP5CGE





Re: red SATA cables "notoriously bad"? (Was Re: Orphaned Inode Problem)

2024-02-20 Thread Eike Lantzsch ZP5CGE / KY4PZ
On Monday, 19 February 2024 21:48:52 -03 Andy Smith wrote:
> Hi,
>
> On Mon, Feb 19, 2024 at 04:12:44PM -0300, Eike Lantzsch ZP5CGE / KY4PZ wrote:
> > The notorious red SATA cables - I threw them out long ago. The red
> > pigment eats up the fine copper threads, changing the impedance of
> > the cable and eventually making false contact before failing
> > completely.
> I've never heard of this. I did a bit of searching around and all I
> can find is assertions that cable colour doesn't matter for SATA. I
> can't seem to find anything about red pigment damaging the copper.
> Have you got a reference so I can learn more?
>
> Thanks,
> Andy

Experience ...
"notoriously bad" on my work bench.
Audio cables, SATA cables, even red cables of 1.5 mm² upwards. The
corrosion can be seen, although it takes decades for the thicker cables
to deteriorate.
It very much depends on where the cables have been manufactured.
I never had problems with European-made or US-made telephone cables with
wires with red sheaths. But copper cable manufacturing was
outsourced to Asia (and Argentina - Pirelli, but those are good) many
decades ago.

--
Eike Lantzsch KY4PZ / ZP5CGE





Re: red SATA cables "notoriously bad"? (Was Re: Orphaned Inode Problem)

2024-02-19 Thread jeremy ardley



On 20/2/24 08:48, Andy Smith wrote:

Hi,

On Mon, Feb 19, 2024 at 04:12:44PM -0300, Eike Lantzsch ZP5CGE / KY4PZ wrote:

The notorious red SATA cables - I threw them out long ago. The red
pigment eats up the fine copper threads, changing the impedance of the
cable and eventually making false contact before failing completely.

I've never heard of this. I did a bit of searching around and all I
can find is assertions that cable colour doesn't matter for SATA. I
can't seem to find anything about red pigment damaging the copper.
Have you got a reference so I can learn more?



I find it unlikely that the color of the outer sheath of a cable affects 
the conductors as they have their own individual sheaths usually of a 
different material to the sheath.


It's possible that some manufacturer made cables with faulty individual 
insulation and their brand used a red outer sheath. In that case the 
color of the sheath correlates with faulty cables but is not the cause 
of the faulty cables.




Re: red SATA cables "notoriously bad"? (Was Re: Orphaned Inode Problem)

2024-02-19 Thread gene heskett

On 2/19/24 19:49, Andy Smith wrote:

Hi,

On Mon, Feb 19, 2024 at 04:12:44PM -0300, Eike Lantzsch ZP5CGE / KY4PZ wrote:

The notorious red SATA cables - I threw them out long ago. The red
pigment eats up the fine copper threads, changing the impedance of the
cable and eventually making false contact before failing completely.


I've never heard of this. I did a bit of searching around and all I
can find is assertions that cable colour doesn't matter for SATA. I
can't seem to find anything about red pigment damaging the copper.
Have you got a reference so I can learn more?

Thanks,
Andy


Andy, I am the source of that red cable story. Actually it is not 
technically a red but a magenta that fluoresces reddish to get that 
brightness.  My history with early failure of cables that used that dye 
to color the insulation goes back to the 1970s, when the majority of the 
CB radios sold were from Japan, not China.  Microphone cables that 
included a push-to-talk wire started failing quite rapidly; the hot red 
wire was used for that about 99% of the time.  Open up the plug or the 
microphone, and the red wire had come unsoldered or broken off. 
Stripping it back to good wire wasn't possible: there was no good copper 
left anyplace in the cable.  Cut off an inch of it where there should 
have been copper, grab it by the end with suture clamps, thump it with a 
pencil over white copy paper, and the copper shook out as a reddish, 
face-powder-fine dust.  I assume the copper had been turned into copper 
oxide.

It took every good tech in the country returning mike cables to the 
makers as defective before they got the message that that dye was 
poison.  It took about 9 months before we could order replacement cable, 
specifying that it would be returned for credit if we found any "hot" 
red in the cables they were selling us.  The shortage at the time forced 
them to ship whatever they had, I guess.  If you go to Loews or any 
electrical supply where they have to sell NEC-approved cabling, you will 
NOT see that red on any wire on the shelf or in the rack.

Then, about the time SATA came out, they found a new market for that 
plastic dye, and sure as heck, we had cabling problems out the yang in 
about 3 years.  If you have that hot red wire anyplace in your computer, 
it will fail; order more cables.  Tan, black, yellow, but not hot red. 
And sleep better knowing that time bomb has gone out with the trash.


Cheers, Gene Heskett, CET.
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
 - Louis D. Brandeis



red SATA cables "notoriously bad"? (Was Re: Orphaned Inode Problem)

2024-02-19 Thread Andy Smith
Hi,

On Mon, Feb 19, 2024 at 04:12:44PM -0300, Eike Lantzsch ZP5CGE / KY4PZ wrote:
> The notorious red SATA cables - I threw them out long ago. The red
> pigment eats up the fine copper threads, changing the impedance of the
> cable and eventually making false contact before failing completely.

I've never heard of this. I did a bit of searching around and all I
can find is assertions that cable colour doesn't matter for SATA. I
can't seem to find anything about red pigment damaging the copper.
Have you got a reference so I can learn more?

Thanks,
Andy

-- 
https://bitfolk.com/ -- No-nonsense VPS hosting



Re: Orphaned Inode Problem

2024-02-19 Thread Eike Lantzsch ZP5CGE / KY4PZ
On Monday, 19 February 2024 14:20:52 -03 to...@tuxteam.de wrote:
> On Mon, Feb 19, 2024 at 10:02:10AM -0500, Stephen P. Molnar wrote:
> > I am running up-to-date Bookworm on my Debian platform:
> >
> > Processor         AMD FX(tm)-8320 Eight-Core Processor
> > Memory            8026MB (5267MB used)
> > Machine Type      Desktop
> > Operating System  Debian GNU/Linux 12 (bookworm)
> >
> > I have been plagued with orphaned inodes. Last night the problem
> > came to a head. When I rebooted the computer after a stop caused by
> > an orphaned inode incident, it got as far as the user login. After
> > pressing Return I got the infamous Windows-style blue screen.
> > Restarting produced the same problem.
> >
> > Fortunately, I have another SSD that I use to test Bookworm updates
> > before applying them to the SSD that is having the problem. I can
> > access the problem drive and am in the process of backing up files.
> >
> > I ran sudo e2fsck -f /dev/sdc1 and got:
> >
> > Script started on 2024-02-19 08:15:52-05:00 [TERM="xterm-256color"
> > TTY="/dev/pts/0" COLUMNS="100" LINES="24"]
> > (base) comp@AbNormal:~$ sudo e2fsck -f /dev/sdc1
> > [sudo] password for comp:
> > e2fsck 1.47.0 (5-Feb-2023)
> > Pass 1: Checking inodes, blocks, and sizes
> > Pass 2: Checking directory structure
> > Pass 3: Checking directory connectivity
> > /lost+found not found.  Create? yes
> > Pass 4: Checking reference counts
> > Pass 5: Checking group summary information
> >
> > /dev/sdc1: ***** FILE SYSTEM WAS MODIFIED *****
> > /dev/sdc1: 7982363/121577472 files (0.3% non-contiguous),
> > 421959365/486307328 blocks
> > (base) comp@AbNormal:~$
> >
> > Comments and suggestions will be appreciated.
>
> This session doesn't show anything to worry about. As far as fsck
> is concerned, the file system looks clean. Back up its contents as
> quickly as you can and treat the disk with suspicion. There are
> other candidate suspects for file system corruption (flaky power
> supply, software doing silly things, kernel bugs, loose cables),
> but the disk would be the primary suspect.
>
> Cheers

Just as an aside:
The notorious red SATA cables - I threw them out long ago. The red
pigment eats up the fine copper threads, changing the impedance of the
cable and eventually making false contact before failing completely.
Of course this does not apply to NVMe SSDs.

--
Eike Lantzsch KY4PZ / ZP5CGE





Re: Orphaned Inode Problem

2024-02-19 Thread tomas
On Mon, Feb 19, 2024 at 12:30:30PM -0500, Stephen P. Molnar wrote:

[...]

> Thanks for the reply. It's somewhat reassuring.
> 
> According to my logs the box had its last major upgrade in 2014, so I
> shouldn't be too surprised.
> 
> My backup is underway and should be done sometime tomorrow.  I have a 2
> TB HDD I'm going to use for the new install.

Fingers crossed...

Cheers
-- 
t




Re: Orphaned Inode Problem

2024-02-19 Thread Stephen P. Molnar



On 02/19/2024 12:20 PM, to...@tuxteam.de wrote:

On Mon, Feb 19, 2024 at 10:02:10AM -0500, Stephen P. Molnar wrote:

I am running up-to-date Bookworm on my Debian platform:

Processor         AMD FX(tm)-8320 Eight-Core Processor
Memory            8026MB (5267MB used)
Machine Type      Desktop
Operating System  Debian GNU/Linux 12 (bookworm)

I have been plagued with orphaned inodes. Last night the problem came to a
head. When I rebooted the computer after a stop caused by an orphaned inode
incident, it got as far as the user login. After pressing Return I got the
infamous Windows-style blue screen. Restarting produced the same problem.

Fortunately, I have another SSD that I use to test Bookworm updates before
applying them to the SSD that is having the problem. I can access the
problem drive and am in the process of backing up files.

I ran sudo e2fsck -f /dev/sdc1 and got:

Script started on 2024-02-19 08:15:52-05:00 [TERM="xterm-256color"
TTY="/dev/pts/0" COLUMNS="100" LINES="24"]
(base) comp@AbNormal:~$ sudo e2fsck -f /dev/sdc1
[sudo] password for comp:
e2fsck 1.47.0 (5-Feb-2023)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found.  Create? yes
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sdc1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdc1: 7982363/121577472 files (0.3% non-contiguous),
421959365/486307328 blocks
(base) comp@AbNormal:~$

Comments and suggestions will be appreciated.

This session doesn't show anything to worry about. As far as fsck
is concerned, the file system looks clean. Back up its contents as
quickly as you can and treat the disk with suspicion. There are
other candidate suspects for file system corruption (flaky power
supply, software doing silly things, kernel bugs, loose cables),
but the disk would be the primary suspect.

Cheers

Thanks for the reply. It's somewhat reassuring.

According to my logs the box had its last major upgrade in 2014, 
so I shouldn't be too surprised.


My backup is underway and should be done sometime tomorrow.  I have a 
2 TB HDD I'm going to use for the new install.


--
Stephen P. Molnar, Ph.D.
https://insilicochemistry.net
(614)312-7528 (c)
Skype:  smolnar1



Re: Orphaned Inode Problem

2024-02-19 Thread tomas
On Mon, Feb 19, 2024 at 10:02:10AM -0500, Stephen P. Molnar wrote:
> I am running up-to-date Bookworm on my Debian platform:
> 
> Processor         AMD FX(tm)-8320 Eight-Core Processor
> Memory            8026MB (5267MB used)
> Machine Type      Desktop
> Operating System  Debian GNU/Linux 12 (bookworm)
> 
> I have been plagued with orphaned inodes. Last night the problem came to a
> head. When I rebooted the computer after a stop caused by an orphaned inode
> incident, it got as far as the user login. After pressing Return I got the
> infamous Windows-style blue screen. Restarting produced the same problem.
> 
> Fortunately, I have another SSD that I use to test Bookworm updates before
> applying them to the SSD that is having the problem. I can access the
> problem drive and am in the process of backing up files.
> 
> I ran sudo e2fsck -f /dev/sdc1 and got:
> 
> Script started on 2024-02-19 08:15:52-05:00 [TERM="xterm-256color"
> TTY="/dev/pts/0" COLUMNS="100" LINES="24"]
> (base) comp@AbNormal:~$ sudo e2fsck -f /dev/sdc1
> [sudo] password for comp:
> e2fsck 1.47.0 (5-Feb-2023)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> /lost+found not found.  Create? yes
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> 
> /dev/sdc1: ***** FILE SYSTEM WAS MODIFIED *****
> /dev/sdc1: 7982363/121577472 files (0.3% non-contiguous),
> 421959365/486307328 blocks
> (base) comp@AbNormal:~$
> 
> Comments and suggestions will be appreciated.

This session doesn't show anything to worry about. As far as fsck
is concerned, the file system looks clean. Back up its contents as
quickly as you can and treat the disk with suspicion. There are
other candidate suspects for file system corruption (flaky power
supply, software doing silly things, kernel bugs, loose cables),
but the disk would be the primary suspect.
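
For scripted checks, the exit status of e2fsck says the same thing as the
transcript in machine-readable form. Per e2fsck(8) it is a bit mask; a
run that reports the file system as modified would normally have bit 1
set. A small sketch that decodes it (the function name is made up for
illustration):

```shell
#!/bin/sh
# describe_fsck_status CODE
# Print one line for each bit set in an e2fsck exit status (see e2fsck(8)).
describe_fsck_status() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    for pair in \
        "1:file system errors corrected" \
        "2:system should be rebooted" \
        "4:file system errors left uncorrected" \
        "8:operational error" \
        "16:usage or syntax error" \
        "32:checking canceled by user request" \
        "128:shared library error"
    do
        bit=${pair%%:*}
        if [ $((code & bit)) -ne 0 ]; then
            echo "${pair#*:}"
        fi
    done
}
```

Typical use on an unmounted file system would be a read-only pass such as
"sudo e2fsck -fn /dev/sdc1" followed by "describe_fsck_status $?".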

Cheers
-- 
t




Orphaned Inode Problem

2024-02-19 Thread Stephen P. Molnar

I am running up-to-date Bookworm on my Debian platform:

Processor         AMD FX(tm)-8320 Eight-Core Processor
Memory            8026MB (5267MB used)
Machine Type      Desktop
Operating System  Debian GNU/Linux 12 (bookworm)

I have been plagued with orphaned inodes. Last night the problem came to 
a head. When I rebooted the computer after a stop caused by an orphaned 
inode incident, it got as far as the user login. After pressing Return I 
got the infamous Windows-style blue screen. Restarting produced the same 
problem.


Fortunately, I have another SSD that I use to test Bookworm updates 
before applying them to the SSD that is having the problem. I can access 
the problem drive and am in the process of backing up files.


I ran sudo e2fsck -f /dev/sdc1 and got:

Script started on 2024-02-19 08:15:52-05:00 [TERM="xterm-256color" 
TTY="/dev/pts/0" COLUMNS="100" LINES="24"]
(base) comp@AbNormal:~$ sudo e2fsck -f /dev/sdc1
[sudo] password for comp:
e2fsck 1.47.0 (5-Feb-2023)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
/lost+found not found.  Create? yes
Pass 4: Checking reference counts
Pass 5: Checking group summary information

/dev/sdc1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sdc1: 7982363/121577472 files (0.3% non-contiguous), 
421959365/486307328 blocks
(base) comp@AbNormal:~$


Comments and suggestions will be appreciated.

Thanks in advance.

--
Stephen P. Molnar, Ph.D.
https://insilicochemistry.net
(614)312-7528 (c)
Skype:  smolnar1



Re: Orphaned Inode Problem

2019-11-20 Thread Stephen P. Molnar

Well, as I said, I didn't know if it meant anything.

On 11/20/2019 11:01 AM, Reco wrote:

Hi.

On Wed, Nov 20, 2019 at 09:19:30AM -0500, Stephen P. Molnar wrote:

I don't know what the significance might be, but I have installed
Buster in an Oracle VM along with the software that hangs, and it
works.

Countless things could be significant here. If you remove real
hardware from the equation, you remove whole classes of problems.

Reco




--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1



Re: Orphaned Inode Problem

2019-11-20 Thread Reco
Hi.

On Wed, Nov 20, 2019 at 09:19:30AM -0500, Stephen P. Molnar wrote:
> I don't know what the significance might be, but I have installed
> Buster in an Oracle VM along with the software that hangs, and it
> works.

Countless things could be significant here. If you remove real
hardware from the equation, you remove whole classes of problems.

Reco 



Re: Orphaned Inode Problem

2019-11-20 Thread Stephen P. Molnar
I don't know what the significance might be, but I have installed Buster 
in an Oracle VM along with the software that hangs, and it works.


On 11/19/2019 02:39 PM, Reco wrote:

Hi.

On Tue, Nov 19, 2019 at 02:31:59PM -0500, Stephen P. Molnar wrote:

On Mon, Nov 18, 2019 at 02:06:48PM -0500, Stephen P. Molnar wrote:

The problem is that the program hangs and the system will not
recognize the keyboard, although, according to gKrellM, the system is
still operating. The only solution seems to be to reboot the system.

The contents of /var/log/messages at the time of the hang will
definitely help to pinpoint the issue.

And maybe the xorg.log, but it's non-trivial to extract something useful
from it - you have to wait for the hang, reboot, and locate
Xorg.0.log.old file.

I've attached the dmesg file.  The platform was locked up and I had to reboot 
the system to get the file.

I wrote "/var/log/messages", not "dmesg" for a reason.
And that reason is - dmesg shows current kernel messages (i.e. - after
the reboot), and they are useless for determining the cause of the hang.

/var/log/messages can be large, but I am not asking for all of it. The
part that precedes the hang is all that is needed.


For the archives, the last line in dmesg output is:


[   23.210107] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready

And the dmesg itself shows more-or-less normal boot process and uptime
of 23 seconds.

Reco




--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1



Re: Orphaned Inode Problem

2019-11-19 Thread Ben Caradoc-Davies

On 19/11/2019 08:17, Reco wrote:

A kernel panic or OOPS comes to mind first. That's a very broad class of
problems, to say the least, hence the need for kernel logs.
An Xorg hang is the second possible option. AMD hardware is somewhat
problematic here.
Barring the above, overheating is the third possible scenario here.


Memory errors under load can also cause kernel panics. I like to test 
RAM by running multiple concurrent memtester processes to simulate 
multithreaded load.
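
A rough sketch of what running several memtester processes at once can 
look like. The function name, the MEMTESTER override and the sizes in 
the usage note are assumptions for illustration; pick a per-process size 
that leaves enough RAM for the rest of the system.

```shell
#!/bin/sh
# Sketch only: start several memtester processes in parallel and wait for
# all of them. MEMTESTER can be overridden (e.g. MEMTESTER=echo) for a dry run.
MEMTESTER=${MEMTESTER:-memtester}

# run_memtesters COUNT SIZE LOOPS
# Starts COUNT background processes, each testing SIZE (e.g. 512M) for LOOPS
# passes. Returns non-zero if any of them exits with a failure status.
run_memtesters() {
    count=$1 size=$2 loops=$3
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        "$MEMTESTER" "$size" "$loops" &
        pids="$pids $!"
        i=$((i + 1))
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

For example, run_memtesters "$(nproc)" 512M 1 starts one process per CPU 
with 512 MB each for a single pass. memtester is best run as root so that 
it can lock the memory it tests.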


Some AMD platforms had a reputation for instability with anything less 
than the highest performance RAM.


mprime (Prime95) is useful for stress testing a CPU.

Kind regards,

--
Ben Caradoc-Davies 
Director
Transient Software Limited 
New Zealand



Re: Orphaned Inode Problem

2019-11-19 Thread Reco
Hi.

On Tue, Nov 19, 2019 at 02:31:59PM -0500, Stephen P. Molnar wrote:
> > On Mon, Nov 18, 2019 at 02:06:48PM -0500, Stephen P. Molnar wrote:
> > > The problem is that the program hangs and the system will not
> > > recognize the keyboard, although, according to gKrellM, the system is
> > > still operating. The only solution seems to be to reboot the system.
> > The contents of /var/log/messages at the time of the hang will
> > definitely help to pinpoint the issue.
> > 
> > And maybe the xorg.log, but it's non-trivial to extract something useful
> > from it - you have to wait for the hang, reboot, and locate
> > Xorg.0.log.old file.
> 
> I've attached the dmesg file.  The platform was locked up and I had to reboot 
> the system to get the file.

I wrote "/var/log/messages", not "dmesg" for a reason.
And that reason is - dmesg shows current kernel messages (i.e. - after
the reboot), and they are useless for determining the cause of the hang.

/var/log/messages can be large, but I am not asking for all of it. The
part that precedes the hang is all that is needed.
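
One possible way to cut out that part, as a sketch: after the forced
reboot, the kernel logs a new "Linux version" banner, so the lines just
before the last banner in the file are the last ones written before the
hang. The function name and the default of 50 lines are arbitrary choices
for illustration, and this assumes the syslog daemon writes kernel
messages to /var/log/messages.

```shell
#!/bin/sh
# before_last_boot FILE [N]
# Print the N lines (default 50) that precede the last kernel boot banner
# ("Linux version ...") found in FILE.
before_last_boot() {
    awk -v n="${2:-50}" '
        /Linux version/ {
            # Boot banner: snapshot the lines buffered just before it.
            # A later banner overwrites the snapshot of an earlier one.
            count = 0
            first = (NR - n > 1) ? NR - n : 1
            for (i = first; i < NR; i++) snap[++count] = buf[i % n]
        }
        { buf[NR % n] = $0 }    # ring buffer holding the last n lines
        END { for (i = 1; i <= count; i++) print snap[i] }
    ' "$1"
}
```

Usage would be something like "before_last_boot /var/log/messages 100".
If the log was rotated in between, the interesting lines may be in
/var/log/messages.1 instead.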


For the archives, the last line in dmesg output is:

> [   23.210107] IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready

And the dmesg itself shows more-or-less normal boot process and uptime
of 23 seconds.

Reco



Re: Orphaned Inode Problem

2019-11-19 Thread Stephen P. Molnar



On 11/18/2019 02:17 PM, Reco wrote:

Hi.

On Mon, Nov 18, 2019 at 02:06:48PM -0500, Stephen P. Molnar wrote:

The problem is that the program hangs and the system will not
recognize the keyboard, although, according to gKrellM, the system is
still operating. The only solution seems to be to reboot the system.

The contents of /var/log/messages at the time of the hang will
definitely help to pinpoint the issue.

And maybe the xorg.log, but it's non-trivial to extract something useful
from it - you have to wait for the hang, reboot, and locate
Xorg.0.log.old file.



I have no idea what the cause may be or what a solution might be.
Google is no help (al least, nothing that I can understand).

A kernel panic or OOPS comes to mind first. That's very broad class of
the problem, to say the least, hence the need of kernel logs.

Xorg hang is the second possible option. AMD hardware is somewhat
problematic here.

Barring above - an overheat is the third possible scenario here.


In short, Google (or any other search engine) can offer little help here
- you have to know what to search.

Reco




I've attached the dmesg file.  The platform was locked up and I had to
reboot the system to get the file.


--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1

[0.00] Linux version 4.19.0-6-amd64 (debian-ker...@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.67-2+deb10u2 (2019-11-11)
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.19.0-6-amd64 root=UUID=3b5d6f38-b208-431a-842c-8a44f7c26cec ro quiet
[0.00] random: get_random_u32 called from bsp_init_amd+0x20b/0x2b0 with crng_init=0
[0.00] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[0.00] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[0.00] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[0.00] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[0.00] BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009e7ff] usable
[0.00] BIOS-e820: [mem 0x0009e800-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0xda580fff] usable
[0.00] BIOS-e820: [mem 0xda581000-0xda85cfff] reserved
[0.00] BIOS-e820: [mem 0xda85d000-0xda86cfff] ACPI data
[0.00] BIOS-e820: [mem 0xda86d000-0xdb956fff] ACPI NVS
[0.00] BIOS-e820: [mem 0xdb957000-0xdca33fff] reserved
[0.00] BIOS-e820: [mem 0xdca34000-0xdca34fff] usable
[0.00] BIOS-e820: [mem 0xdca35000-0xdcc3afff] ACPI NVS
[0.00] BIOS-e820: [mem 0xdcc3b000-0xdd082fff] usable
[0.00] BIOS-e820: [mem 0xdd083000-0xdd7f3fff] reserved
[0.00] BIOS-e820: [mem 0xdd7f4000-0xdd7f] usable
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfec1-0xfec10fff] reserved
[0.00] BIOS-e820: [mem 0xfec2-0xfec20fff] reserved
[0.00] BIOS-e820: [mem 0xfed0-0xfed00fff] reserved
[0.00] BIOS-e820: [mem 0xfed61000-0xfed70fff] reserved
[0.00] BIOS-e820: [mem 0xfed8-0xfed8] reserved
[0.00] BIOS-e820: [mem 0xfef0-0x] reserved
[0.00] BIOS-e820: [mem 0x00011000-0x00021eff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.7 present.
[0.00] DMI: To be filled by O.E.M. To be filled by O.E.M./M5A97 R2.0, BIOS 2603 06/26/2015
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] AGP: No AGP bridge found
[0.00] last_pfn = 0x21f000 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B write-through
[0.00]   C-CEFFF write-protect
[0.00]   CF000-EBFFF uncachable
[0.00]   EC000-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base  mask 8000 write-back
[0.00]   1 base 8000 mask C000 write-back
[0.00]   2 base C000 mask E000 write-back
[0.00]   3 base DD80 mask FF80 uncachable
[0.00]   4 base DE00 mask FE00 uncachable
[0.00]   5 disabled
[0.00]   6 disabled
[0.00]   7 disabled
[0.00] 

Re: Orphaned Inode Problem

2019-11-18 Thread Pascal Hambourg

On 18/11/2019 at 20:06, Stephen P. Molnar wrote:

> The CPU is an AMD FX-8320 Eight-Core Processor on an ASUSTeK M5A97 R2.0
> Motherboard with 8 GB of RAM. I have started having orphaned inodes when I
> run a major piece of software in my research program.


How do you know that you have orphaned inodes, and that they are the
cause of the system hang?


AFAIK, orphaned inodes are not a cause but a consequence of an uncleanly 
unmounted filesystem, and fsck spots them at the next boot. They are 
often caused by a hard reboot.
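
To illustrate what such an inode is: as far as I know, a file that is
deleted while a process still has it open keeps its inode with a link
count of zero until it is closed, and on ext3/ext4 that inode is
recorded on disk as an orphan in the meantime. If the machine is reset
before the close, the next mount or fsck has to clean it up. A minimal
Python sketch of that state (the function name is only illustrative, and
it uses a throwaway temporary file):

```python
import os
import tempfile


def link_count_after_unlink():
    """Unlink a file that is still open and report its remaining link count."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still readable\n")
        os.unlink(path)                # the name is gone from the directory
        links = os.fstat(fd).st_nlink  # the inode lives on while fd is open
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)        # the contents are still accessible
    finally:
        os.close(fd)                   # only now can the kernel free the inode
    return links, data


if __name__ == "__main__":
    print(link_count_after_unlink())
```

A program that hangs with many such files open and is then cut off by a
hard reset would leave exactly this kind of inode behind, which fits
them being a symptom rather than the cause.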




Re: Orphaned Inode Problem

2019-11-18 Thread Dan Ritter
Reco wrote: 
>   Hi.
> 
> On Mon, Nov 18, 2019 at 02:06:48PM -0500, Stephen P. Molnar wrote:
> > The problem is that the program hangs and the system will not
> > recognize the keyboard, although, according to gKrellM, the system is
> > still operating. The only solution seems to be to reboot the system.
> 
> The contents of /var/log/messages at the time of the hang will
> definitely help to pinpoint the issue.
> 
> And maybe the xorg.log, but it's non-trivial to extract something useful
> from it - you have to wait for the hang, reboot, and locate
> Xorg.0.log.old file.
> 
> 
> > I have no idea what the cause may be or what a solution might be.
> > Google is no help (at least, nothing that I can understand).
> 
> A kernel panic or OOPS comes to mind first. That's a very broad class
> of problem, to say the least, hence the need for kernel logs.
> 
> Xorg hang is the second possible option. AMD hardware is somewhat
> problematic here.
> 
> Barring the above, an overheat is the third possible scenario here.

It's also plausible that one or more disks, or their controllers, are
failing. This can be exacerbated by high temperature.

So:

- look in logs
- clean hardware, especially fans
- add airflow
- keep an SSH session open from another machine and see if you
  can poke at it after a "hang".
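
Over that SSH session it may also be worth watching temperatures while
the job runs. A rough Python sketch, assuming the kernel exposes sensors
through the hwmon sysfs tree (which sensors show up depends on the
drivers loaded on your board; the function name is made up for this
example):

```python
import glob
import os


def read_temperatures(root="/sys/class/hwmon"):
    """Return {sensor file: degrees Celsius} for every hwmon temperature input."""
    readings = {}
    for path in sorted(glob.glob(os.path.join(root, "hwmon*", "temp*_input"))):
        try:
            with open(path) as handle:
                # hwmon reports temperatures in millidegrees Celsius.
                readings[path] = int(handle.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # some sensors refuse to be read; skip them
    return readings


if __name__ == "__main__":
    for sensor, celsius in read_temperatures().items():
        print(f"{sensor}: {celsius:.1f} C")
```

Running it every few seconds and redirecting the output to a file on the
other machine means the last readings before a hang are not lost with
the reboot.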

-dsr-



Re: Orphaned Inode Problem

2019-11-18 Thread Reco
Hi.

On Mon, Nov 18, 2019 at 02:06:48PM -0500, Stephen P. Molnar wrote:
> The problem is that the program hangs and the system will not
> recognize the keyboard, although, according to gKrellM, the system is
> still operating. The only solution seems to be to reboot the system.

The contents of /var/log/messages at the time of the hang will
definitely help to pinpoint the issue.

And maybe the xorg.log, but it's non-trivial to extract something useful
from it - you have to wait for the hang, reboot, and locate
Xorg.0.log.old file.


> I have no idea what the cause may be or what a solution might be.
> Google is no help (at least, nothing that I can understand).

A kernel panic or OOPS comes to mind first. That's a very broad class
of problem, to say the least, hence the need for kernel logs.

Xorg hang is the second possible option. AMD hardware is somewhat
problematic here.

Barring the above, an overheat is the third possible scenario here.


In short, Google (or any other search engine) can offer little help here
- you have to know what to search for.

Reco



Orphaned Inode Problem

2019-11-18 Thread Stephen P. Molnar

I am running Stretch on my Linux platform.

The CPU is an AMD FX-8320 Eight-Core Processor on an ASUSTeK M5A97 R2.0 
Motherboard with 8 GB of RAM. I have started having orphaned inodes when I 
run a major piece of software in my research program.


The problem is that the program hangs and the system will not recognize 
the keyboard, although, according to gKrellM, the system is still 
operating. The only solution seems to be to reboot the system.


I have no idea what the cause may be or what a solution might be. Google 
is no help (at least, nothing that I can understand).


Help will be much appreciated.

Thanks in advance.

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype:  smolnar1