[ Sorry for not getting back to your earlier ]

Theodore Ts'o wrote...

> What could have happened on your file system?  Well, there are two
> scenarios that could explain what had happened:

After some more disastrous experiences (kernel stack traces and
segfaults) I assume serious hardware issues. All this happened only if
there was either a PCI card plugged in (ethernet, USB controller, SATA
controller) or an external drive via FireWire. Either the mainboard has a
flaw, or the PCI/DMA(?) support in the kernel went out of shape without
people noticing. That machine is from 2003-ish and that type isn't much
in use any longer. For example, networking in newer kernels has issues
under a specific load, which is why I'm still stuck on 4.19. Should
bisect that some day.

Anyway, it's easy to assume the data in a write request to the block
device was garbled, and things went downhill from there.

> 1) Somehow the inode was corrupted to (a) both set the inline data
> flag, and (b) a valid extended attribute that had "system.data" (which
> can't be set via the userspace API; it would have had to been
> magically, random set).   Highly unlikely.
> 
> 2) There was a random bit flip that enabled the inline_data feature
> flag in the superblock.  The other fscrypt kernel message would
> be explained another random bit flip and/or random garbage written
> into an inode table block.

Makes a lot of sense, then.

> 3) An admin accidentally ran "tune2fs -O inline_data /dev/sdXX" to
> enable the inline_data feature.  The fscrypt message could be
> explained as above.

Not in this case, I'm the only person who has access.

> In any case, e2fsck is doing the right thing.  It *is* possible for
> e2fsck to set the inline_data feature flag, yes..... but it's under
> very tightly constrained circumstances, and the alternative would be
> to have e2fsck to delete user data that could potentially be quite
> valuable.  (For example, a cryptographic key which protects a bitcoin
> wallet with $220 million dollars worth of bitcoin in it.  :-P )

In theory, e2fsck could emit a warning "That filesystem is ext2, the
change I'd like to perform next would require ext4 support". But that
would possibly get lost among all the other warnings and/or the bugged
admin already hit "Yes to all questions" (I did) and it goes unnoticed
as well. It might be possible to handle such a situation but honestly
it's not happening often enough to be worth you spending a lot of time
on it. Old lesson learned again: If a fsck reports more than just a few
glitches, it might be wiser to clean the filesystem entirely and restore
the data from the backup.

> Could this happen when it shouldn't?  Well, it would highly unlikely
> --- as in one in bazillions odds unlucky.  It's actually much more
> likely that a random bitflip in the in-memory superblock toggled the
> inline_data feature bit set.

Yeah, I guess we might as well close this bug. Thanks a lot for your time
and the explanations given.

    Christoph
