[ Sorry for not getting back to you earlier ]
Theodore Ts'o wrote...
> What could have happened on your file system? Well, there are two
> scenarios that could explain what had happened:
After some more disastrous experiences (kernel stack traces and
segfaults) I assume serious hardware issues. All this happened only if
there was either a PCI card plugged in (ethernet, USB controller, SATA
controller) or an external drive attached via FireWire. Either the
mainboard has a flaw, or the PCI/DMA(?) support in the kernel went out
of shape without people noticing. That machine is from 2003-ish and
that type isn't much in use any longer. For example, networking in
newer kernels has issues under a specific load, which is why I'm still
stuck on 4.19. Should bisect that some day.
Anyway, it's easy to assume the data in a write request to the block
device was garbled, and things went downhill from there.
> 1) Somehow the inode was corrupted to (a) both set the inline data
> flag, and (b) a valid extended attribute that had "system.data" (which
> can't be set via the userspace API; it would have had to been
> magically, random set). Highly unlikely.
>
> 2) There was a random bit flip that enabled the inline_data feature
> flag in the superblock. The other fscrypt kernel message would
> be explained another random bit flip and/or random garbage written
> into an inode table block.
Makes a lot of sense, then.
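To illustrate how little it takes: a minimal sketch of scenario 2,
assuming the field offsets and the flag value from the ext4 disk layout
documentation (s_magic at 0x38, s_feature_incompat at 0x60, inline_data
= 0x8000). The helper names and the fake superblock are made up for
this example; it does not touch a real device.

```python
import struct

# Assumed values, taken from the ext4 on-disk layout documentation.
MAGIC_OFFSET = 0x38             # s_magic, little-endian u16
FEATURE_INCOMPAT_OFFSET = 0x60  # s_feature_incompat, little-endian u32
EXT4_MAGIC = 0xEF53
INCOMPAT_FILETYPE = 0x0002
INCOMPAT_INLINE_DATA = 0x8000


def has_inline_data(sb: bytes) -> bool:
    """Return True if the inline_data bit is set in a superblock buffer."""
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    if magic != EXT4_MAGIC:
        raise ValueError("not an ext2/3/4 superblock")
    (incompat,) = struct.unpack_from("<I", sb, FEATURE_INCOMPAT_OFFSET)
    return bool(incompat & INCOMPAT_INLINE_DATA)


def make_fake_superblock(incompat: int) -> bytearray:
    """Hypothetical helper: a zeroed 1024-byte buffer with only the
    magic number and the incompat feature word filled in."""
    sb = bytearray(1024)
    struct.pack_into("<H", sb, MAGIC_OFFSET, EXT4_MAGIC)
    struct.pack_into("<I", sb, FEATURE_INCOMPAT_OFFSET, incompat)
    return sb


def flip_bit(buf: bytearray, byte_offset: int, bit: int) -> None:
    """Simulate a single random bit flip in the buffer."""
    buf[byte_offset] ^= 1 << bit


# An ext2-style superblock with only the filetype feature set.
sb = make_fake_superblock(INCOMPAT_FILETYPE)
before = has_inline_data(sb)

# 0x8000 is bit 15 of the little-endian word, i.e. bit 7 of its second byte.
flip_bit(sb, FEATURE_INCOMPAT_OFFSET + 1, 7)
after = has_inline_data(sb)

print(before, after)
```

On a real filesystem the superblock sits 1024 bytes into the device;
"dumpe2fs -h" lists the feature flags without any manual parsing.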
> 3) An admin accidentally ran "tune2fs -O inline_data /dev/sdXX" to
> enable the inline_data feature. The fscrypt message could be
> explained as above.
Not in this case, I'm the only person who has access.
> In any case, e2fsck is doing the right thing. It *is* possible for
> e2fsck to set the inline_data feature flag, yes..... but it's under
> very tightly constrained circumstances, and the alternative would be
> to have e2fsck to delete user data that could potentially be quite
> valuable. (For example, a cryptographic key which protects a bitcoin
> wallet with $220 million dollars worth of bitcoin in it. :-P )
In theory, e2fsck could emit a warning like "That filesystem is ext2,
the change I'd like to perform next would require ext4 support". But
that would possibly get lost among all the other warnings, and/or the
annoyed admin has already hit "Yes to all questions" (I did), so it
goes unnoticed as well. It might be possible to handle such a situation,
but honestly it doesn't happen often enough to be worth you spending a
lot of time on it. Old lesson learned again: if a fsck reports more than
just a few glitches, it might be wiser to wipe the filesystem entirely
and restore the data from the backup.
> Could this happen when it shouldn't? Well, it would highly unlikely
> --- as in one in bazillions odds unlucky. It's actually much more
> likely that a random bitflip in the in-memory superblock toggled the
> inline_data feature bit set.
Yeah, I guess we might as well close this bug. Thanks a lot for your
time and the explanations given.
Christoph

