Hello Ted,

I have tried this on 4 different computers and it is reproducible on all of them after an unclean shutdown.
NUC11TNKi7, NUC5i5RYB, DA-1100 and Asrock SOM-P104J SBC

*NUC11TNKi7:*
Manufacturer - Intel Corporation
Processor - 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
<https://dashboards.onestoptoronto.com/trixie/fact/processor0/11th%2BGen%2BIntel%2528R%2529%2BCore%2528TM%2529%2Bi7-1165G7%2B%2540%2B2.80GHz>
Disk - nvme
Type - HDD
Disk model - TS128GMTE110S

*NUC5i5RYB:*
Manufacturer - Intel Corporation
Processor - Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
<https://dashboards.onestoptoronto.com/trixie/fact/processor0/Intel%2528R%2529%2BCore%2528TM%2529%2Bi5-5250U%2BCPU%2B%2540%2B1.60GHz>
Disk type - HDD
Disk model - TS64GMTS800

*SOM-P104J SBC:*
Manufacturer - ASRock Industrial
Processor - Intel(R) N97
Disk - nvme
Disk model - TS128GMTE110S

*DA-1100:*
Manufacturer - CINCOZE
Processor - Intel(R) Pentium(R) CPU N4200 @ 1.10GHz
<https://dashboards.onestoptoronto.com/fact/processor0/Intel%2528R%2529%2BPentium%2528R%2529%2BCPU%2BN4200%2B%2540%2B1.10GHz>
Disk model - TS64GMSA372I

It does not happen under Bookworm. I even tried upgrading the kernel to 6.9.7 from bookworm-backports, with e2fsprogs-1.47.0-2, on the same 4 computers, and the computers boot normally after every unclean shutdown.

*How to reproduce it:*

Install stock trixie

1. apt-get install watchdog sway greetd chromium
2. edit /etc/watchdog.conf and uncomment the "watchdog-device = /dev/watchdog" line. Save the file
3. mkdir -p ~/.config/sway
4. cp /etc/sway/config ~/.config/sway/
5. edit /etc/greetd/config.toml and change the following lines to launch sway on startup:
   command = "/usr/bin/sway"
   user = "(yourUserName)"
6. systemctl enable watchdog
7. systemctl enable greetd
8. reboot
9. when the computer starts in sway, press [cmd] + [Enter] to launch a terminal
10. chromium --ozone-platform-hint=wayland
11. As root - echo c > /proc/sysrq-trigger
12. The computer boots into busybox, displaying the message I sent earlier

I have tried other graphical environments such as GNOME, and the issue is reproducible there as well after a triggered unclean shutdown.

*NUC5i5RYB:*

dumpe2fs -h /dev/sda2
dumpe2fs 1.47.1 (20-May-2024)
Filesystem volume name: <none>
Last mounted on: /
Filesystem UUID: eb3f94f2-7f1e-4405-9239-9738ea84499d
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index orphan_file filetype needs_recovery extent 64bit flex_bg metadata_csum_seed sparse_super large_file huge_file dir_nlink extra_isize metadata_csum orphan_present
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 3645440
Block count: 14572800
Reserved block count: 728640
Overhead clusters: 307682
Free blocks: 13653939
Free inodes: 3596069
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 1024
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Wed Nov 6 16:29:19 2024
Last mount time: Thu Nov 7 14:23:48 2024
Last write time: Thu Nov 7 14:23:47 2024
Mount count: 2
Maximum mount count: -1
Last checked: Wed Nov 6 16:48:05 2024
Check interval: 0 (<none>)
Lifetime writes: 5583 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 073ed4d9-bf6a-433e-ab6f-5c16c26b2c21
Journal backup: inode blocks
Checksum type: crc32c
Checksum: 0x193901e2
Checksum seed: 0xcc5a740b
Orphan file inode: 12
Journal features: journal_incompat_revoke journal_64bit journal_checksum_v3
Total journal size: 256M
Total journal blocks: 65536
Max transaction length: 65536
Fast commit length: 0
Journal sequence: 0x00004e09
Journal start: 18537
Journal checksum type: crc32c
Journal checksum: 0xee80ca9e

*NUC11TNKi7:*

dumpe2fs 1.47.1 (20-May-2024)
Filesystem volume name: <none>
Last mounted on: /
Filesystem UUID: aa673884-2798-4c16-8412-e6e6d6634227
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index orphan_file filetype needs_recovery extent 64bit flex_bg metadata_csum_seed sparse_super large_file huge_file dir_nlink extra_isize metadata_csum orphan_present
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 7356416
Block count: 29394944
Reserved block count: 1469747
Overhead clusters: 608244
Free blocks: 28166565
Free inodes: 7306375
First block: 0
Block size: 4096
Fragment size: 4096
Group descriptor size: 64
Reserved GDT blocks: 1024
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Thu Nov 7 14:23:49 2024
Last mount time: Thu Nov 7 15:12:06 2024
Last write time: Thu Nov 7 15:12:05 2024
Mount count: 1
Maximum mount count: -1
Last checked: Thu Nov 7 15:12:05 2024
Check interval: 0 (<none>)
Lifetime writes: 6552 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 32
Desired extra isize: 32
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: d0fc1889-cf0a-45ee-8988-ed1167f88490
Journal backup: inode blocks
Checksum type: crc32c
Checksum: 0xd0b259cc
Checksum seed: 0x4edf1204
Orphan file inode: 12
Journal features: journal_incompat_revoke journal_64bit journal_checksum_v3
Total journal size: 512M
Total journal blocks: 131072
Max transaction length: 131072
Fast commit length: 0
Journal sequence: 0x00004f54
Journal start: 86508
Journal checksum type: crc32c
Journal checksum: 0x6cd94711

*Under Trixie:*

I have downgraded linux-image to the following:

linux-image-6.9.9
linux-image-6.8.12
linux-image-6.6.15

Each time, the computer boots into busybox after an unclean shutdown.

I am unable to find the 1.47.1~rc-1 version so can't confirm. I have tested 1.47.1~rc1-1, 1.47.1~rc1-2, 1.47.1~rc1-3 and 1.47.1-1, and the issue is reproducible on all of these versions.

Please let me know if you need more info.

Thank you!

On Wed, Nov 6, 2024 at 5:17 PM Theodore Ts'o <ty...@mit.edu> wrote:

> On Wed, Nov 06, 2024 at 01:23:01PM -0500, Anees Ahmad wrote:
> > Hello,
> >
> > The issue described initially does not happen when both packages
> > *e2fsprogs_1.47.0-2.4_amd64.deb*
> > <https://snapshot.debian.org/archive/debian/20240314T094714Z/pool/main/e/e2fsprogs/e2fsprogs_1.47.0-2.4_amd64.deb>
> > and *libext2fs2t64_1.47.0-2.4_amd64.deb*
> > <https://snapshot.debian.org/archive/debian/20240314T094714Z/pool/main/e/e2fsprogs/libext2fs2t64_1.47.0-2.4_amd64.deb>
> > were downgraded to 1.47.0-2.4 on the same computer.
> > The bug appeared in 1.47.1~rc1-1.
>
> How reproducible is the problem with the 1.47.1~rc-1?
>
> And can you describe the hardware where this is happening --- what
> kind of storage device, etc.? And can you reproduce it on some other
> hardware? And can you send the output of dumpe2fs -h /dev/sda2?
>
> This is not a problem that I've seen on any of my hardware, or on my
> regression test suites, so without a clean reproducer it's going to be
> very hard to work the problem.
>
> Also, note that the error messages which you reported:
>
> /dev/sda2: recovering journal
> /dev/sda2: Clearing orphaned inode 3408176 (uid=1000, gid=1000,
> mode-0100600, size=64)
> /dev/sda2: clean, 150916/3645440 files, 1460364/14572800 blocks
> [ 3.7641111 EXT4-fs error (device sda2): ext4_orphan_get:1421: comm
> mount: bad orphan inode 3408176
> [ 3.7641371 ext4_test_bit (bit-303, block-13631504) = 0
> EXT4-fs error (device sda2): ext4 orphan_get:1121: comm mount: bad orphan
> inode 3408176
> ext1_test_bit(bit-303, block-13631504) = 0
> [ 3.7646551 EXT4-fs error (device sda2):
> ext4_mark_recovery_complete:6229: comm mount: Orphan file not empty on
> read-only fs. 3.764746) EXT4-fs (sda2): mount failed
> mount: mounting /dev/sda2 on /root failed: Structure needs cleaning
> Failed to mount /dev/sda2 as root file system.
> EXT4-fs error (device sda2): ext4_mark_recovery_complete:6229: comm mount:
> Orphan file not empty on read-only fs. EXT4-fs (sda2): mount failed
> BusyBox v1.37.0 (Debian 1:1.37.0-4) built-in shell (ash)
>
>
> Are emitted by the kernel, and happen before any program in e2fsprogs
> has a chance to run. Hence, it is highly unlikely that the version of
> e2fsprogs would make a difference in this message.
>
> The specific error message indicates that there is an inode on the
> orphan list (in this case, inode #3408176) which is marked as unused
> (i.e., free) in the block allocation bitmap. This is a file system
> corruption problem, and is generally indicative of a kernel bug, or
> some kind of hardware problem/failure/bug.
>
> The reason why I'm a bit dubious that it is a kernel bug is (a) no one
> else has reported anything like this, and (b) I am regularly running
> kernel regression test which tests the unclean shutdown code path, and
> I haven't seen a failure in this area of the kernel code for years.
>
> Cheers,
>
> - Ted
>
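
For reference, the bit and block numbers in the quoted kernel message can be cross-checked against the dumpe2fs values for the NUC5i5RYB (Inodes per group: 8192, Blocks per group: 32768, First block: 0). The following is only a rough sketch: it assumes the usual ext4 convention that inode numbers start at 1 and are assigned to block groups in order, and the function names are made up for illustration.

```python
# Values copied from the dumpe2fs -h output for the NUC5i5RYB (/dev/sda2).
INODES_PER_GROUP = 8192
BLOCKS_PER_GROUP = 32768
FIRST_DATA_BLOCK = 0  # "First block: 0"


def locate_inode(ino: int) -> tuple[int, int]:
    """Return (block group, bit index within that group's inode bitmap).

    Assumes inode numbers start at 1, hence the "- 1".
    """
    return divmod(ino - 1, INODES_PER_GROUP)


def group_first_block(group: int) -> int:
    """Return the first block number belonging to a block group."""
    return FIRST_DATA_BLOCK + group * BLOCKS_PER_GROUP


group, bit = locate_inode(3408176)  # inode number from the kernel message
print(group, bit)                   # 416 303
print(group_first_block(group))     # 13631488
print(13631504 - group_first_block(group))  # 16
```

The computed bit index (303) matches the "bit-303" in the message, and the reported block 13631504 lies 16 blocks into group 416. With a flex block group size of 16, that offset would be consistent with the block being that group's inode bitmap, but that last step is an inference from the numbers, not something the log states.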