Re: read time tree block corruption detected
On 2021/4/17 上午3:35, Gervais, Francois wrote: We are using btrfs on one of our embedded devices and we got filesystem corruption on one of them. This product undergo a lot of tests on our side and apparently it's the first it happened so it seems to be a pretty rare occurrence. However we still want to get to the bottom of this to ensure it doesn't happen in the future. Some background: - The corruption happened on kernel v5.4.72. - On the debug device I'm on master (v5.12.0-rc7) hoping it might help to have all the latest patches. Here what kernel v5.12.0-rc7 tells me when trying to mount the partition: Apr 16 19:31:45 buildroot kernel: BTRFS info (device loop0p3): disk space caching is enabled Apr 16 19:31:45 buildroot kernel: BTRFS info (device loop0p3): has skinny extents Apr 16 19:31:45 buildroot kernel: BTRFS info (device loop0p3): start tree-log replay Apr 16 19:31:45 buildroot kernel: BTRFS critical (device loop0p3): corrupt leaf: root=18446744073709551610 block=790151168 slot=5 ino=5007, inode ref overflow, ptr 15853 end 15861 namelen 294 Please provide the following dump: #btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3 I'm wondering why write-time tree-check didn't catch it. Thanks, Qu Apr 16 19:31:45 buildroot kernel: BTRFS error (device loop0p3): block=790151168 read time tree block corruption detected Apr 16 19:31:45 buildroot kernel: BTRFS critical (device loop0p3): corrupt leaf: root=18446744073709551610 block=790151168 slot=5 ino=5007, inode ref overflow, ptr 15853 end 15861 namelen 294 Apr 16 19:31:45 buildroot kernel: BTRFS error (device loop0p3): block=790151168 read time tree block corruption detected Apr 16 19:31:45 buildroot kernel: BTRFS: error (device loop0p3) in btrfs_recover_log_trees:6246: errno=-5 IO failure (Couldn't read tree log root.) Apr 16 19:31:45 buildroot kernel: BTRFS: error (device loop0p3) in btrfs_replay_log:2341: errno=-5 IO failure (Failed to recover log tree) Apr 16 19:31:45 buildroot e512c123daaa[468]: mount: /root/mnt: can't read superblock on /dev/loop0p3. Apr 16 19:31:45 buildroot kernel: BTRFS error (device loop0p3): open_ctree failed: -5 Any suggestions?
Re: read time tree block corruption detected
> Please provide the following dump: > #btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3 > > I'm wondering why write-time tree-check didn't catch it. > > Thanks, > Qu I get: root@debug:~# btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3 btrfs-progs v5.7 ERROR: tree block bytenr 18446744073709551610 is not aligned to sectorsize 4096 We have an unusual partition table due to an hardware (cpu) requirement. This might be the source of this error? Disk /dev/loop0: 40763392 sectors, 19.4 GiB Sector size (logical/physical): 512/512 bytes Disk identifier (GUID): A18E4543-634A-4E8C-B55D-DA1E217C4D98 Partition table holds up to 24 entries Main partition table begins at sector 2 and ends at sector 7 First usable sector is 8, last usable sector is 40763384 Partitions will be aligned on 8-sector boundaries Total free space is 0 sectors (0 bytes) Number Start (sector)End (sector) Size Code Name 1 8 32775 16.0 MiB8300 2 32776 237575 100.0 MiB 8300 3 23757640763384 19.3 GiB8300
Re: read time tree block corruption detected
On 2021/4/19 下午9:20, Gervais, Francois wrote: Please provide the following dump: #btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3 I'm wondering why write-time tree-check didn't catch it. Thanks, Qu I get: root@debug:~# btrfs ins dump-tree -b 18446744073709551610 /dev/loop0p3 btrfs-progs v5.7 ERROR: tree block bytenr 18446744073709551610 is not aligned to sectorsize 4096 My bad, wrong number. The correct number command is: # btrfs ins dump-tree -b 790151168 /dev/loop0p3 Thanks, Qu We have an unusual partition table due to an hardware (cpu) requirement. This might be the source of this error? Disk /dev/loop0: 40763392 sectors, 19.4 GiB Sector size (logical/physical): 512/512 bytes Disk identifier (GUID): A18E4543-634A-4E8C-B55D-DA1E217C4D98 Partition table holds up to 24 entries Main partition table begins at sector 2 and ends at sector 7 First usable sector is 8, last usable sector is 40763384 Partitions will be aligned on 8-sector boundaries Total free space is 0 sectors (0 bytes) Number Start (sector)End (sector) Size Code Name 1 8 32775 16.0 MiB8300 2 32776 237575 100.0 MiB 8300 3 23757640763384 19.3 GiB8300
Re: read time tree block corruption detected
> My bad, wrong number. > > The correct number command is: > # btrfs ins dump-tree -b 790151168 /dev/loop0p3 root@debug:~# btrfs ins dump-tree -b 790151168 /dev/loop0p3 btrfs-progs v5.7 leaf 790151168 items 10 free space 15237 generation 219603 owner TREE_LOG leaf 790151168 flags 0x1(WRITTEN) backref revision 1 fs uuid 29d53427-f943-43ad-a99e-ac695d225d0b chunk uuid 04c4bf25-55ac-487e-97a3-fbdc84961b4a item 0 key (4614 INODE_ITEM 0) itemoff 16123 itemsize 160 generation 282 transid 219603 size 0 nbytes 0 block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 1345948 flags 0x0(none) atime 1610373764.218465480 (2021-01-11 14:02:44) ctime 1617477830.389953334 (2021-04-03 19:23:50) mtime 1617477830.389953334 (2021-04-03 19:23:50) otime 606208.1 (1970-01-08 00:23:28) item 1 key (4614 INODE_REF 1020) itemoff 16101 itemsize 22 index 1217 namelen 12 name: brokerStatus item 2 key (4996 INODE_ITEM 0) itemoff 15941 itemsize 160 generation 290 transid 219603 size 0 nbytes 0 block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 4801736 flags 0x0(none) atime 1617304887.496533028 (2021-04-01 19:21:27) ctime 1617477830.681955095 (2021-04-03 19:23:50) mtime 1617477830.681955095 (2021-04-03 19:23:50) otime 0.0 (1970-01-01 00:00:00) item 3 key (4996 INODE_REF 4715) itemoff 15920 itemsize 21 index 9 namelen 11 name: scodes.conf item 4 key (5007 INODE_ITEM 0) itemoff 15760 itemsize 160 generation 294 transid 219603 size 0 nbytes 18446462598731726987 block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 476091 flags 0x0(none) atime 1610373772.750632843 (2021-01-11 14:02:52) ctime 1617477826.205928110 (2021-04-03 19:23:46) mtime 1617477826.205928110 (2021-04-03 19:23:46) otime 0.0 (1970-01-01 00:00:00) item 5 key (5007 INODE_REF 4727) itemoff 15732 itemsize 28 index 0 namelen 0 name: index 0 namelen 0 name: index 0 namelen 294 name: item 6 key (5041 INODE_ITEM 0) itemoff 15572 itemsize 160 generation 295 transid 219603 size 4096 nbytes 4096 block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 321954 flags 0x0(none) atime 1610373832.763235044 (2021-01-11 14:03:52) ctime 1617477815.541863825 (2021-04-03 19:23:35) mtime 1617477815.541863825 (2021-04-03 19:23:35) otime 0.0 (1970-01-01 00:00:00) item 7 key (5041 INODE_REF 4727) itemoff 15544 itemsize 28 index 12 namelen 18 name: health_metrics.txt item 8 key (5041 EXTENT_DATA 0) itemoff 15491 itemsize 53 generation 219603 type 1 (regular) extent data disk byte 12746752 nr 4096 extent data offset 0 nr 4096 ram 4096 extent compression 0 (none) item 9 key (EXTENT_CSUM EXTENT_CSUM 12746752) itemoff 15487 itemsize 4 range start 12746752 end 12750848 length 4096
Re: read time tree block corruption detected
On 2021/4/19 下午10:56, Gervais, Francois wrote: My bad, wrong number. The correct number command is: # btrfs ins dump-tree -b 790151168 /dev/loop0p3 root@debug:~# btrfs ins dump-tree -b 790151168 /dev/loop0p3 btrfs-progs v5.7 [...] item 4 key (5007 INODE_ITEM 0) itemoff 15760 itemsize 160 generation 294 transid 219603 size 0 nbytes 18446462598731726987 The nbytes looks very strange. It's 0x0xfffeffef008b, which definitely looks aweful for an empty inode. block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 476091 flags 0x0(none) atime 1610373772.750632843 (2021-01-11 14:02:52) ctime 1617477826.205928110 (2021-04-03 19:23:46) mtime 1617477826.205928110 (2021-04-03 19:23:46) otime 0.0 (1970-01-01 00:00:00) item 5 key (5007 INODE_REF 4727) itemoff 15732 itemsize 28 index 0 namelen 0 name: index 0 namelen 0 name: index 0 namelen 294 name: Definitely corrupted. I'm afraid tree-checker is correct. The log tree is corrupted. And the check to detect such corrupted inode ref is only introduced in v5.5 kernel, no wonder v5.4 kernel didn't catch it at runtime. I don't have any idea why this could happen, as it doesn't look like an obvious memory flip. Maybe Filipe could have some clue on this? Thanks, Qu item 6 key (5041 INODE_ITEM 0) itemoff 15572 itemsize 160 generation 295 transid 219603 size 4096 nbytes 4096 block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 321954 flags 0x0(none) atime 1610373832.763235044 (2021-01-11 14:03:52) ctime 1617477815.541863825 (2021-04-03 19:23:35) mtime 1617477815.541863825 (2021-04-03 19:23:35) otime 0.0 (1970-01-01 00:00:00) item 7 key (5041 INODE_REF 4727) itemoff 15544 itemsize 28 index 12 namelen 18 name: health_metrics.txt item 8 key (5041 EXTENT_DATA 0) itemoff 15491 itemsize 53 generation 219603 type 1 (regular) extent data disk byte 12746752 nr 4096 extent data offset 0 nr 4096 ram 4096 extent compression 0 (none) item 9 key (EXTENT_CSUM EXTENT_CSUM 12746752) itemoff 15487 itemsize 4 range start 12746752 end 12750848 length 4096
Re: read time tree block corruption detected
> On 2021/4/19 下午10:56, Gervais, Francois wrote: >>> My bad, wrong number. >>> >>> The correct number command is: >>> # btrfs ins dump-tree -b 790151168 /dev/loop0p3 >> >> >> root@debug:~# btrfs ins dump-tree -b 790151168 /dev/loop0p3 >> btrfs-progs v5.7 > [...] >> item 4 key (5007 INODE_ITEM 0) itemoff 15760 itemsize 160 >> generation 294 transid 219603 size 0 nbytes >>18446462598731726987 > > The nbytes looks very strange. > > It's 0x0xfffeffef008b, which definitely looks aweful for an empty inode. > >> block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 >> sequence 476091 flags 0x0(none) >> atime 1610373772.750632843 (2021-01-11 14:02:52) >> ctime 1617477826.205928110 (2021-04-03 19:23:46) >> mtime 1617477826.205928110 (2021-04-03 19:23:46) >> otime 0.0 (1970-01-01 00:00:00) >> item 5 key (5007 INODE_REF 4727) itemoff 15732 itemsize 28 >> index 0 namelen 0 name: >> index 0 namelen 0 name: >> index 0 namelen 294 name: > > Definitely corrupted. I'm afraid tree-checker is correct. > > The log tree is corrupted. > And the check to detect such corrupted inode ref is only introduced in > v5.5 kernel, no wonder v5.4 kernel didn't catch it at runtime. Would detecting it at runtime with a newer kernel have helped in any way with the corruption? > > I don't have any idea why this could happen, as it doesn't look like an > obvious memory flip. The test engineer says that the last thing he did was remove power from the device. Could power loss be the cause of this issue? > > Maybe Filipe could have some clue on this? > > Thanks, > Qu > >> item 6 key (5041 INODE_ITEM 0) itemoff 15572 itemsize 160 >> generation 295 transid 219603 size 4096 nbytes 4096 >> block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 >> sequence 321954 flags 0x0(none) >> atime 1610373832.763235044 (2021-01-11 14:03:52) >> ctime 1617477815.541863825 (2021-04-03 19:23:35) >> mtime 1617477815.541863825 (2021-04-03 19:23:35) >> otime 0.0 (1970-01-01 00:00:00) >> item 7 key (5041 INODE_REF 4727) itemoff 15544 itemsize 28 >> index 12 namelen 18 name: health_metrics.txt >> item 8 key (5041 EXTENT_DATA 0) itemoff 15491 itemsize 53 >> generation 219603 type 1 (regular) >> extent data disk byte 12746752 nr 4096 >> extent data offset 0 nr 4096 ram 4096 >> extent compression 0 (none) >> item 9 key (EXTENT_CSUM EXTENT_CSUM 12746752) itemoff 15487 itemsize 4 >> range start 12746752 end 12750848 length 4096 >>
Re: read time tree block corruption detected
On 2021/4/20 下午10:19, Gervais, Francois wrote: On 2021/4/19 下午10:56, Gervais, Francois wrote: My bad, wrong number. The correct number command is: # btrfs ins dump-tree -b 790151168 /dev/loop0p3 root@debug:~# btrfs ins dump-tree -b 790151168 /dev/loop0p3 btrfs-progs v5.7 [...] item 4 key (5007 INODE_ITEM 0) itemoff 15760 itemsize 160 generation 294 transid 219603 size 0 nbytes 18446462598731726987 The nbytes looks very strange. It's 0x0xfffeffef008b, which definitely looks aweful for an empty inode. block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 476091 flags 0x0(none) atime 1610373772.750632843 (2021-01-11 14:02:52) ctime 1617477826.205928110 (2021-04-03 19:23:46) mtime 1617477826.205928110 (2021-04-03 19:23:46) otime 0.0 (1970-01-01 00:00:00) item 5 key (5007 INODE_REF 4727) itemoff 15732 itemsize 28 index 0 namelen 0 name: index 0 namelen 0 name: index 0 namelen 294 name: Definitely corrupted. I'm afraid tree-checker is correct. The log tree is corrupted. And the check to detect such corrupted inode ref is only introduced in v5.5 kernel, no wonder v5.4 kernel didn't catch it at runtime. Would detecting it at runtime with a newer kernel have helped in any way with the corruption? Yes, newer kernel will reject the write, so such damaged metadata won't reach disk. But that's just more graceful than corrupted fs. It will still cause error like transaction aborted. I don't have any idea why this could happen, as it doesn't look like an obvious memory flip. The test engineer says that the last thing he did was remove power from the device. Could power loss be the cause of this issue? It shouldn't. The log tree can only be exposed by power loss, but it's not designed to have such corrupted data on-disk. This normally means some code is wrong when generating log tree. Thanks, Qu Maybe Filipe could have some clue on this? Thanks, Qu item 6 key (5041 INODE_ITEM 0) itemoff 15572 itemsize 160 generation 295 transid 219603 size 4096 nbytes 4096 block group 0 mode 100600 links 1 uid 1000 gid 1000 rdev 0 sequence 321954 flags 0x0(none) atime 1610373832.763235044 (2021-01-11 14:03:52) ctime 1617477815.541863825 (2021-04-03 19:23:35) mtime 1617477815.541863825 (2021-04-03 19:23:35) otime 0.0 (1970-01-01 00:00:00) item 7 key (5041 INODE_REF 4727) itemoff 15544 itemsize 28 index 12 namelen 18 name: health_metrics.txt item 8 key (5041 EXTENT_DATA 0) itemoff 15491 itemsize 53 generation 219603 type 1 (regular) extent data disk byte 12746752 nr 4096 extent data offset 0 nr 4096 ram 4096 extent compression 0 (none) item 9 key (EXTENT_CSUM EXTENT_CSUM 12746752) itemoff 15487 itemsize 4 range start 12746752 end 12750848 length 4096 >