Re: FS unmountable after RAID/LVM problems
On 2018-03-14 18:06, Dirk Gouders wrote:
> Qu Wenruo writes:
>
>> On 2018-03-14 17:36, Dirk Gouders wrote:
>>> Qu Wenruo writes:
>>> [snip]
>>>
>>> I am currently preparing the diagnosis data, but after the above bytenr
>>> the log has already grown to 28MB. Should I send all that data to the list?
>>
>> Nope, stderr is enough.
>
> OK, I will attach the output to the end. The output is separated by the
> command lines, so searching for "inspect" helps for navigation.
>
> For extent root and fs root I provide only stderr, because their stdout
> grew by 28 MB and 150 MB, respectively.

Thanks for all your effort!

It's clear the problem is the extent tree.

I'll try to enhance open_ctree() to allow btrfs-restore to continue even
if the extent tree is corrupted asap.

Thanks,
Qu
Re: FS unmountable after RAID/LVM problems
On 2018-03-14 17:36, Dirk Gouders wrote:
> Qu Wenruo writes:
>
>> On 2018-03-14 16:53, Dirk Gouders wrote:
>> [snip]
>>
>> So it's still something important in the tree.
>>
>> Would you please apply this patch?
>> https://patchwork.kernel.org/patch/10281329/
>>
>> And then dump the tree again using that newly added -f option?
>> (Both stdout and stderr are needed)
>>
>> The dump command would be:
>> # btrfs inspect dump-tree -f -b
>>
>> Needed bytenrs would be:
>> 848773120   tree root
>> 848789504   extent root (My primary guess)
>
> I am currently preparing the diagnosis data, but after the above bytenr
> the log has already grown to 28MB. Should I send all that data to the list?

Nope, stderr is enough.

Thanks,
Qu

>
> Thanks,
>
> Dirk
>
>> 30408704    dev root
>> 850509824   fs root (this could contain *FILENAME*, please censor
>>             them if needed, and it may be large)
>> 212353024   uuid tree (not really important)
>>
>> And if it's extent root, we could enhance btrfs-progs open_ctree() to
>> handle it for RO mode (needed by btrfs-restore)
>>
>> Thanks,
>> Qu
Re: FS unmountable after RAID/LVM problems
Qu Wenruo writes:

> On 2018-03-14 16:53, Dirk Gouders wrote:
> [snip]
>
> So it's still something important in the tree.
>
> Would you please apply this patch?
> https://patchwork.kernel.org/patch/10281329/
>
> And then dump the tree again using that newly added -f option?
> (Both stdout and stderr are needed)
>
> The dump command would be:
> # btrfs inspect dump-tree -f -b
>
> Needed bytenrs would be:
> 848773120   tree root
> 848789504   extent root (My primary guess)

I am currently preparing the diagnosis data, but after the above bytenr
the log has already grown to 28MB. Should I send all that data to the list?

Thanks,

Dirk

> 30408704    dev root
> 850509824   fs root (this could contain *FILENAME*, please censor
>             them if needed, and it may be large)
> 212353024   uuid tree (not really important)
>
> And if it's extent root, we could enhance btrfs-progs open_ctree() to
> handle it for RO mode (needed by btrfs-restore)
>
> Thanks,
> Qu
Re: FS unmountable after RAID/LVM problems
On 2018-03-14 16:53, Dirk Gouders wrote:
> Qu Wenruo writes:
> [snip]
>>
>> Before that, would you please try the following patch and to see if it
>> helps btrfs-restore to salvage any data?
>
> I tried it and got the following output:
>
> # btrfs restore /dev/loop0p1 /mnt/
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found DC09290B wanted C630FD61
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> bytenr mismatch, want=363069440, have=17552567724568668829
> Could not open root, trying backup super
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found DC09290B wanted C630FD61
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> bytenr mismatch, want=363069440, have=17552567724568668829
> Could not open root, trying backup super
> ERROR: superblock bytenr 274877906944 is larger than device size 10741612544
> Could not open root, trying backup super

So it's still something important in the tree.

Would you please apply this patch?
https://patchwork.kernel.org/patch/10281329/

And then dump the tree again using that newly added -f option?
(Both stdout and stderr are needed)

The dump command would be:
# btrfs inspect dump-tree -f -b

Needed bytenrs would be:
848773120   tree root
848789504   extent root (My primary guess)
30408704    dev root
850509824   fs root (this could contain *FILENAME*, please censor
            them if needed, and it may be large)
212353024   uuid tree (not really important)

And if it's extent root, we could enhance btrfs-progs open_ctree() to
handle it for RO mode (needed by btrfs-restore)

Thanks,
Qu
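
Side note on the dump request above: the five dumps can be generated mechanically instead of being typed one by one. The following is only a minimal sketch. The bytenrs, tree names and /dev/loop0p1 are taken from the thread; the helper name dump_tree_commands and the output file names are made up for illustration, and -f is only available once the patch linked above has been applied.

```python
import shlex

# Bytenrs and tree names exactly as listed in the message above.
TREE_ROOTS = {
    "tree root": 848773120,
    "extent root": 848789504,
    "dev root": 30408704,
    "fs root": 850509824,
    "uuid tree": 212353024,
}


def dump_tree_commands(device, follow=True):
    """Return (tree name, argv) pairs, one dump-tree call per tree root.

    Hypothetical helper, not part of btrfs-progs.
    """
    commands = []
    for name, bytenr in TREE_ROOTS.items():
        argv = ["btrfs", "inspect", "dump-tree"]
        if follow:
            # Assumption: -f exists only with the patch from the thread applied.
            argv.append("-f")
        argv += ["-b", str(bytenr), device]
        commands.append((name, argv))
    return commands


if __name__ == "__main__":
    for name, argv in dump_tree_commands("/dev/loop0p1"):
        base = name.replace(" ", "_")
        # Keep stdout and stderr apart so either can be sent on its own.
        print(f"{shlex.join(argv)} > {base}.stdout 2> {base}.stderr")
```

The script only prints the command lines; running them is left to the reader, since the fs root dump may contain file names that need censoring first.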
Re: FS unmountable after RAID/LVM problems
Qu Wenruo writes:

> On 2018-03-13 22:49, Dirk Gouders wrote:
> [snip]
>>>
>>> # btrfs inspect dump-tree -b 848986112 /dev/loop0p1
>>> # btrfs inspect dump-tree -b 72089600 /dev/loop0p1
>>
>> OK.
>>
>> (This mail gets a bit long but I don't want to snip probably important
>> information above.)
>>
>
> Feel free to snip.
> As the involved tree block is not shown anywhere.
>
> So it's not any root node corrupted.
> It may be some extent tree node corrupted in this case.
>
> While to inspect it, we need some new functionality in btrfs inspect tree.
>
> Before that, would you please try the following patch and to see if it
> helps btrfs-restore to salvage any data?

I tried it and got the following output:

# btrfs restore /dev/loop0p1 /mnt/
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
checksum verify failed on 363069440 found DC09290B wanted C630FD61
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
bytenr mismatch, want=363069440, have=17552567724568668829
Could not open root, trying backup super
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
checksum verify failed on 363069440 found DC09290B wanted C630FD61
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
bytenr mismatch, want=363069440, have=17552567724568668829
Could not open root, trying backup super
ERROR: superblock bytenr 274877906944 is larger than device size 10741612544
Could not open root, trying backup super

Because the patch did not apply, I did the modification manually as follows:

# git diff
diff --git a/cmds-restore.c b/cmds-restore.c
index ade35f0f..e7b96a67 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -1282,7 +1282,7 @@ static struct btrfs_root *open_fs(const char *dev, u64 root_location,
 	for (i = super_mirror; i < BTRFS_SUPER_MIRROR_MAX; i++) {
 		bytenr = btrfs_sb_offset(i);
 		fs_info = open_ctree_fs_info(dev, bytenr, root_location, 0,
-					     OPEN_CTREE_PARTIAL);
+					     OPEN_CTREE_PARTIAL | __OPEN_CTREE_RETURN_CHUNK_ROOT);
 		if (fs_info)
 			break;
 		fprintf(stderr, "Could not open root, trying backup super\n");

Thanks,

Dirk
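
Side note on the last error in the restore output above: the loop in open_fs() tries one superblock copy per iteration, and the offsets it asks for follow a fixed pattern. The sketch below reproduces that arithmetic in Python. The constants are as I recall them from the btrfs headers, so treat them as assumptions; the function name mirrors btrfs_sb_offset() from the diff, while superblock_candidates and the simple "offset below device size" check are made up for illustration.

```python
# Constants as I recall them from the btrfs on-disk format headers (assumption).
BTRFS_SUPER_INFO_OFFSET = 64 * 1024   # primary superblock, matches bytenr=65536
BTRFS_SUPER_MIRROR_MAX = 3
BTRFS_SUPER_MIRROR_SHIFT = 12


def btrfs_sb_offset(mirror):
    """Byte offset of superblock copy `mirror` (0 is the primary)."""
    if mirror == 0:
        return BTRFS_SUPER_INFO_OFFSET
    return (16 * 1024) << (BTRFS_SUPER_MIRROR_SHIFT * mirror)


def superblock_candidates(device_size):
    """Yield (mirror, offset, fits) for every superblock copy.

    Hypothetical helper: `fits` is a simplified check, not the exact
    condition used by btrfs-progs.
    """
    for mirror in range(BTRFS_SUPER_MIRROR_MAX):
        offset = btrfs_sb_offset(mirror)
        yield mirror, offset, offset < device_size


if __name__ == "__main__":
    # 10741612544 is the device size reported in the error message above.
    for mirror, offset, fits in superblock_candidates(10741612544):
        print(mirror, offset, "ok" if fits else "beyond end of device")
```

Under these assumptions the third copy would sit at 274877906944 (256 GiB), which is the bytenr named in the error, so that message appears to be a consequence of the device being only about 10 GiB rather than a separate corruption.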
Re: FS unmountable after RAID/LVM problems
On 2018-03-13 22:49, Dirk Gouders wrote:
[snip]
>>
>> # btrfs inspect dump-tree -b 848986112 /dev/loop0p1
>> # btrfs inspect dump-tree -b 72089600 /dev/loop0p1
>
> OK.
>
> (This mail gets a bit long but I don't want to snip probably important
> information above.)
>

Feel free to snip.
As the involved tree block is not shown anywhere.

So it's not any root node corrupted.
It may be some extent tree node corrupted in this case.

While to inspect it, we need some new functionality in btrfs inspect tree.

Before that, would you please try the following patch and to see if it
helps btrfs-restore to salvage any data?

--
diff --git a/cmds-restore.c b/cmds-restore.c
index ade35f0f880f..a90379a1c7e8 100644
--- a/cmds-restore.c
+++ b/cmds-restore.c
@@ -1282,7 +1282,7 @@ static struct btrfs_root *open_fs(const char *dev, u64 root_location,
 	for (i = super_mirror; i < BTRFS_SUPER_MIRROR_MAX; i++) {
 		bytenr = btrfs_sb_offset(i);
 		fs_info = open_ctree_fs_info(dev, bytenr, root_location, 0,
-					     OPEN_CTREE_PARTIAL);
+					     OPEN_CTREE_PARTIAL | __OPEN_CTREE_RETURN_CHUNK_ROOT);
 		if (fs_info)
 			break;
 		fprintf(stderr, "Could not open root, trying backup super\n");
--

Thanks,
Qu

> # btrfs inspect dump-tree -b 848986112 /dev/loop0p1
> btrfs-progs v4.15
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found DC09290B wanted C630FD61
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> bytenr mismatch, want=363069440, have=17552567724568668829
> leaf 848986112 items 71 free space 3562 generation 9858294 owner 1
> leaf 848986112 flags 0x1(WRITTEN) backref revision 1
> fs uuid a6459a90-ebe3-4c75-97f4-5496eadcc96f
> chunk uuid 6f325a21-ce3e-4994-a638-b88ea82d504c
>     item 0 key (EXTENT_TREE ROOT_ITEM 0) itemoff 15844 itemsize 439
>         generation 9858294 root_dirid 0 bytenr 848789504 level 2 refs 1
>         lastsnap 0 byte_limit 0 bytes_used 16531456 flags 0x0(none)
>         uuid ----
>         drop key (0 UNKNOWN.0 0) level 0
>     item 1 key (DEV_TREE ROOT_ITEM 0) itemoff 15405 itemsize 439
>         generation 9855433 root_dirid 0 bytenr 30408704 level 0 refs 1
>         lastsnap 0 byte_limit 0 bytes_used 16384 flags 0x0(none)
>         uuid ----
>         drop key (0 UNKNOWN.0 0) level 0
>     item 2 key (FS_TREE INODE_REF 6) itemoff 15388 itemsize 17
>         index 0 namelen 7 name: default
>     item 3 key (FS_TREE ROOT_ITEM 0) itemoff 14949 itemsize 439
>         generation 9858293 root_dirid 256 bytenr 850509824 level 2 refs 1
>         lastsnap 263791 byte_limit 0 bytes_used 213549056 flags 0x0(none)
>         uuid ----
>         ctransid 9858293 otransid 0 stransid 0 rtransid 0
>         ctime 1519807754.11500 (2018-02-28 09:49:14)
>         drop key (0 UNKNOWN.0 0) level 0
>     item 4 key (FS_TREE ROOT_REF 257) itemoff 14924 itemsize 25
>         root ref key dirid 258 sequence 2 name i386-pc
>     item 5 key (FS_TREE ROOT_REF 258) itemoff 14896 itemsize 28
>         root ref key dirid 258 sequence 3 name x86_64-efi
>     item 6 key (FS_TREE ROOT_REF 259) itemoff 14875 itemsize 21
>         root ref key dirid 256 sequence 3 name opt
>     item 7 key (FS_TREE ROOT_REF 260) itemoff 14854 itemsize 21
>         root ref key dirid 256 sequence 4 name srv
>     item 8 key (FS_TREE ROOT_REF 261) itemoff 14833 itemsize 21
>         root ref key dirid 256 sequence 5 name tmp
>     item 9 key (FS_TREE ROOT_REF 262) itemoff 14810 itemsize 23
>         root ref key dirid 259 sequence 2 name local
>     item 10 key (FS_TREE ROOT_REF 263) itemoff 14787 itemsize 23
>         root ref key dirid 260 sequence 2 name crash
>     item 11 key (FS_TREE ROOT_REF 264) itemoff 14762 itemsize 25
>         root ref key dirid 261 sequence 2 name mailman
>     item 12 key (FS_TREE ROOT_REF 265) itemoff 14739 itemsize 23
>         root ref key dirid 261 sequence 3 name named
>     item 13 key (FS_TREE ROOT_REF 266) itemoff 14716 itemsize 23
>         root ref key dirid 261 sequence 4 name pgsql
>     item 14 key (FS_TREE ROOT_REF 267) itemoff 14695 itemsize 21
>         root ref key dirid 260 sequence 4 name log
>     item 15 key (FS_TREE ROOT_REF 268) itemoff 14674 itemsize 21
>         root ref key dirid 260 sequence 5 name opt
>     item 16 key (FS_TREE ROOT_REF 269) itemoff 14651 itemsize 23
>         root ref key dirid 260 sequence 6
Re: FS unmountable after RAID/LVM problems
Qu Wenruo writes:

> On 2018-03-13 22:21, Dirk Gouders wrote:
>> Qu Wenruo writes:
>> [snip]
>>>
>>> So along with "dump-super -f" you could also try "btrfs inspect
>>> dump-tree -b "
>>> Where the number could be any "*_root" value.
>>> Like 20971520 (chunk_root) and 848773120 (root) to see if it works.
>>
>> I am a bit lost ;-) I translated "*_root value" to using both numbers,
>> please let me know if I should also use some other numbers.
>>
>> The version of btrfs-progs is 4.15.
>>
>> And here is more requested information:
>>
>> # btrfs inspect dump-super -f /dev/loop0p1
>> superblock: bytenr=65536, device=/dev/loop0p1
>> [snip]
>> sys_chunk_array[2048]:
>>     item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)
>>         length 4194304 owner 2 stripe_len 65536 type SYSTEM
>>         io_align 4096 io_width 4096 sector_size 4096
>>         num_stripes 1 sub_stripes 0
>>             stripe 0 devid 1 offset 0
>>
Re: FS unmountable after RAID/LVM problems
On 2018-03-13 22:21, Dirk Gouders wrote:
> Qu Wenruo writes:
> [snip]
>>
>> So along with "dump-super -f" you could also try "btrfs inspect
>> dump-tree -b "
>> Where the number could be any "*_root" value.
>> Like 20971520 (chunk_root) and 848773120 (root) to see if it works.
>
> I am a bit lost ;-) I translated "*_root value" to using both numbers,
> please let me know if I should also use some other numbers.
>
> The version of btrfs-progs is 4.15.
>
> And here is more requested information:
>
> # btrfs inspect dump-super -f /dev/loop0p1
> superblock: bytenr=65536, device=/dev/loop0p1
> [snip]
> sys_chunk_array[2048]:
>     item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)
>         length 4194304 owner 2 stripe_len 65536 type SYSTEM
>         io_align 4096 io_width 4096 sector_size 4096
>         num_stripes 1 sub_stripes 0
>             stripe 0 devid 1 offset 0
>             dev_uuid b92bd216-a0bb-467d-8f8f-788f845af30c

An old btrfs made by old mkfs.btrfs.
Which still has the temporary chunk.

>     item 1 key (FIRST_CHUNK_TREE
Re: FS unmountable after RAID/LVM problems
Qu Wenruo writes:

> On 2018-03-13 21:29, Dirk Gouders wrote:
> [snip]
>>
>> Yes, here is the requested output:
>
> Sorry, I forgot to add "-f" parameter for dump-super.
>
> So what we need is "btrfs inspect dump-super -f /dev/loop0p1".
>
>
> And, what's the version of btrfs-progs?
> IIRC the latest version of btrfs-progs has loosened the restriction on
> essential trees for "btrfs inspect dump-tree" if using '-b' option.
>
> So along with "dump-super -f" you could also try "btrfs inspect
> dump-tree -b "
> Where the number could be any "*_root" value.
> Like 20971520 (chunk_root) and 848773120 (root) to see if it works.

I am a bit lost ;-) I translated "*_root value" to using both numbers,
please let me know if I should also use some other numbers.

The version of btrfs-progs is 4.15.

And here is more requested information:

# btrfs inspect dump-super -f /dev/loop0p1
superblock: bytenr=65536, device=/dev/loop0p1
---------------------------------------------------------
csum_type               0 (crc32c)
csum_size               4
csum                    0x31998c61 [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    a6459a90-ebe3-4c75-97f4-5496eadcc96f
label
generation              9858294
root                    848773120
sys_array_size          226
chunk_root_generation   18814
root_level              1
chunk_root              20971520
chunk_root_level        0
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             10741612544
bytes_used              9141452800
sectorsize              4096
nodesize                16384
leafsize (deprecated)   16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x61
                        ( MIXED_BACKREF |
                          BIG_METADATA |
                          EXTENDED_IREF )
cache_generation        9858294
uuid_tree_generation    9824396
dev_item.uuid           b92bd216-a0bb-467d-8f8f-788f845af30c
dev_item.fsid           a6459a90-ebe3-4c75-97f4-5496eadcc96f [match]
dev_item.type           0
dev_item.total_bytes    10741612544
dev_item.bytes_used     10741612544
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0
sys_chunk_array[2048]:
    item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 0)
        length 4194304 owner 2 stripe_len 65536 type SYSTEM
        io_align 4096 io_width 4096 sector_size 4096
        num_stripes 1 sub_stripes 0
            stripe 0 devid 1 offset 0
            dev_uuid b92bd216-a0bb-467d-8f8f-788f845af30c
    item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 20971520)
        length 8388608 owner 2 stripe_len 65536 type SYSTEM|DUP
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 2 sub_stripes 0
            stripe 0 devid 1 offset 20971520
            dev_uuid
Re: FS unmountable after RAID/LVM problems
On 2018年03月13日 21:29, Dirk Gouders wrote: > Qu Wenruowrites: > >> On 2018年03月13日 21:01, Dirk Gouders wrote: >>> Qu Wenruo writes: >>> On 2018年03月13日 16:53, Dirk Gouders wrote: >>> >>> >>> > find-root: > > # btrfs-find-root /dev/loop0p1 > Superblock thinks the generation is 9858294 > Superblock thinks the level is 1 > Found tree root at 848773120 gen 9858294 level 1 Tree root is found, find-root won't help much here. And if it's really tree root corruption, we should have some kernel message for it. > Well block 832045056(gen: 9858272 level: 1) seems good, but > generation/level doesn't match, want gen: 9858294 level: 1 Especially when the next tree block is 22 generation older. Would you please try to call "btrfs inspect dump-tree " and paste the result with *stderr*? At least we could know which tree block is corrupted. >>> >>> Here is the result of inspect: >>> >>> # btrfs inspect dump-tree /dev/loop0p1 >>> btrfs-progs v4.15 >>> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D >>> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D >>> checksum verify failed on 363069440 found DC09290B wanted C630FD61 >>> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D >>> bytenr mismatch, want=363069440, have=17552567724568668829 >>> ERROR: unable to open /dev/loop0p1 >> >> OK, one tree block in some important tree is corrupted. >> >> Would you please dump the super block by "btrfs inspect dump-super >> " so that we could have some clue about where the corrupted tree >> block belongs? > > Yes, here is the requested output: Sorry, I forgot to add "-f" parameter for dump-super. So what we need is "btrfs inspect dump-super -f /dev/loop0p1". And, what's the version of btrfs-progs? IIRC the latest version of btrfs-progs has loosen the restriction on essential trees for "btrfs inspect dump-tree" if using '-b' option. 
So along with "dump-super -f" you could also try "btrfs inspect
dump-tree -b "
Where the number could be any "*_root" value.
Like 20971520 (chunk_root) and 848773120 (root) to see if it works.

Thanks,
Qu

>
> # btrfs inspect dump-super /dev/loop0p1
> superblock: bytenr=65536, device=/dev/loop0p1
> ---------------------------------------------------------
> csum_type               0 (crc32c)
> csum_size               4
> csum                    0x31998c61 [match]
> bytenr                  65536
> flags                   0x1
>                         ( WRITTEN )
> magic                   _BHRfS_M [match]
> fsid                    a6459a90-ebe3-4c75-97f4-5496eadcc96f
> label
> generation              9858294
> root                    848773120
> sys_array_size          226
> chunk_root_generation   18814
> root_level              1
> chunk_root              20971520
> chunk_root_level        0
> log_root                0
> log_root_transid        0
> log_root_level          0
> total_bytes             10741612544
> bytes_used              9141452800
> sectorsize              4096
> nodesize                16384
> leafsize (deprecated)   16384
> stripesize              4096
> root_dir                6
> num_devices             1
> compat_flags            0x0
> compat_ro_flags         0x0
> incompat_flags          0x61
>                         ( MIXED_BACKREF |
>                           BIG_METADATA |
>                           EXTENDED_IREF )
> cache_generation        9858294
> uuid_tree_generation    9824396
> dev_item.uuid           b92bd216-a0bb-467d-8f8f-788f845af30c
> dev_item.fsid           a6459a90-ebe3-4c75-97f4-5496eadcc96f [match]
> dev_item.type           0
> dev_item.total_bytes    10741612544
> dev_item.bytes_used     10741612544
> dev_item.io_align       4096
> dev_item.io_width       4096
> dev_item.sector_size    4096
> dev_item.devid          1
> dev_item.dev_group      0
> dev_item.seek_speed     0
> dev_item.bandwidth      0
> dev_item.generation     0
>
>
> Thanks,
>
> Dirk
>
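
The "*_root" values mentioned in this message can be picked out of the dump-super output mechanically instead of by eye. What follows is only a minimal sketch, assuming a POSIX shell with awk; the helper name list_roots is hypothetical and not part of btrfs-progs, and the commented usage assumes the same /dev/loop0p1 working copy used in this thread.

```shell
# list_roots (hypothetical helper, not part of btrfs-progs):
# read "btrfs inspect dump-super" output on stdin and print each
# "<name> <bytenr>" pair whose field name is "root" or ends in "_root",
# skipping entries whose value is 0 (for example an unused log_root).
list_roots() {
    awk '$1 ~ /(^|_)root$/ && $2 ~ /^[0-9]+$/ && $2 != "0" { print $1, $2 }'
}

# Usage sketch (assumes a working copy of the damaged device):
#   btrfs inspect dump-super -f /dev/loop0p1 | list_roots |
#   while read -r name bytenr; do
#       # keep stdout and stderr apart, since stderr is what was asked for
#       btrfs inspect dump-tree -b "$bytenr" /dev/loop0p1 \
#           >"dump-$name.out" 2>"dump-$name.err"
#   done
```

Fed the superblock dump quoted in this message, the helper should print "root 848773120" and "chunk_root 20971520", which are the two numbers suggested for "dump-tree -b".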
Re: FS unmountable after RAID/LVM problems
Qu Wenruo writes:

> On 2018-03-13 21:01, Dirk Gouders wrote:
>> Qu Wenruo writes:
>>
>>> On 2018-03-13 16:53, Dirk Gouders wrote:
>>
>>>> find-root:
>>>>
>>>> # btrfs-find-root /dev/loop0p1
>>>> Superblock thinks the generation is 9858294
>>>> Superblock thinks the level is 1
>>>> Found tree root at 848773120 gen 9858294 level 1
>>>
>>> Tree root is found, find-root won't help much here.
>>> And if it's really tree root corruption, we should have some kernel
>>> message for it.
>>>
>>>> Well block 832045056(gen: 9858272 level: 1) seems good, but
>>>> generation/level doesn't match, want gen: 9858294 level: 1
>>>
>>> Especially when the next tree block is 22 generations older.
>>>
>>> Would you please try to call "btrfs inspect dump-tree " and
>>> paste the result with *stderr*?
>>>
>>> At least we could know which tree block is corrupted.
>>
>> Here is the result of inspect:
>>
>> # btrfs inspect dump-tree /dev/loop0p1
>> btrfs-progs v4.15
>> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
>> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
>> checksum verify failed on 363069440 found DC09290B wanted C630FD61
>> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
>> bytenr mismatch, want=363069440, have=17552567724568668829
>> ERROR: unable to open /dev/loop0p1
>
> OK, one tree block in some important tree is corrupted.
>
> Would you please dump the super block by "btrfs inspect dump-super
> " so that we could have some clue about where the corrupted tree
> block belongs?
Yes, here is the requested output:

# btrfs inspect dump-super /dev/loop0p1
superblock: bytenr=65536, device=/dev/loop0p1
---------------------------------------------------------
csum_type               0 (crc32c)
csum_size               4
csum                    0x31998c61 [match]
bytenr                  65536
flags                   0x1
                        ( WRITTEN )
magic                   _BHRfS_M [match]
fsid                    a6459a90-ebe3-4c75-97f4-5496eadcc96f
label
generation              9858294
root                    848773120
sys_array_size          226
chunk_root_generation   18814
root_level              1
chunk_root              20971520
chunk_root_level        0
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             10741612544
bytes_used              9141452800
sectorsize              4096
nodesize                16384
leafsize (deprecated)   16384
stripesize              4096
root_dir                6
num_devices             1
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0x61
                        ( MIXED_BACKREF |
                          BIG_METADATA |
                          EXTENDED_IREF )
cache_generation        9858294
uuid_tree_generation    9824396
dev_item.uuid           b92bd216-a0bb-467d-8f8f-788f845af30c
dev_item.fsid           a6459a90-ebe3-4c75-97f4-5496eadcc96f [match]
dev_item.type           0
dev_item.total_bytes    10741612544
dev_item.bytes_used     10741612544
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

Thanks,

Dirk
Re: FS unmountable after RAID/LVM problems
On 2018-03-13 21:01, Dirk Gouders wrote:
> Qu Wenruo writes:
>
>> On 2018-03-13 16:53, Dirk Gouders wrote:
>
>>> find-root:
>>>
>>> # btrfs-find-root /dev/loop0p1
>>> Superblock thinks the generation is 9858294
>>> Superblock thinks the level is 1
>>> Found tree root at 848773120 gen 9858294 level 1
>>
>> Tree root is found, find-root won't help much here.
>> And if it's really tree root corruption, we should have some kernel
>> message for it.
>>
>>> Well block 832045056(gen: 9858272 level: 1) seems good, but
>>> generation/level doesn't match, want gen: 9858294 level: 1
>>
>> Especially when the next tree block is 22 generations older.
>>
>> Would you please try to call "btrfs inspect dump-tree " and
>> paste the result with *stderr*?
>>
>> At least we could know which tree block is corrupted.
>
> Here is the result of inspect:
>
> # btrfs inspect dump-tree /dev/loop0p1
> btrfs-progs v4.15
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found DC09290B wanted C630FD61
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> bytenr mismatch, want=363069440, have=17552567724568668829
> ERROR: unable to open /dev/loop0p1

OK, one tree block in some important tree is corrupted.

Would you please dump the super block by "btrfs inspect dump-super "
so that we could have some clue about where the corrupted tree block
belongs?

Thanks,
Qu

>
> Thanks,
>
> Dirk
>
Re: FS unmountable after RAID/LVM problems
Qu Wenruo writes:

> On 2018-03-13 16:53, Dirk Gouders wrote:
>> find-root:
>>
>> # btrfs-find-root /dev/loop0p1
>> Superblock thinks the generation is 9858294
>> Superblock thinks the level is 1
>> Found tree root at 848773120 gen 9858294 level 1
>
> Tree root is found, find-root won't help much here.
> And if it's really tree root corruption, we should have some kernel
> message for it.
>
>> Well block 832045056(gen: 9858272 level: 1) seems good, but generation/level
>> doesn't match, want gen: 9858294 level: 1
>
> Especially when the next tree block is 22 generations older.
>
> Would you please try to call "btrfs inspect dump-tree " and
> paste the result with *stderr*?
>
> At least we could know which tree block is corrupted.

Here is the result of inspect:

# btrfs inspect dump-tree /dev/loop0p1
btrfs-progs v4.15
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
checksum verify failed on 363069440 found DC09290B wanted C630FD61
checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
bytenr mismatch, want=363069440, have=17552567724568668829
ERROR: unable to open /dev/loop0p1

Thanks,

Dirk
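
Because the request here is specifically for stderr, and the same checksum line repeats several times, it can help to condense that stream before pasting it. The following is a rough sketch only, assuming a POSIX shell with awk and sort; the function name summarize_csum_errors is hypothetical, and the parsing relies on the "checksum verify failed on N found X wanted Y" wording seen in this thread from btrfs-progs v4.15, which other versions may phrase differently.

```shell
# summarize_csum_errors (hypothetical helper): read btrfs-progs stderr on
# stdin and count identical "checksum verify failed" lines, keyed by
# bytenr, found checksum and wanted checksum. Other lines are ignored.
summarize_csum_errors() {
    awk '/^checksum verify failed on/ {
             count[$5 " found " $7 " wanted " $9]++
         }
         END { for (k in count) print count[k], k }' | sort -rn
}

# Usage sketch: send stderr into the pipe and discard stdout.
#   btrfs inspect dump-tree /dev/loop0p1 2>&1 >/dev/null | summarize_csum_errors
```

For the six stderr lines shown in this message it would report the 296FB15A/F0AFE59D pair three times and the DC09290B/C630FD61 pair once, all for bytenr 363069440.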
Re: FS unmountable after RAID/LVM problems
On 2018-03-13 16:53, Dirk Gouders wrote:
> Hello all,
>
> a somewhat aged RAID array (16 disks) got into trouble after it had
> been powered off because of facility management maintenance tasks.
>
> It then went through some rebuilds, losing three disks on the way, and
> the whole procedure ended with corrupted volumes. Volumes with
> ext{2,4} filesystems could be fsck'ed and corresponding VMs then
> started but a volume with a (probably) BTRFS partition I am not able
> to get very far with. I got no information what filesystems were used
> on the corresponding VM but I knew it was an openSUSE system and
> file(1) told me:
>
> # file -s /dev/loop0p1
> /dev/loop0p1: BTRFS Filesystem sectorsize 4096, nodesize 16384, leafsize
> 16384, UUID=a6459a90-ebe3-4c75-97f4-5496eadcc96f, 9141452800/10741612544
> bytes used, 1 devices
>
> so I am somewhat sure that it was a BTRFS.
>
> I tried to use some tools on copies of the volume data and see messages
> concerning invalid checksums as well as ones of bad tree block starts
> and I'd like to understand what the main issue of that FS might be.
>
> I'll try to present some information and because I worked only on copies
> of the corrupted data, I can provide more information or tests on
> request. The kernel on the machine I use for diagnosis is
> 4.16.0-rc5-4-gfc6eabbbf8ef.
>
> Mounting:
>
> # mount /dev/loop0p1 /mnt/
> mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0p1,
> missing codepage or helper program, or other error.
>
> dmesg(1) says:
>
> [ 176.479080] BTRFS: device fsid a6459a90-ebe3-4c75-97f4-5496eadcc96f devid
> 1 transid 9858294 /dev/loop0p1
> [ 186.909100] BTRFS info (device loop0p1): disk space caching is enabled
> [ 186.990090] BTRFS error (device loop0p1): bad tree block start
> 2163788338953595011 212353024
> [ 186.996331] BTRFS error (device loop0p1): bad tree block start
> 8619112249313723677 212353024

Logical tree block 212353024 is corrupted.
No copy has the correct bytenr.

> [ 187.044482] BTRFS error (device loop0p1): open_ctree failed

Some corruption happened without a corresponding kernel message.

>
> find-root:
>
> # btrfs-find-root /dev/loop0p1
> Superblock thinks the generation is 9858294
> Superblock thinks the level is 1
> Found tree root at 848773120 gen 9858294 level 1

Tree root is found, find-root won't help much here.
And if it's really tree root corruption, we should have some kernel
message for it.

> Well block 832045056(gen: 9858272 level: 1) seems good, but generation/level
> doesn't match, want gen: 9858294 level: 1

Especially when the next tree block is 22 generations older.

Would you please try to call "btrfs inspect dump-tree " and
paste the result with *stderr*?

At least we could know which tree block is corrupted.

Thanks,
Qu

> Well block 831799296(gen: 9858271 level: 1) seems good, but generation/level
> doesn't match, want gen: 9858294 level: 1
> Well block 831520768(gen: 9858270 level: 1) seems good, but generation/level
> doesn't match, want gen: 9858294 level: 1
>
> ...several similar lines that differ only in the block and gen, the
> last two lines differ a bit more:
>
> Well block 72089600(gen: 9728190 level: 0) seems good, but generation/level
> doesn't match, want gen: 9858294 level: 1
> Well block 4243456(gen: 3 level: 0) seems good, but generation/level doesn't
> match, want gen: 9858294 level: 1
> Well block 4194304(gen: 2 level: 0) seems good, but generation/level doesn't
> match, want gen: 9858294 level: 1
>
> When I then try a restore with the first block # of the previous command:
>
> # btrfs restore -t 832045056 -D /dev/loop0p1 /mnt/btrfs/
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> Ignoring transid failure
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found DC09290B wanted C630FD61
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> bytenr mismatch, want=363069440, have=17552567724568668829
> Could not open root, trying backup super
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> parent transid verify failed on 832045056 wanted 9858294 found 9858272
> Ignoring transid failure
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> checksum verify failed on 363069440 found DC09290B wanted C630FD61
> checksum verify failed on 363069440 found 296FB15A wanted F0AFE59D
> bytenr mismatch, want=363069440,