Re: So, does btrfs check lowmem take days? weeks?
On 07/10/2018 06:53 PM, Su Yue wrote: On 07/10/2018 12:10 PM, Marc MERLIN wrote: On Tue, Jul 10, 2018 at 08:56:15AM +0800, Su Yue wrote:

I'm just not clear if my FS is still damaged and btrfsck was just hacked to ignore the damage it can't deal with, or whether it was able to repair things to a consistent state. The fact that I can mount read/write with no errors seems like a good sign.

Yes, a good sign. Since the extent tree is fixed, the errors left are in other trees. The worst result I can see is that writes to some files will report I/O errors. This is the cost of RW.

Ok, so we agreed that btrfs scrub won't find this, so ultimately I should run normal btrfsck --repair without the special block skip code you added?

Yes. Here is the normal btrfsck which skips the extent tree to save time. And I fixed a bug which is mentioned in another mail by Qu. I have no time to add progress reporting for the fs trees check, though.
https://github.com/Damenly/btrfs-progs/tree/tmp1
It may take a long time to fix the unresolved errors.
# ./btrfsck -e 2 --mode=lowmem --repair $dev
'-e' means to skip the extent tree. Here is the mail. Running the above command should solve the errors. If no other errors occur, your FS will be good. Please do not run --repair from the master branch :(. It will ruin everything we did in recent days.
Thanks, Su

Thanks Su. Since I can mount the filesystem read/write though, I can probably delete a lot of snapshots to help the next fsck run. I assume the number of snapshots also affects the amount of memory taken by regular fsck, so maybe if I delete enough of them, regular fsck --repair will work again?
Thanks, Marc
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
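Since Marc's plan is to delete snapshots in bulk before the next fsck, here is a minimal Python sketch of how one might pick which snapshots to prune from `btrfs subvolume list` output, keeping the newest N per backup target. The timestamp-suffix naming scheme is purely an assumption (adjust it to your own layout), and nothing here actually calls `btrfs subvolume delete`:

```python
# Sketch: choose snapshots to delete, keeping the newest `keep` per target.
# Assumes snapshot paths end in a sortable timestamp suffix such as
# "backup/home_ro.20180701_020000" -- a hypothetical naming scheme.
from collections import defaultdict

def snapshots_to_delete(subvol_paths, keep=10):
    groups = defaultdict(list)
    for path in subvol_paths:
        base, sep, stamp = path.rpartition('.')
        if base:                       # skip subvolumes without a '.' suffix
            groups[base].append(path)
    doomed = []
    for base, snaps in groups.items():
        snaps.sort()                   # timestamp suffixes sort chronologically
        doomed.extend(snaps[:-keep])   # everything but the newest `keep`
    return doomed

if __name__ == "__main__":
    paths = [f"backup/home_ro.2018070{i}_020000" for i in range(1, 10)]
    # keep=2: the 7 oldest of the 9 snapshots are selected for deletion
    for p in snapshots_to_delete(paths, keep=2):
        print(p)
```

The output could then be fed, after manual review, to `btrfs subvolume delete` one snapshot at a time; as noted later in the thread, deletion is delayed and also needs a healthy extent tree.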
Re: So, does btrfs check lowmem take days? weeks?
On 07/10/2018 12:55 PM, Qu Wenruo wrote: On 2018年07月10日 11:50, Marc MERLIN wrote: On Tue, Jul 10, 2018 at 09:34:36AM +0800, Qu Wenruo wrote:

Ok, this is where I am now:
WARNING: debug: end of checking extent item[18457780273152 169 1] type: 176 offset: 2
checking extent items [18457780273152/18457780273152]
ERROR: errors found in extent allocation tree or chunk allocation
checking fs roots
ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: EXTENT_DATA[25937109 4033]

The expected end is not even aligned to sectorsize. I think there is something wrong. Dump tree on this INODE would definitely help in this case. Marc, would you please try dump using the following command?

# btrfs ins dump-tree -t 17592 | grep -C 40 25937109

Sure, there you go:
gargamel:~# btrfs ins dump-tree -t 17592 /dev/mapper/dshelf2 | grep -C 40 25937109
[snip]
item 30 key (25937109 INODE_ITEM 0) itemoff 13611 itemsize 160
generation 137680 transid 137680 size 85312 nbytes 85953
block group 0 mode 100644 links 1 uid 500 gid 500 rdev 0
sequence 253 flags 0x0(none)
atime 1529023177.0 (2018-06-14 17:39:37)
ctime 1529023181.625870411 (2018-06-14 17:39:41)
mtime 1528885147.0 (2018-06-13 03:19:07)
otime 1529023159.138139719 (2018-06-14 17:39:19)
item 31 key (25937109 INODE_REF 14354867) itemoff 13559 itemsize 52
index 33627 namelen 42 name: thumb1024_112_DiveB-1_Oslob_Whaleshark.jpg
item 32 key (25937109 EXTENT_DATA 0) itemoff 11563 itemsize 1996
generation 137680 type 0 (inline)
inline extent data size 1975 ram_bytes 4033 compression 2 (lzo)
item 33 key (25937109 EXTENT_DATA 4033) itemoff 11510 itemsize 53
generation 143349 type 1 (regular)
extent data disk byte 0 nr 0
extent data offset 0 nr 63 ram 63
extent compression 0 (none)

OK, this seems to be caused by btrfs check --repair (according to the generation difference).

Yes, this bug is due to old kernel behavior. I fixed it in the new version. Thanks, Su

So at least no data loss is caused in terms of on-disk data.
However, I'm not sure if the kernel can handle it. Please try to read it with caution, and see if the kernel can handle it. (I assume that with the latest kernel, the tree-checker would detect it and refuse to read it.) This needs some fix in btrfs check. Thanks, Qu

item 34 key (25937109 EXTENT_DATA 4096) itemoff 11457 itemsize 53
generation 137680 type 1 (regular)
extent data disk byte 1286516736 nr 4096
extent data offset 0 nr 4096 ram 4096
extent compression 0 (none)
item 35 key (25937109 EXTENT_DATA 8192) itemoff 11404 itemsize 53
generation 137680 type 1 (regular)
extent data disk byte 1286520832 nr 8192
extent data offset 0 nr 12288 ram 12288
extent compression 2 (lzo)
item 36 key (25937109 EXTENT_DATA 20480) itemoff 11351 itemsize 53
generation 137680 type 1 (regular)
extent data disk byte 4199424000 nr 65536
extent data offset 0 nr 65536 ram 65536
extent compression 0 (none)
Re: So, does btrfs check lowmem take days? weeks?
On 2018年07月10日 11:50, Marc MERLIN wrote:
> On Tue, Jul 10, 2018 at 09:34:36AM +0800, Qu Wenruo wrote:
>> Ok, this is where I am now:
>> WARNING: debug: end of checking extent item[18457780273152 169 1]
>> type: 176 offset: 2
>> checking extent items [18457780273152/18457780273152]
>> ERROR: errors found in extent allocation tree or chunk allocation
>> checking fs roots
>> ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected:
>> EXTENT_DATA[25937109 4033]
>>
>> The expected end is not even aligned to sectorsize.
>>
>> I think there is something wrong.
>> Dump tree on this INODE would definitely help in this case.
>>
>> Marc, would you please try dump using the following command?
>>
>> # btrfs ins dump-tree -t 17592 | grep -C 40 25937109
>
> Sure, there you go:
> gargamel:~# btrfs ins dump-tree -t 17592 /dev/mapper/dshelf2 | grep -C 40 25937109
[snip]
> item 30 key (25937109 INODE_ITEM 0) itemoff 13611 itemsize 160
> generation 137680 transid 137680 size 85312 nbytes 85953
> block group 0 mode 100644 links 1 uid 500 gid 500 rdev 0
> sequence 253 flags 0x0(none)
> atime 1529023177.0 (2018-06-14 17:39:37)
> ctime 1529023181.625870411 (2018-06-14 17:39:41)
> mtime 1528885147.0 (2018-06-13 03:19:07)
> otime 1529023159.138139719 (2018-06-14 17:39:19)
> item 31 key (25937109 INODE_REF 14354867) itemoff 13559 itemsize 52
> index 33627 namelen 42 name: thumb1024_112_DiveB-1_Oslob_Whaleshark.jpg
> item 32 key (25937109 EXTENT_DATA 0) itemoff 11563 itemsize 1996
> generation 137680 type 0 (inline)
> inline extent data size 1975 ram_bytes 4033 compression 2 (lzo)
> item 33 key (25937109 EXTENT_DATA 4033) itemoff 11510 itemsize 53
> generation 143349 type 1 (regular)
> extent data disk byte 0 nr 0
> extent data offset 0 nr 63 ram 63
> extent compression 0 (none)

OK this seems to be caused by btrfs check --repair (according to the generation difference). So at least no data loss is caused in terms of on-disk data. However I'm not sure if the kernel can handle it.
Please try to read it with caution, and see if the kernel can handle it. (I assume that with the latest kernel, the tree-checker would detect it and refuse to read it.) This needs some fix in btrfs check. Thanks, Qu

> item 34 key (25937109 EXTENT_DATA 4096) itemoff 11457 itemsize 53
> generation 137680 type 1 (regular)
> extent data disk byte 1286516736 nr 4096
> extent data offset 0 nr 4096 ram 4096
> extent compression 0 (none)
> item 35 key (25937109 EXTENT_DATA 8192) itemoff 11404 itemsize 53
> generation 137680 type 1 (regular)
> extent data disk byte 1286520832 nr 8192
> extent data offset 0 nr 12288 ram 12288
> extent compression 2 (lzo)
> item 36 key (25937109 EXTENT_DATA 20480) itemoff 11351 itemsize 53
> generation 137680 type 1 (regular)
> extent data disk byte 4199424000 nr 65536
> extent data offset 0 nr 65536 ram 65536
> extent compression 0 (none)
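For the curious, the arithmetic behind Qu's "not even aligned to sectorsize" remark can be sketched in a few lines. The inline extent above covers ram_bytes 4033; its raw end (4033) is the unaligned value the checker printed as "expected", while the sector-aligned end (4096) is where the next real extent, item 34, actually starts. This is my own illustration assuming a 4096-byte sectorsize; the helper names are not from btrfs-progs:

```python
# Illustration of the offsets in the dump above (not btrfs-progs code).
SECTORSIZE = 4096  # assumed; matches the 4096-multiple keys in the dump

def align_up(n: int, a: int = SECTORSIZE) -> int:
    """Round n up to the next multiple of a."""
    return -(-n // a) * a

inline_ram_bytes = 4033          # item 32: inline extent, ram_bytes 4033
raw_end = 0 + inline_ram_bytes   # 4033 -- the unaligned "expected" value
aligned_end = align_up(raw_end)  # 4096 -- where item 34 really begins

print(raw_end, aligned_end)      # prints: 4033 4096
```

The repaired item 33 keyed at 4033 sits exactly at that unaligned boundary, which is why Qu suspects the repair code rather than the data itself.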
Re: So, does btrfs check lowmem take days? weeks?
To fill in for the spectators on the list :) Su gave me a modified version of btrfsck lowmem that was able to clean most of my filesystem. It's not a general-case solution since it had some hardcoding specific to my filesystem's problems, but it is still a great success. Email quoted below, along with responses to Qu.

On Tue, Jul 10, 2018 at 09:09:33AM +0800, Qu Wenruo wrote:
> On 2018年07月10日 01:48, Marc MERLIN wrote:
> > Success! Well done Su, this is a huge improvement to the lowmem code. It went from days to less than 3 hours. Awesome work!
> > I'll paste the logs below.
> >
> > Questions:
> > 1) I assume I first need to delete a lot of snapshots. What is the limit in your opinion? 100? 150? other?
>
> My personal recommendation is just 20. Not 150, not even 100.

I see. Then I may be forced to recreate multiple filesystems anyway. I have about 25 btrfs send/receive relationships, and I have around 10 historical snapshots for each. In the future, can't we segment extents/snapshots per subvolume, making subvolumes mini filesystems within the bigger filesystem?

> But snapshot deletion will take time (and it's delayed; you won't know if something went wrong just after "btrfs subv delete") and it even requires a healthy extent tree.
> If all the extent tree errors are just false alerts, that should not be a big problem at all.
>
> > 2) my filesystem is somewhat misbalanced. Which balance options do you think are safe to use?
>
> I would recommend manually checking the extent tree for BLOCK_GROUP_ITEM, which will tell you how big a block group is and how much space is used, and gives you an idea of which block groups can be relocated.
> Then use vrange= to specify the exact block group to relocate.
> One example would be:
>
> # btrfs ins dump-tree -t extent | grep -A1 BLOCK_GROUP_ITEM |\
>   tee block_group_dump
>
> Then the output contains:
> item 1 key (13631488 BLOCK_GROUP_ITEM 8388608) itemoff 16206 itemsize 24
>     block group used 262144 chunk_objectid 256 flags DATA
>
> The "13631488" is the bytenr of the block group.
> The "8388608" is the length of the block group.
> The "262144" is the used bytes of the block group.
>
> The less used space, the higher priority it should have for relocation (and the faster it is to relocate).
> You could write a small script to do it, or there should be some tool to do the calculation for you.

I usually use something simpler:

Label: 'btrfs_boot' uuid: e4c1daa8-9c39-4a59-b0a9-86297d397f3b
Total devices 1 FS bytes used 30.19GiB
devid 1 size 79.93GiB used 78.01GiB path /dev/mapper/cryptroot

This is bad: I have 30GB of data, but 78 out of 80GB allocated to chunks. This is bad news and suggests a balance is needed, correct? If so, I always struggle as to what values I should give to dusage and musage...

> And only relocate one block group each time, to avoid possible problems.
>
> Last but not least, it's highly recommended to do the relocation only after unused snapshots are completely deleted.
> (Or it would be super super slow to relocate.)

Thank you for the advice. Hopefully this helps someone else too, and maybe someone can write some relocation helper tool if I don't have the time to do it myself.

> > 3) Should I start a scrub now (takes about 1 day) or anything else to check that the filesystem is hopefully not damaged anymore?
>
> I would normally recommend btrfs check, but neither mode really works here.
> And scrub only checks csums; it doesn't check the internal cross references (like the content of the extent tree).
>
> Maybe Su could skip the whole extent tree check and let lowmem check the fs trees only; with --check-data-csum it should do a better job than scrub.
I will wait to hear back from Su, but I think the current situation is that I still have some problems on my FS; they are just 1) not important enough to block mounting rw (now it works again), and 2) currently ignored by the modified btrfsck I have, but they would cause problems if I used the real btrfsck. Correct?

> > 4) should btrfs check reset the corrupt counter?
> > bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> > for now, should I reset it manually?
>
> It could be pretty easy to implement if not already implemented.

Seems like it's not, given that Su's btrfsck --repair ran to completion and I still have corrupt set to '2' :)

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
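Qu's "small script to do the calculation" could look like the following sketch: it pairs each BLOCK_GROUP_ITEM line from the grep output with its "block group used" line and sorts by usage ratio, least-used first, so you know which vrange= to balance first. The regexes assume the dump-tree output format shown in Qu's example; this is my own illustration, not an existing btrfs-progs tool:

```python
# Sketch: rank block groups from `btrfs ins dump-tree -t extent |
# grep -A1 BLOCK_GROUP_ITEM` output by usage ratio, least used first.
import re

BG_RE = re.compile(r"\((\d+) BLOCK_GROUP_ITEM (\d+)\)")
USED_RE = re.compile(r"block group used (\d+)")

def rank_block_groups(dump_text):
    """Return (ratio, bytenr, length, used) tuples, least-used first."""
    groups, pending = [], None
    for line in dump_text.splitlines():
        m = BG_RE.search(line)
        if m:                                   # header line: bytenr + length
            pending = (int(m.group(1)), int(m.group(2)))
            continue
        m = USED_RE.search(line)
        if m and pending:                       # detail line: used bytes
            bytenr, length = pending
            used = int(m.group(1))
            groups.append((used / length, bytenr, length, used))
            pending = None
    groups.sort()                               # lowest used/length first
    return groups

sample = """\
item 1 key (13631488 BLOCK_GROUP_ITEM 8388608) itemoff 16206 itemsize 24
\tblock group used 262144 chunk_objectid 256 flags DATA
"""
for ratio, bytenr, length, used in rank_block_groups(sample):
    # least-used block groups are the cheapest balance candidates
    print(f"{ratio:.1%} used: vrange={bytenr}..{bytenr + length}")
```

The printed range could then be handed to `btrfs balance start -dvrange=START..END <mnt>`, one block group at a time as Qu advises.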
Re: So, does btrfs check lowmem take days? weeks?
On Tue, Jul 10, 2018 at 09:34:36AM +0800, Qu Wenruo wrote: > Ok, this is where I am now: > WARNING: debug: end of checking extent item[18457780273152 169 1] > type: 176 offset: 2 > checking extent items [18457780273152/18457780273152] > ERROR: errors found in extent allocation tree or chunk allocation > checking fs roots > ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: > EXTENT_DATA[25937109 4033] > > The expected end is not even aligned to sectorsize. > > I think there is something wrong. > Dump tree on this INODE would definitely help in this case. > > Marc, would you please try dump using the following command? > > # btrfs ins dump-tree -t 17592 | grep -C 40 25937109 Sure, there you go: gargamel:~# btrfs ins dump-tree -t 17592 /dev/mapper/dshelf2 | grep -C 40 25937109 extent data disk byte 3259370151936 nr 114688 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 144 key (2009526 EXTENT_DATA 1179648) itemoff 7931 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370266624 nr 118784 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 145 key (2009526 EXTENT_DATA 1310720) itemoff 7878 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370385408 nr 118784 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 146 key (2009526 EXTENT_DATA 1441792) itemoff 7825 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370504192 nr 118784 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 147 key (2009526 EXTENT_DATA 1572864) itemoff 7772 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370622976 nr 114688 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 148 key (2009526 EXTENT_DATA 1703936) itemoff 7719 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370737664 nr 118784 extent data offset 0 nr 131072 ram 131072 extent compression 
1 (zlib) item 149 key (2009526 EXTENT_DATA 1835008) itemoff 7666 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370856448 nr 118784 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 150 key (2009526 EXTENT_DATA 1966080) itemoff 7613 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259370975232 nr 118784 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 151 key (2009526 EXTENT_DATA 2097152) itemoff 7560 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371094016 nr 114688 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 152 key (2009526 EXTENT_DATA 2228224) itemoff 7507 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371208704 nr 114688 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 153 key (2009526 EXTENT_DATA 2359296) itemoff 7454 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371323392 nr 110592 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 154 key (2009526 EXTENT_DATA 2490368) itemoff 7401 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371433984 nr 114688 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 155 key (2009526 EXTENT_DATA 2621440) itemoff 7348 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371548672 nr 110592 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 156 key (2009526 EXTENT_DATA 2752512) itemoff 7295 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371659264 nr 114688 extent data offset 0 nr 131072 ram 131072 extent compression 1 (zlib) item 157 key (2009526 EXTENT_DATA 2883584) itemoff 7242 itemsize 53 generation 18462 type 1 (regular) extent data disk byte 3259371773952 nr 106496 extent
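As an aside, the property the checker verifies in runs like the one above can be stated compactly: each EXTENT_DATA key offset should equal the previous offset plus the previous extent's uncompressed length (the `nr ... ram ...` value, 131072 here). A sketch with items 144-147 from the dump; the helper is mine, not checker code:

```python
# Sketch: verify that consecutive EXTENT_DATA items tile the file with no
# gaps -- each key offset should equal the previous offset plus the
# previous extent's uncompressed length.
def find_gaps(extents):
    """extents: list of (file_offset, num_bytes) tuples in key order.
    Returns offsets where an extent does not start at the previous end."""
    gaps = []
    for (off, nr), (next_off, _) in zip(extents, extents[1:]):
        if next_off != off + nr:
            gaps.append(next_off)
    return gaps

# items 144-147 from the dump above: 131072-byte extents back to back
items = [(1179648, 131072), (1310720, 131072), (1441792, 131072), (1572864, 131072)]
print(find_gaps(items))   # prints: []  (this run is contiguous)
```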
Re: So, does btrfs check lowmem take days? weeks?
On 2018年07月10日 09:37, Su Yue wrote: > [CC to linux-btrfs] > > Here is the log of wrong extent data. > > On 07/08/2018 01:21 AM, Marc MERLIN wrote: >> On Fri, Jul 06, 2018 at 10:56:36AM -0700, Marc MERLIN wrote: >>> On Fri, Jul 06, 2018 at 09:05:23AM -0700, Marc MERLIN wrote: Ok, this is where I am now: WARNING: debug: end of checking extent item[18457780273152 169 1] type: 176 offset: 2 checking extent items [18457780273152/18457780273152] ERROR: errors found in extent allocation tree or chunk allocation checking fs roots ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: EXTENT_DATA[25937109 4033] The expected end is not even aligned to sectorsize. I think there is something wrong. Dump tree on this INODE would definitely help in this case. Marc, would you please try dump using the following command? # btrfs ins dump-tree -t 17592 | grep -C 40 25937109 Thanks, Qu ERROR: root 17592 EXTENT_DATA[25937109 8192] gap exists, expected: EXTENT_DATA[25937109 8129] ERROR: root 17592 EXTENT_DATA[25937109 20480] gap exists, expected: EXTENT_DATA[25937109 20417] ERROR: root 17592 EXTENT_DATA[25937493 4096] gap exists, expected: EXTENT_DATA[25937493 3349] ERROR: root 17592 EXTENT_DATA[25937493 8192] gap exists, expected: EXTENT_DATA[25937493 7445] ERROR: root 17592 EXTENT_DATA[25937493 12288] gap exists, expected: EXTENT_DATA[25937493 11541] ERROR: root 17592 EXTENT_DATA[25941335 4096] gap exists, expected: EXTENT_DATA[25941335 4091] ERROR: root 17592 EXTENT_DATA[25941335 8192] gap exists, expected: EXTENT_DATA[25941335 8187] ERROR: root 17592 EXTENT_DATA[25942002 4096] gap exists, expected: EXTENT_DATA[25942002 4093] ERROR: root 17592 EXTENT_DATA[25942790 4096] gap exists, expected: EXTENT_DATA[25942790 4094] ERROR: root 17592 EXTENT_DATA[25945819 4096] gap exists, expected: EXTENT_DATA[25945819 4093] ERROR: root 17592 EXTENT_DATA[26064834 4096] gap exists, expected: EXTENT_DATA[26064834 129] ERROR: root 17592 EXTENT_DATA[26064834 135168] gap exists, expected: 
EXTENT_DATA[26064834 131201] ERROR: root 17592 EXTENT_DATA[26064834 266240] gap exists, expected: EXTENT_DATA[26064834 262273] ERROR: root 17592 EXTENT_DATA[26064834 397312] gap exists, expected: EXTENT_DATA[26064834 393345] ERROR: root 17592 EXTENT_DATA[26064834 528384] gap exists, expected: EXTENT_DATA[26064834 524417] ERROR: root 17592 EXTENT_DATA[26064834 659456] gap exists, expected: EXTENT_DATA[26064834 655489] ERROR: root 17592 EXTENT_DATA[26064834 790528] gap exists, expected: EXTENT_DATA[26064834 786561] ERROR: root 17592 EXTENT_DATA[26064834 921600] gap exists, expected: EXTENT_DATA[26064834 917633] ERROR: root 17592 EXTENT_DATA[26064834 929792] gap exists, expected: EXTENT_DATA[26064834 925825] ERROR: root 17592 EXTENT_DATA[26064834 1224704] gap exists, expected: EXTENT_DATA[26064834 1220737] I'm not sure how long it's been stuck on that line. I'll watch it today. >>> >>> Ok, it's been stuck there for 2H. >> >> Well, it's now the next day and it's finished running: >> >> checking extent items [18457780273152/18457780273152] >> ERROR: errors found in extent allocation tree or chunk allocation >> checking fs roots >> ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: >> EXTENT_DATA[25937109 4033] >> ERROR: root 17592 EXTENT_DATA[25937109 8192] gap exists, expected: >> EXTENT_DATA[25937109 8129] >> ERROR: root 17592 EXTENT_DATA[25937109 20480] gap exists, expected: >> EXTENT_DATA[25937109 20417] >> ERROR: root 17592 EXTENT_DATA[25937493 4096] gap exists, expected: >> EXTENT_DATA[25937493 3349] >> ERROR: root 17592 EXTENT_DATA[25937493 8192] gap exists, expected: >> EXTENT_DATA[25937493 7445] >> ERROR: root 17592 EXTENT_DATA[25937493 12288] gap exists, expected: >> EXTENT_DATA[25937493 11541] >> ERROR: root 17592 EXTENT_DATA[25941335 4096] gap exists, expected: >> EXTENT_DATA[25941335 4091] >> ERROR: root 17592 EXTENT_DATA[25941335 8192] gap exists, expected: >> EXTENT_DATA[25941335 8187] >> ERROR: root 17592 EXTENT_DATA[25942002 4096] gap 
exists, expected: >> EXTENT_DATA[25942002 4093] >> ERROR: root 17592 EXTENT_DATA[25942790 4096] gap exists, expected: >> EXTENT_DATA[25942790 4094] >> ERROR: root 17592 EXTENT_DATA[25945819 4096] gap exists, expected: >> EXTENT_DATA[25945819 4093] >> ERROR: root 17592 EXTENT_DATA[26064834 4096] gap exists, expected: >> EXTENT_DATA[26064834 129] >> ERROR: root 17592 EXTENT_DATA[26064834 135168] gap exists, expected: >> EXTENT_DATA[26064834 131201] >> ERROR: root 17592 EXTENT_DATA[26064834 266240] gap exists, expected: >> EXTENT_DATA[26064834 262273] >> ERROR: root 17592 EXTENT_DATA[26064834 397312] gap exists, expected: >> EXTENT_DATA[26064834 393345] >> ERROR: root 17592 EXTENT_DATA[26064834 528384] gap exists, expected: >>
Re: So, does btrfs check lowmem take days? weeks?
[CC to linux-btrfs] Here is the log of wrong extent data. On 07/08/2018 01:21 AM, Marc MERLIN wrote: On Fri, Jul 06, 2018 at 10:56:36AM -0700, Marc MERLIN wrote: On Fri, Jul 06, 2018 at 09:05:23AM -0700, Marc MERLIN wrote: Ok, this is where I am now: WARNING: debug: end of checking extent item[18457780273152 169 1] type: 176 offset: 2 checking extent items [18457780273152/18457780273152] ERROR: errors found in extent allocation tree or chunk allocation checking fs roots ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: EXTENT_DATA[25937109 4033] ERROR: root 17592 EXTENT_DATA[25937109 8192] gap exists, expected: EXTENT_DATA[25937109 8129] ERROR: root 17592 EXTENT_DATA[25937109 20480] gap exists, expected: EXTENT_DATA[25937109 20417] ERROR: root 17592 EXTENT_DATA[25937493 4096] gap exists, expected: EXTENT_DATA[25937493 3349] ERROR: root 17592 EXTENT_DATA[25937493 8192] gap exists, expected: EXTENT_DATA[25937493 7445] ERROR: root 17592 EXTENT_DATA[25937493 12288] gap exists, expected: EXTENT_DATA[25937493 11541] ERROR: root 17592 EXTENT_DATA[25941335 4096] gap exists, expected: EXTENT_DATA[25941335 4091] ERROR: root 17592 EXTENT_DATA[25941335 8192] gap exists, expected: EXTENT_DATA[25941335 8187] ERROR: root 17592 EXTENT_DATA[25942002 4096] gap exists, expected: EXTENT_DATA[25942002 4093] ERROR: root 17592 EXTENT_DATA[25942790 4096] gap exists, expected: EXTENT_DATA[25942790 4094] ERROR: root 17592 EXTENT_DATA[25945819 4096] gap exists, expected: EXTENT_DATA[25945819 4093] ERROR: root 17592 EXTENT_DATA[26064834 4096] gap exists, expected: EXTENT_DATA[26064834 129] ERROR: root 17592 EXTENT_DATA[26064834 135168] gap exists, expected: EXTENT_DATA[26064834 131201] ERROR: root 17592 EXTENT_DATA[26064834 266240] gap exists, expected: EXTENT_DATA[26064834 262273] ERROR: root 17592 EXTENT_DATA[26064834 397312] gap exists, expected: EXTENT_DATA[26064834 393345] ERROR: root 17592 EXTENT_DATA[26064834 528384] gap exists, expected: EXTENT_DATA[26064834 524417] 
ERROR: root 17592 EXTENT_DATA[26064834 659456] gap exists, expected: EXTENT_DATA[26064834 655489] ERROR: root 17592 EXTENT_DATA[26064834 790528] gap exists, expected: EXTENT_DATA[26064834 786561] ERROR: root 17592 EXTENT_DATA[26064834 921600] gap exists, expected: EXTENT_DATA[26064834 917633] ERROR: root 17592 EXTENT_DATA[26064834 929792] gap exists, expected: EXTENT_DATA[26064834 925825] ERROR: root 17592 EXTENT_DATA[26064834 1224704] gap exists, expected: EXTENT_DATA[26064834 1220737] I'm not sure how long it's been stuck on that line. I'll watch it today. Ok, it's been stuck there for 2H. Well, it's now the next day and it's finished running: checking extent items [18457780273152/18457780273152] ERROR: errors found in extent allocation tree or chunk allocation checking fs roots ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: EXTENT_DATA[25937109 4033] ERROR: root 17592 EXTENT_DATA[25937109 8192] gap exists, expected: EXTENT_DATA[25937109 8129] ERROR: root 17592 EXTENT_DATA[25937109 20480] gap exists, expected: EXTENT_DATA[25937109 20417] ERROR: root 17592 EXTENT_DATA[25937493 4096] gap exists, expected: EXTENT_DATA[25937493 3349] ERROR: root 17592 EXTENT_DATA[25937493 8192] gap exists, expected: EXTENT_DATA[25937493 7445] ERROR: root 17592 EXTENT_DATA[25937493 12288] gap exists, expected: EXTENT_DATA[25937493 11541] ERROR: root 17592 EXTENT_DATA[25941335 4096] gap exists, expected: EXTENT_DATA[25941335 4091] ERROR: root 17592 EXTENT_DATA[25941335 8192] gap exists, expected: EXTENT_DATA[25941335 8187] ERROR: root 17592 EXTENT_DATA[25942002 4096] gap exists, expected: EXTENT_DATA[25942002 4093] ERROR: root 17592 EXTENT_DATA[25942790 4096] gap exists, expected: EXTENT_DATA[25942790 4094] ERROR: root 17592 EXTENT_DATA[25945819 4096] gap exists, expected: EXTENT_DATA[25945819 4093] ERROR: root 17592 EXTENT_DATA[26064834 4096] gap exists, expected: EXTENT_DATA[26064834 129] ERROR: root 17592 EXTENT_DATA[26064834 135168] gap exists, expected: 
EXTENT_DATA[26064834 131201] ERROR: root 17592 EXTENT_DATA[26064834 266240] gap exists, expected: EXTENT_DATA[26064834 262273] ERROR: root 17592 EXTENT_DATA[26064834 397312] gap exists, expected: EXTENT_DATA[26064834 393345] ERROR: root 17592 EXTENT_DATA[26064834 528384] gap exists, expected: EXTENT_DATA[26064834 524417] ERROR: root 17592 EXTENT_DATA[26064834 659456] gap exists, expected: EXTENT_DATA[26064834 655489] ERROR: root 17592 EXTENT_DATA[26064834 790528] gap exists, expected: EXTENT_DATA[26064834 786561] ERROR: root 17592 EXTENT_DATA[26064834 921600] gap exists, expected: EXTENT_DATA[26064834 917633] ERROR: root 17592 EXTENT_DATA[26064834 929792] gap exists, expected: EXTENT_DATA[26064834 925825] ERROR: root 17592 EXTENT_DATA[26064834 1224704] gap exists, expected: EXTENT_DATA[26064834 1220737] ERROR: root 21322 EXTENT_DATA[25320803 4096] gap exists, expected: EXTENT_DATA[25320803 56] ERROR: root 21322 EXTENT_DATA[25320803
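Long error dumps like the one above are easier to reason about when collapsed per inode. A small sketch (my own, assuming the exact "gap exists" message format shown) that counts how many gap errors each (root, inode) pair produced:

```python
# Sketch: summarize lowmem "gap exists" errors per (root, inode).
import re
from collections import Counter

ERR_RE = re.compile(r"ERROR: root (\d+) EXTENT_DATA\[(\d+) (\d+)\] gap exists")

def summarize(log_text):
    """Count gap errors per (root, inode) to see how many files are affected."""
    counts = Counter()
    for m in ERR_RE.finditer(log_text):
        root, inode = int(m.group(1)), int(m.group(2))
        counts[(root, inode)] += 1
    return counts

sample = """\
ERROR: root 17592 EXTENT_DATA[25937109 4096] gap exists, expected: EXTENT_DATA[25937109 4033]
ERROR: root 17592 EXTENT_DATA[25937109 8192] gap exists, expected: EXTENT_DATA[25937109 8129]
ERROR: root 21322 EXTENT_DATA[25320803 4096] gap exists, expected: EXTENT_DATA[25320803 56]
"""
for (root, inode), n in summarize(sample).items():
    print(f"root {root} inode {inode}: {n} gap error(s)")
```

With `btrfs ins dump-tree -t <root> | grep <inode>` (as done earlier in the thread), each summarized inode can then be mapped back to a file name via its INODE_REF item.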
Re: Fwd: Re: So, does btrfs check lowmem take days? weeks?
Forgot to CC Marc.

On 07/10/2018 09:33 AM, Su Yue wrote:

[FWD to linux-btrfs]

Thanks to Marc's patience in running and testing btrfsck lowmem mode in recent days. The FS has a large extent tree, but luckily few items are corrupted, and they were all fixed by the special version. Reloc trees were cleaned too, so the FS can be mounted RW.

However, the remaining extent data errors in the file trees are unresolved; they are all about holes. Since I'm not familiar with the kernel code, I'm not sure how serious those errors are or what could happen when writing/reading those wrong items. Marc also has some questions in the forwarded part; replies are always welcome. Error messages are shown at the end.

Forwarded Message
Subject: Re: So, does btrfs check lowmem take days? weeks?
Date: Mon, 9 Jul 2018 10:48:18 -0700
From: Marc MERLIN
To: Su Yue
CC: quwenruo.bt...@gmx.com, Su Yue

Success! Well done Su, this is a huge improvement to the lowmem code. It went from days to less than 3 hours. I'll paste the logs below.

Questions:
1) I assume I first need to delete a lot of snapshots. What is the limit in your opinion? 100? 150? other?
2) my filesystem is somewhat misbalanced. Which balance options do you think are safe to use?
3) Should I start a scrub now (takes about 1 day) or anything else to check that the filesystem is hopefully not damaged anymore?
4) should btrfs check reset the corrupt counter?
bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
for now, should I reset it manually?
Thanks, Marc gargamel:/var/local/src/btrfs-progs.sy# ./btrfsck --mode=lowmem -q --repair /dev/mapper/dshelf2 enabling repair mode WARNING: low-memory mode repair support is only partial Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d Created new chunk [18460145811456 1073741824] Add one extent data backref [84302495744 69632] Add one extent data backref [84302495744 69632] Add one extent data backref [125712527360 12214272] Add one extent data backref [125730848768 5111808] Add one extent data backref [125730848768 5111808] Add one extent data backref [125736914944 6037504] Add one extent data backref [125736914944 6037504] Add one extent data backref [129952120832 20242432] Add one extent data backref [129952120832 20242432] Add one extent data backref [134925357056 11829248] Add one extent data backref [134925357056 11829248] Add one extent data backref [147895111680 12345344] Add one extent data backref [147895111680 12345344] Add one extent data backref [150850146304 17522688] Add one extent data backref [156909494272 55320576] Add one extent data backref [156909494272 55320576] good luck! found 0 bytes used, no error found total csum bytes: 0 total tree bytes: 0 total fs tree bytes: 0 total extent tree bytes: 0 btree space waste bytes: 0 file data blocks allocated: 0 referenced 0 gargamel:/var/local/src/btrfs-progs.sy# ./btrfsck --mode=lowmem -q /dev/mapper/dshelf2 Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d good luck! 
found 251650048 bytes used, no error found
total csum bytes: 0
total tree bytes: 0
total fs tree bytes: 0
total extent tree bytes: 0
btree space waste bytes: 0
file data blocks allocated: 0 referenced 0
gargamel:/var/local/src/btrfs-progs.sy# ./btrfsck -c /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
found 0 bytes used, no error found
total csum bytes: 0
total tree bytes: 0
total fs tree bytes: 0
total extent tree bytes: 0
btree space waste bytes: 0
file data blocks allocated: 0 referenced 0
gargamel:/var/local/src/btrfs-progs.sy# mount /dev/mapper/dshelf2 /mnt/mnt
[671283.314558] BTRFS info (device dm-2): disk space caching is enabled
[671283.334226] BTRFS info (device dm-2): has skinny extents
[671285.191740] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[671395.371313] BTRFS info (device dm-2): enabling ssd optimizations
[671400.884013] BTRFS info (device dm-2): checking UUID tree
(hung about 2-3 min but worked eventually)
gargamel:/mnt/mnt# btrfs fi show .
Label: 'dshelf2' uuid: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
Total devices 1 FS bytes used 12.59TiB
devid 1 size 14.55TiB used 13.81TiB path /dev/mapper/dshelf2
gargamel:/mnt/mnt# btrfs fi df .
Data, single: total=13.57TiB, used=12.48TiB
System, DUP: total=32.00MiB, used=1.55MiB
Metadata, DUP: total=124.50GiB, used=116.92GiB
Metadata, single: total=216.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=42.62MiB
gargamel:/mnt/mnt# btrfs subvolume list . | wc -l
270
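Putting the `btrfs fi show` numbers above side by side shows why balance keeps coming up in this thread: almost the whole device is allocated to chunks even though much less is actually used. A quick back-of-the-envelope sketch (numbers copied from the output above):

```python
# Sketch: the "btrfs fi show" figures above in one place -- how much of
# the device is chunk-allocated vs actually used. Low unallocated space
# is what makes further balance/repair painful.
TiB = 1024 ** 4

size      = 14.55 * TiB   # devid 1 size
allocated = 13.81 * TiB   # devid 1 used (allocated to chunks)
used      = 12.59 * TiB   # FS bytes used

print(f"unallocated: {(size - allocated) / TiB:.2f} TiB")          # ~0.74 TiB
print(f"slack inside chunks: {(allocated - used) / TiB:.2f} TiB")  # ~1.22 TiB
```

Roughly 0.74 TiB of the device is still unallocated, while about 1.22 TiB of allocated chunk space is empty; balance with usage filters reclaims the latter into the former.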
Fwd: Re: So, does btrfs check lowmem take days? weeks?
[FWD to linux-btrfs]

Thanks to Marc's patience in running and testing btrfsck lowmem mode in recent days. The FS has a large extent tree, but luckily few items are corrupted, and they were all fixed by the special version. Reloc trees were cleaned too, so the FS can be mounted RW.

However, the remaining extent data errors in the file trees are unresolved; they are all about holes. Since I'm not familiar with the kernel code, I'm not sure how serious those errors are or what could happen when writing/reading those wrong items. Marc also has some questions in the forwarded part; replies are always welcome. Error messages are shown at the end.

Forwarded Message
Subject: Re: So, does btrfs check lowmem take days? weeks?
Date: Mon, 9 Jul 2018 10:48:18 -0700
From: Marc MERLIN
To: Su Yue
CC: quwenruo.bt...@gmx.com, Su Yue

Success! Well done Su, this is a huge improvement to the lowmem code. It went from days to less than 3 hours. I'll paste the logs below.

Questions:
1) I assume I first need to delete a lot of snapshots. What is the limit in your opinion? 100? 150? other?
2) my filesystem is somewhat misbalanced. Which balance options do you think are safe to use?
3) Should I start a scrub now (takes about 1 day) or anything else to check that the filesystem is hopefully not damaged anymore?
4) should btrfs check reset the corrupt counter?
bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
for now, should I reset it manually?
Thanks, Marc

gargamel:/var/local/src/btrfs-progs.sy# ./btrfsck --mode=lowmem -q --repair /dev/mapper/dshelf2
enabling repair mode
WARNING: low-memory mode repair support is only partial
Checking filesystem on /dev/mapper/dshelf2
UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
Created new chunk [18460145811456 1073741824]
Add one extent data backref [84302495744 69632]
Add one extent data backref [84302495744 69632]
Add one extent data backref [125712527360 12214272]
Add one extent data backref [125730848768 5111808]
Add one extent data backref [125730848768 5111808]
Add one extent data backref [125736914944 6037504]
Add one extent data backref [125736914944 6037504]
Add one extent data backref [129952120832 20242432]
Add one extent data backref [129952120832 20242432]
Add one extent data backref [134925357056 11829248]
Add one extent data backref [134925357056 11829248]
Add one extent data backref [147895111680 12345344]
Add one extent data backref [147895111680 12345344]
Add one extent data backref [150850146304 17522688]
Add one extent data backref [156909494272 55320576]
Add one extent data backref [156909494272 55320576]
good luck!
found 0 bytes used, no error found
total csum bytes: 0
total tree bytes: 0
total fs tree bytes: 0
total extent tree bytes: 0
btree space waste bytes: 0
file data blocks allocated: 0
referenced 0

gargamel:/var/local/src/btrfs-progs.sy# ./btrfsck --mode=lowmem -q /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
good luck!
found 251650048 bytes used, no error found
total csum bytes: 0
total tree bytes: 0
total fs tree bytes: 0
total extent tree bytes: 0
btree space waste bytes: 0
file data blocks allocated: 0
referenced 0

gargamel:/var/local/src/btrfs-progs.sy# ./btrfsck -c /dev/mapper/dshelf2
Checking filesystem on /dev/mapper/dshelf2
UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
found 0 bytes used, no error found
total csum bytes: 0
total tree bytes: 0
total fs tree bytes: 0
total extent tree bytes: 0
btree space waste bytes: 0
file data blocks allocated: 0
referenced 0

gargamel:/var/local/src/btrfs-progs.sy# mount /dev/mapper/dshelf2 /mnt/mnt
[671283.314558] BTRFS info (device dm-2): disk space caching is enabled
[671283.334226] BTRFS info (device dm-2): has skinny extents
[671285.191740] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[671395.371313] BTRFS info (device dm-2): enabling ssd optimizations
[671400.884013] BTRFS info (device dm-2): checking UUID tree
(hung about 2-3mn but worked eventually)

gargamel:/mnt/mnt# btrfs fi show .
Label: 'dshelf2'  uuid: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d
Total devices 1 FS bytes used 12.59TiB
devid 1 size 14.55TiB used 13.81TiB path /dev/mapper/dshelf2

gargamel:/mnt/mnt# btrfs fi df .
Data, single: total=13.57TiB, used=12.48TiB
System, DUP: total=32.00MiB, used=1.55MiB
Metadata, DUP: total=124.50GiB, used=116.92GiB
Metadata, single: total=216.00MiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=42.62MiB

gargamel:/mnt/mnt# btrfs subvolume list . | wc -l
270
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08

Error messages below:
Re: So, does btrfs check lowmem take days? weeks?
On 07/04/2018 05:40 AM, Marc MERLIN wrote:
On Tue, Jul 03, 2018 at 03:34:45PM -0600, Chris Murphy wrote:
On Tue, Jul 3, 2018 at 2:34 AM, Su Yue wrote:

Yes, extent tree is the hardest part for lowmem mode. I'm quite confident the tool can deal well with file trees (which record metadata about file and directory names and relationships). As for the extent tree, I have little confidence due to its complexity.

I have to ask again if there's some metadata integrity mask option Marc should use to try to catch the corruption cause in the first place? His use case really can't afford either mode of btrfs check. And also check is only backward looking, it doesn't show what was happening at the time. And for big file systems, check rapidly doesn't scale at all anyway.

And now he's modifying his layout to avoid the problem from happening again which makes it less likely to catch the cause, and get it fixed. I think if he's willing to build a kernel with the integrity checker enabled, it should be considered but only if it's likely to reveal why the problem is happening, even if it can't repair the problem once it's happened. He's already in that situation so masked integrity checking is no worse, at least it gives a chance to improve Btrfs rather than it being a mystery how it got corrupt.

Yeah, I'm fine waiting a few more days with this down and gathering data if that helps.

Thanks! I will write a special version which skips checking wrong extent items and prints debug logs. It should also run faster and help us locate where the check gets stuck.

Su

But due to the size, a full btrfs image may be a bit larger than we want, not counting some confidential data in some filenames.

Marc
Re: So, does btrfs check lowmem take days? weeks?
On 2018-07-04 06:00, Marc MERLIN wrote:
> On Tue, Jul 03, 2018 at 03:46:59PM -0600, Chris Murphy wrote:
>> On Tue, Jul 3, 2018 at 2:50 AM, Qu Wenruo wrote:
>>>
>>> There must be something wrong, however due to the size of the fs, and
>>> the complexity of extent tree, I can't tell.
>>
>> Right, which is why I'm asking if any of the metadata integrity
>> checker mask options might reveal what's going wrong?
>>
>> I guess the big issues are:
>> a. compile kernel with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y is necessary
>> b. it can come with a high resource burden depending on the mask and
>> where the log is being written (write system logs to a different file
>> system for sure)
>> c. the granularity offered in the integrity checker might not be enough.
>> d. might take a while before corruptions are injected before
>> corruption is noticed and flagged.
>
> Back to where I'm at right now. I'm going to delete this filesystem and
> start over very soon. Tomorrow or the day after.
> I'm happy to get more data off it if someone wants it for posterity, but
> I indeed need to recover soon since being with a dead backup server is
> not a good place to be in :)

Feel free to recover asap, as the extent tree is really too large for a
human to analyse manually.

Thanks,
Qu

>
> Thanks,
> Marc
Re: So, does btrfs check lowmem take days? weeks?
On Tue, Jul 03, 2018 at 03:46:59PM -0600, Chris Murphy wrote:
> On Tue, Jul 3, 2018 at 2:50 AM, Qu Wenruo wrote:
> >
> > There must be something wrong, however due to the size of the fs, and
> > the complexity of extent tree, I can't tell.
>
> Right, which is why I'm asking if any of the metadata integrity
> checker mask options might reveal what's going wrong?
>
> I guess the big issues are:
> a. compile kernel with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y is necessary
> b. it can come with a high resource burden depending on the mask and
> where the log is being written (write system logs to a different file
> system for sure)
> c. the granularity offered in the integrity checker might not be enough.
> d. might take a while before corruptions are injected before
> corruption is noticed and flagged.

Back to where I'm at right now. I'm going to delete this filesystem and
start over very soon. Tomorrow or the day after.
I'm happy to get more data off it if someone wants it for posterity, but
I indeed need to recover soon since being with a dead backup server is
not a good place to be in :)

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Tue, Jul 3, 2018 at 2:50 AM, Qu Wenruo wrote:
>
> There must be something wrong, however due to the size of the fs, and
> the complexity of extent tree, I can't tell.

Right, which is why I'm asking if any of the metadata integrity
checker mask options might reveal what's going wrong?

I guess the big issues are:
a. compile kernel with CONFIG_BTRFS_FS_CHECK_INTEGRITY=y is necessary
b. it can come with a high resource burden depending on the mask and
where the log is being written (write system logs to a different file
system for sure)
c. the granularity offered in the integrity checker might not be enough.
d. might take a while before corruptions are injected before
corruption is noticed and flagged.

So it might be pointless, no idea.

--
Chris Murphy
Re: So, does btrfs check lowmem take days? weeks?
On Tue, Jul 03, 2018 at 03:34:45PM -0600, Chris Murphy wrote:
> On Tue, Jul 3, 2018 at 2:34 AM, Su Yue wrote:
> > Yes, extent tree is the hardest part for lowmem mode. I'm quite
> > confident the tool can deal well with file trees (which record metadata
> > about file and directory names and relationships).
> > As for the extent tree, I have little confidence due to its complexity.
>
> I have to ask again if there's some metadata integrity mask option Marc
> should use to try to catch the corruption cause in the first place?
>
> His use case really can't afford either mode of btrfs check. And also
> check is only backward looking, it doesn't show what was happening at
> the time. And for big file systems, check rapidly doesn't scale at all
> anyway.
>
> And now he's modifying his layout to avoid the problem from happening
> again which makes it less likely to catch the cause, and get it fixed.
> I think if he's willing to build a kernel with integrity checker
> enabled, it should be considered but only if it's likely to reveal why
> the problem is happening, even if it can't repair the problem once
> it's happened. He's already in that situation so masked integrity
> checking is no worse, at least it gives a chance to improve Btrfs
> rather than it being a mystery how it got corrupt.

Yeah, I'm fine waiting a few more days with this down and gathering data
if that helps.

But due to the size, a full btrfs image may be a bit larger than we
want, not counting some confidential data in some filenames.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Tue, Jul 3, 2018 at 2:34 AM, Su Yue wrote:
> Yes, extent tree is the hardest part for lowmem mode. I'm quite
> confident the tool can deal well with file trees (which record metadata
> about file and directory names and relationships).
> As for the extent tree, I have little confidence due to its complexity.

I have to ask again if there's some metadata integrity mask option Marc
should use to try to catch the corruption cause in the first place?

His use case really can't afford either mode of btrfs check. And also
check is only backward looking, it doesn't show what was happening at
the time. And for big file systems, check rapidly doesn't scale at all
anyway.

And now he's modifying his layout to avoid the problem from happening
again which makes it less likely to catch the cause, and get it fixed.
I think if he's willing to build a kernel with integrity checker
enabled, it should be considered but only if it's likely to reveal why
the problem is happening, even if it can't repair the problem once
it's happened. He's already in that situation so masked integrity
checking is no worse, at least it gives a chance to improve Btrfs
rather than it being a mystery how it got corrupt.

--
Chris Murphy
Re: So, does btrfs check lowmem take days? weeks?
On Tue, Jul 03, 2018 at 04:50:48PM +0800, Qu Wenruo wrote:
> > It sounds like there may not be a fix to this problem with the filesystem's
> > design, outside of "do not get there, or else".
> > It would even be useful for btrfs tools to start computing heuristics and
> > output warnings like "you have more than 100 snapshots on this filesystem,
> > this is not recommended, please read http://url/"
>
> This looks pretty doable, but maybe it's better to add some warning at
> btrfs progs (both "subvolume snapshot" and "receive").

This is what I meant to say, correct.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Re: So, does btrfs check lowmem take days? weeks?
On 2018-07-03 12:22, Marc MERLIN wrote:
> On Mon, Jul 02, 2018 at 06:31:43PM -0600, Chris Murphy wrote:
>> So the idea behind journaled file systems is that journal replay
>> enabled mount time "repair" that's faster than an fsck. Already Btrfs
>> use cases with big, but not huge, file systems makes btrfs check a
>> problem. Either running out of memory or it takes too long. So already
>> it isn't scaling as well as ext4 or XFS in this regard.
>>
>> So what's the future hold? It seems like the goal is that the problems
>> must be avoided in the first place rather than to repair them after
>> the fact.
>>
>> Are the problems Marc is running into understood well enough that
>> there can eventually be a fix, maybe even an on-disk format change,
>> that prevents such problems from happening in the first place?
>>
>> Or does it make sense for him to be running with btrfs debug or some
>> subset of btrfs integrity checking mask to try to catch the problems
>> in the act of them happening?
>
> Those are all good questions.
> To be fair, I cannot claim that btrfs was at fault for whatever filesystem
> damage I ended up with. It's very possible that it happened due to a flaky
> Sata card that kicked drives off the bus when it shouldn't have.

However this still doesn't explain the problem you hit.
In theory (well, it's theory by all means), btrfs is fully atomic for its
transactions, even for its data (with csum and cow). So even if a power
loss/data corruption happens between transactions, we should get the
previous transaction.

There must be something wrong, however due to the size of the fs, and
the complexity of the extent tree, I can't tell.

> Sure in theory a journaling filesystem can recover from unexpected power
> loss and drives dropping off at bad times, but I'm going to guess that
> btrfs' complexity also means that it has data structures (extent tree?) that
> need to be updated completely "or else".

I'm wondering if we have some hidden bug somewhere.
For the extent tree, it's metadata, and is protected by mandatory CoW, so it
shouldn't be corrupted, unless we have a bug in the already complex delayed
reference code, or some unexpected behavior (flush/fua failure) due to so
many layers (dmcrypt + mdraid).

Anyway, if we can't reproduce it in a controlled environment (my VM with a
pretty small and plain fs), it's really hard to locate the bug.

> I'm obviously ok with a filesystem check being necessary to recover in cases
> like this, afterall I still occasionally have to run e2fsck on ext4 too, but
> I'm a lot less thrilled with the btrfs situation where basically the repair
> tools can either completely crash your kernel, or take days and then either
> get stuck in an infinite loop or hit an algorithm that can't scale if you
> have too many hardlinks/snapshots.

Unfortunately, all the price is paid for the super fast snapshot creation.
The tradeoff cannot be easily solved.
(Another way to implement snapshots is like LVM thin provisioning: each time
a snapshot is created we need to iterate all allocated blocks of the thin
LV, which can't scale very well when the fs grows, but makes its mapping
management pretty easy. But I think the LVM guys have done some tricks to
improve the performance.)

> It sounds like there may not be a fix to this problem with the filesystem's
> design, outside of "do not get there, or else".
> It would even be useful for btrfs tools to start computing heuristics and
> output warnings like "you have more than 100 snapshots on this filesystem,
> this is not recommended, please read http://url/"

This looks pretty doable, but maybe it's better to add some warning at
btrfs progs (both "subvolume snapshot" and "receive").

Thanks,
Qu

> Qu, Su, does that sound both reasonable and doable?
>
> Thanks,
> Marc
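The warning heuristic proposed above could be sketched roughly like the fragment below. This is purely a hypothetical illustration, not an actual btrfs-progs feature; the 100-snapshot threshold is just the example number from this thread, and the function name is invented:

```shell
# Hypothetical sketch of the proposed snapshot-count warning.
# Threshold and wording are illustrative only.
warn_if_many_snapshots() {
    count="$1"
    threshold="${2:-100}"
    if [ "$count" -gt "$threshold" ]; then
        echo "warning: you have $count snapshots on this filesystem," \
             "this is not recommended"
    fi
}

# In practice the count would come from something like:
#   count=$(btrfs subvolume list -s /mnt | wc -l)
warn_if_many_snapshots 270
```

With Marc's 270 subvolumes this prints the warning; below the threshold it stays silent.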
Re: So, does btrfs check lowmem take days? weeks?
On 07/03/2018 12:22 PM, Marc MERLIN wrote:
On Mon, Jul 02, 2018 at 06:31:43PM -0600, Chris Murphy wrote:

So the idea behind journaled file systems is that journal replay enabled mount time "repair" that's faster than an fsck. Already Btrfs use cases with big, but not huge, file systems makes btrfs check a problem. Either running out of memory or it takes too long. So already it isn't scaling as well as ext4 or XFS in this regard.

So what's the future hold? It seems like the goal is that the problems must be avoided in the first place rather than to repair them after the fact.

Are the problems Marc is running into understood well enough that there can eventually be a fix, maybe even an on-disk format change, that prevents such problems from happening in the first place?

Or does it make sense for him to be running with btrfs debug or some subset of btrfs integrity checking mask to try to catch the problems in the act of them happening?

Those are all good questions. To be fair, I cannot claim that btrfs was at fault for whatever filesystem damage I ended up with. It's very possible that it happened due to a flaky Sata card that kicked drives off the bus when it shouldn't have. Sure in theory a journaling filesystem can recover from unexpected power loss and drives dropping off at bad times, but I'm going to guess that btrfs' complexity also means that it has data structures (extent tree?) that need to be updated completely "or else".

Yes, extent tree is the hardest part for lowmem mode. I'm quite confident the tool can deal well with file trees (which record metadata about file and directory names and relationships). As for the extent tree, I have little confidence due to its complexity.

I'm obviously ok with a filesystem check being necessary to recover in cases like this, afterall I still occasionally have to run e2fsck on ext4 too, but I'm a lot less thrilled with the btrfs situation where basically the repair tools can either completely crash your kernel, or take days and then either get stuck in an infinite loop or hit an algorithm that can't scale if you have too many hardlinks/snapshots.

It's not surprising that real world filesystems have many snapshots. Original mode repair eats a lot of memory, so lowmem mode was created to save memory at the cost of time. The latter is just not robust enough to handle complex situations.

It sounds like there may not be a fix to this problem with the filesystem's design, outside of "do not get there, or else". It would even be useful for btrfs tools to start computing heuristics and output warnings like "you have more than 100 snapshots on this filesystem, this is not recommended, please read http://url/"

Qu, Su, does that sound both reasonable and doable?

Thanks,
Marc
Re: So, does btrfs check lowmem take days? weeks?
On Mon, Jul 02, 2018 at 06:31:43PM -0600, Chris Murphy wrote:
> So the idea behind journaled file systems is that journal replay
> enabled mount time "repair" that's faster than an fsck. Already Btrfs
> use cases with big, but not huge, file systems makes btrfs check a
> problem. Either running out of memory or it takes too long. So already
> it isn't scaling as well as ext4 or XFS in this regard.
>
> So what's the future hold? It seems like the goal is that the problems
> must be avoided in the first place rather than to repair them after
> the fact.
>
> Are the problems Marc is running into understood well enough that
> there can eventually be a fix, maybe even an on-disk format change,
> that prevents such problems from happening in the first place?
>
> Or does it make sense for him to be running with btrfs debug or some
> subset of btrfs integrity checking mask to try to catch the problems
> in the act of them happening?

Those are all good questions.
To be fair, I cannot claim that btrfs was at fault for whatever filesystem
damage I ended up with. It's very possible that it happened due to a flaky
Sata card that kicked drives off the bus when it shouldn't have.

Sure in theory a journaling filesystem can recover from unexpected power
loss and drives dropping off at bad times, but I'm going to guess that
btrfs' complexity also means that it has data structures (extent tree?) that
need to be updated completely "or else".

I'm obviously ok with a filesystem check being necessary to recover in cases
like this, afterall I still occasionally have to run e2fsck on ext4 too, but
I'm a lot less thrilled with the btrfs situation where basically the repair
tools can either completely crash your kernel, or take days and then either
get stuck in an infinite loop or hit an algorithm that can't scale if you
have too many hardlinks/snapshots.

It sounds like there may not be a fix to this problem with the filesystem's
design, outside of "do not get there, or else".
It would even be useful for btrfs tools to start computing heuristics and
output warnings like "you have more than 100 snapshots on this filesystem,
this is not recommended, please read http://url/"

Qu, Su, does that sound both reasonable and doable?

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/
Re: So, does btrfs check lowmem take days? weeks?
On Mon, Jul 2, 2018 at 8:42 AM, Qu Wenruo wrote:
>
> On 2018-07-02 22:05, Marc MERLIN wrote:
>> On Mon, Jul 02, 2018 at 02:22:20PM +0800, Su Yue wrote:
>> Ok, that's 29MB, so it doesn't fit on pastebin:
>> http://marc.merlins.org/tmp/dshelf2_inspect.txt
>>> Sorry Marc. After offline communication with Qu, both
>>> of us think the filesystem is hard to repair.
>>> The filesystem is too large to debug step by step.
>>> Every round of check and debug is too expensive,
>>> and it has already cost several days.
>>>
>>> Sadly, I am afraid that you have to recreate the filesystem
>>> and back up your data again. :(
>>>
>>> Sorry again, and thanks for your reports and patience.
>>
>> I appreciate your help. Honestly I only wanted to help you find why the
>> tools aren't working. Fixing filesystems by hand (and remotely via Email
>> on top of that), is way too time consuming like you said.
>>
>> Is the btrfs design flawed in a way that repair tools just cannot repair
>> on their own?
>
> For short and for your case, yes, you can consider the repair tools just
> garbage and don't use them on any production system.

So the idea behind journaled file systems is that journal replay
enabled mount time "repair" that's faster than an fsck. Already Btrfs
use cases with big, but not huge, file systems makes btrfs check a
problem. Either running out of memory or it takes too long. So already
it isn't scaling as well as ext4 or XFS in this regard.

So what's the future hold? It seems like the goal is that the problems
must be avoided in the first place rather than to repair them after
the fact.

Are the problems Marc is running into understood well enough that
there can eventually be a fix, maybe even an on-disk format change,
that prevents such problems from happening in the first place?

Or does it make sense for him to be running with btrfs debug or some
subset of btrfs integrity checking mask to try to catch the problems
in the act of them happening?

--
Chris Murphy
Re: So, does btrfs check lowmem take days? weeks?
On Mon, Jul 02, 2018 at 10:33:09PM +0500, Roman Mamedov wrote:
> On Mon, 2 Jul 2018 08:19:03 -0700
> Marc MERLIN wrote:
> > I actually have fewer snapshots than this per filesystem, but I backup
> > more than 10 filesystems.
> > If I used as many snapshots as you recommend, that would already be 230
> > snapshots for 10 filesystems :)
>
> (...once again me with my rsync :)
>
> If you didn't use send/receive, you wouldn't be required to keep a separate
> snapshot trail per filesystem backed up, one trail of snapshots for the entire
> backup server would be enough. Rsync everything to subdirs within one
> subvolume, then do timed or event-based snapshots of it. You only need more
> than one trail if you want different retention policies for different datasets
> (e.g. in my case I have 91 and 31 days).

This is exactly how I used to do backups before btrfs. I did:
cp -al backup.olddate backup.newdate
rsync -avSH src/ backup.newdate/
You don't even need snapshots or btrfs anymore.

Also, sorry to say, but I have different data retention needs for different
backups. Some need to rotate more quickly than others, but if you're using
rsync, the method I gave above works fine at any rotation interval you need.
It is almost as efficient as btrfs on space, but as I said, the time penalty
on all those stats for many files was what killed it for me.

If I go back to rsync backups (and I'm really unlikely to), then I'd also go
back to ext4. There would be no point in dealing with the complexity and
fragility of btrfs anymore.

Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Mon, 2 Jul 2018 08:19:03 -0700 Marc MERLIN wrote:
> I actually have fewer snapshots than this per filesystem, but I backup
> more than 10 filesystems.
> If I used as many snapshots as you recommend, that would already be 230
> snapshots for 10 filesystems :)

(...once again me with my rsync :)

If you didn't use send/receive, you wouldn't be required to keep a separate
snapshot trail per filesystem backed up, one trail of snapshots for the entire
backup server would be enough. Rsync everything to subdirs within one
subvolume, then do timed or event-based snapshots of it. You only need more
than one trail if you want different retention policies for different datasets
(e.g. in my case I have 91 and 31 days).

--
With respect,
Roman
Re: So, does btrfs check lowmem take days? weeks?
On 2018-07-02 11:19, Marc MERLIN wrote:

Hi Qu, thanks for the detailed and honest answer. A few comments inline.

On Mon, Jul 02, 2018 at 10:42:40PM +0800, Qu Wenruo wrote:

For full, it depends. (but for most real world cases, it's still flawed) We have small and crafted images as test cases, which btrfs check can repair without problem at all. But such images are *SMALL*, and only have *ONE* type of corruption, which can't represent real world cases at all.

right, they're just unittest images, I understand.

1) Too large fs (especially too many snapshots)
The use case (too many snapshots and shared extents, a lot of extents get shared over 1000 times) is in fact a super large challenge for lowmem mode check/repair. It needs O(n^2) or even O(n^3) to check each backref, which hugely slows the progress and makes it hard for us to locate the real bug.

So, the non lowmem version would work better, but it's a problem if it doesn't fit in RAM. I've always considered it a grave bug that btrfs check repair can use so much kernel memory that it will crash the entire system. This should not be possible. While it won't help me here, can btrfs check be improved not to suck up all the kernel memory, and ideally even allow using swap space if the RAM is not enough? Is btrfs check regular mode still being maintained? I think it's still better than lowmem, correct?

2) Corruption in extent tree and our objective is to mount RW
Extent tree is almost useless if we just want to read data. But when we do any write, we need it, and if it goes wrong even a tiny bit, your fs could be damaged really badly. For other corruption, like some fs tree corruption, we could do something to discard some corrupted files, but if it's the extent tree, we either mount RO and grab anything we have, or hope the almost-never-working --init-extent-tree works (that's mostly a miracle).

I understand that it's the weak point of btrfs, thanks for explaining.

1) Don't keep too many snapshots. Really, this is the core.

For send/receive backup, IIRC it only needs the parent subvolume to exist; there is no need to keep the whole history of all those snapshots.

You are correct on history. The reason I keep history is because I may want to recover a file from last week or 2 weeks ago after I finally notice that it's gone. I have terabytes of space on the backup server, so it's easier to keep history there than on the client which may not have enough space to keep a month's worth of history. As you know, back when we did tape backups, we also kept history of at least several weeks (usually several months, but that's too much for btrfs snapshots).

Bit of a case-study here, but it may be of interest. We do something kind of similar where I work for our internal file servers. We've got daily snapshots of the whole server kept on the server itself for 7 days (we usually see less than 5% of the total amount of data in changes on weekdays, and essentially 0 on weekends, so the snapshots rarely take up more than about 25% of the size of the live data), and then we additionally do daily backups which we retain for 6 months. I've written up a short (albeit rather system-specific) script for recovering old versions of a file that first scans the snapshots, and then pulls it out of the backups if it's not there. I've found this works remarkably well for our use case (almost all the data on the file server follows a WORM access pattern with most of the files being between 100kB and 100MB in size).

We actually did try moving it all over to BTRFS for a while before we finally ended up with the setup we currently have, but aside from the whole issue with massive numbers of snapshots, we found that for us at least, Amanda actually outperforms BTRFS send/receive for everything except full backups and uses less storage space (though that last bit is largely because we use really aggressive compression).
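A recovery helper of the kind described in that case study could look roughly like this. The layout of dated snapshot directories, the function name, and the fallback are all hypothetical; the real script is, as its author says, system specific:

```shell
# Hypothetical: look for a file in dated snapshot directories, newest
# first; a fallback to the long-term backups (not shown) would follow.
# Assumes snapshot dir names sort chronologically (e.g. YYYY-MM-DD) and
# contain no whitespace.
find_in_snapshots() {
    snapdir="$1"; relpath="$2"
    for snap in $(ls -1d "$snapdir"/*/ 2>/dev/null | sort -r); do
        if [ -e "$snap$relpath" ]; then
            printf '%s\n' "$snap$relpath"
            return 0
        fi
    done
    return 1    # not in any snapshot: fall through to the backup system
}
```

Scanning snapshots first is cheap (they are on the same server), and only a miss escalates to the slower restore from backups, which matches the 7-day/6-month split described above.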
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: So, does btrfs check lowmem take days? weeks?
Hi Qu, thanks for the detailed and honest answer. A few comments inline. On Mon, Jul 02, 2018 at 10:42:40PM +0800, Qu Wenruo wrote: > For full, it depends. (but for most real world cases, it's still flawed) > We have small and crafted images as test cases, which btrfs check can > repair without problem at all. > But such images are *SMALL*, and only have *ONE* type of corruption, > which can't represent real world cases at all. right, they're just unittest images, I understand. > 1) Too large fs (especially too many snapshots) >The use case (too many snapshots and shared extents, a lot of extents >get shared over 1000 times) is in fact a super large challenge for >lowmem mode check/repair. >It needs O(n^2) or even O(n^3) to check each backref, which hugely >slows the progress and makes it hard for us to locate the real bug. So, the non-lowmem version would work better, but it's a problem if it doesn't fit in RAM. I've always considered it a grave bug that btrfs check --repair can use so much kernel memory that it will crash the entire system. This should not be possible. While it won't help me here, can btrfs check be improved not to suck up all the kernel memory, and ideally even allow using swap space if the RAM is not enough? Is btrfs check regular mode still being maintained? I think it's still better than lowmem, correct? > 2) Corruption in extent tree and our objective is to mount RW >Extent tree is almost useless if we just want to read data. >But when we do any write, we need it, and if it goes wrong even a >tiny bit, your fs could be damaged really badly. > >For other corruption, like some fs tree corruption, we could do >something to discard some corrupted files, but if it's extent tree, >we either mount RO and grab anything we have, or hope the >almost-never-working --init-extent-tree can work (that's mostly a >miracle). I understand that it's the weak point of btrfs, thanks for explaining. > 1) Don't keep too many snapshots. >Really, this is the core. 
>For send/receive backup, IIRC it only needs the parent subvolume to >exist; there is no need to keep the whole history of all those >snapshots. You are correct on history. The reason I keep history is because I may want to recover a file from last week or 2 weeks ago after I finally notice that it's gone. I have terabytes of space on the backup server, so it's easier to keep history there than on the client, which may not have enough space to keep a month's worth of history. As you know, back when we did tape backups, we also kept history of at least several weeks (usually several months, but that's too much for btrfs snapshots). >Keeping the number of snapshots minimal does greatly improve the >possibility (whether manual patch or check --repair) of a successful >repair. >Normally I would suggest 4 hourly snapshots, 7 daily snapshots, 12 >monthly snapshots. I actually have fewer snapshots than this per filesystem, but I back up more than 10 filesystems. If I used as many snapshots as you recommend, that would already be 230 snapshots for 10 filesystems :) Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
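A retention policy like the one suggested above (keep the newest N snapshots per class, delete the rest) can be sketched in a few lines of shell. The snapshot directory and the "class.YYYY-MM-DD" naming below are hypothetical, and the actual subvolume delete is left commented out:

```shell
#!/bin/bash
# Sketch of pruning to a keep-newest-N policy: list snapshots whose names
# start with "$1." and print all but the newest $2 of them.
# $SNAPDIR and the "<class>.<ISO date>" naming are assumptions.
prune_candidates() {
    local class=$1 keep=$2
    # ISO dates sort lexicographically, so a plain sort orders by age;
    # head -n -K (GNU) drops the last K lines, i.e. keeps the newest K.
    ls -d "${SNAPDIR:-/mnt/btrfs_pool2/snaps}/$class".* 2>/dev/null |
        sort | head -n -"$keep"
}

# Example: show what a "keep 7 dailies" pass would delete.
for snap in $(prune_candidates daily 7); do
    echo "would delete: $snap"
    # btrfs subvolume delete "$snap"   # uncomment once satisfied
done
```

Run per class (hourly/daily/monthly) with the counts Qu suggests, this keeps each filesystem near the 23-snapshot budget he describes.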
Re: So, does btrfs check lowmem take days? weeks?
On 2018年07月02日 22:05, Marc MERLIN wrote: > On Mon, Jul 02, 2018 at 02:22:20PM +0800, Su Yue wrote: >>> Ok, that's 29MB, so it doesn't fit on pastebin: >>> http://marc.merlins.org/tmp/dshelf2_inspect.txt >>> >> Sorry Marc. After offline communication with Qu, both >> of us think the filesystem is hard to repair. >> The filesystem is too large to debug step by step. >> Every check and debug cycle is too expensive, >> and it has already cost several days. >> >> Sadly, I am afraid that you have to recreate the filesystem >> and back up your data again. :( >> >> Sorry again, and thanks for your reports and patience. > > I appreciate your help. Honestly I only wanted to help you find why the > tools aren't working. Fixing filesystems by hand (and remotely via Email > on top of that), is way too time consuming like you said. > > Is the btrfs design flawed in a way that repair tools just cannot repair > on their own? For short and for your case, yes: you can consider the repair tools garbage and shouldn't use them on any production system. For full, it depends. (but for most real world cases, it's still flawed) We have small and crafted images as test cases, which btrfs check can repair without problem at all. But such images are *SMALL*, and only have *ONE* type of corruption, which can't represent real world cases at all. > I understand that data can be lost, but I don't understand how the tools > just either keep crashing for me, go in infinite loops, or otherwise > fail to give me back a stable filesystem, even if some data is missing > after that. There are several reasons here that the repair tools can't help much: 1) Too large fs (especially too many snapshots) The use case (too many snapshots and shared extents, a lot of extents get shared over 1000 times) is in fact a super large challenge for lowmem mode check/repair. It needs O(n^2) or even O(n^3) to check each backref, which hugely slows the progress and makes it hard for us to locate the real bug. 
2) Corruption in extent tree and our objective is to mount RW Extent tree is almost useless if we just want to read data. But when we do any write, we need it, and if it goes wrong even a tiny bit, your fs could be damaged really badly. For other corruption, like some fs tree corruption, we could do something to discard some corrupted files, but if it's extent tree, we either mount RO and grab anything we have, or hope the almost-never-working --init-extent-tree can work (that's mostly a miracle). So, I feel very sorry that we can't provide enough help for your case. But still, we hope to provide some tips for your next build if you still want to choose btrfs. 1) Don't keep too many snapshots. Really, this is the core. For send/receive backup, IIRC it only needs the parent subvolume to exist; there is no need to keep the whole history of all those snapshots. Keeping the number of snapshots minimal does greatly improve the possibility (whether manual patch or check repair) of a successful repair. Normally I would suggest 4 hourly snapshots, 7 daily snapshots, 12 monthly snapshots. 2) Don't keep unrelated snapshots in one btrfs. I totally understand that maintaining different btrfs filesystems would hugely add maintenance pressure, but as explained, all snapshots share one fragile extent tree. If we isolate the fragile extent trees from each other, it's less likely that a single extent tree corruption takes down the whole fs. Thanks, Qu > > Thanks, > Marc
Re: So, does btrfs check lowmem take days? weeks?
On Mon, Jul 02, 2018 at 02:22:20PM +0800, Su Yue wrote: > > Ok, that's 29MB, so it doesn't fit on pastebin: > > http://marc.merlins.org/tmp/dshelf2_inspect.txt > > > Sorry Marc. After offline communication with Qu, both > of us think the filesystem is hard to repair. > The filesystem is too large to debug step by step. > Every check and debug cycle is too expensive, > and it has already cost several days. > > Sadly, I am afraid that you have to recreate the filesystem > and back up your data again. :( > > Sorry again, and thanks for your reports and patience. I appreciate your help. Honestly I only wanted to help you find why the tools aren't working. Fixing filesystems by hand (and remotely via Email on top of that), is way too time consuming like you said. Is the btrfs design flawed in a way that repair tools just cannot repair on their own? I understand that data can be lost, but I don't understand how the tools just either keep crashing for me, go in infinite loops, or otherwise fail to give me back a stable filesystem, even if some data is missing after that. Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On 07/02/2018 11:22 AM, Marc MERLIN wrote: On Mon, Jul 02, 2018 at 10:02:33AM +0800, Su Yue wrote: Could you try the following dumps? They shouldn't cost much time. #btrfs inspect dump-tree -t 21872 | grep -C 50 "374857 EXTENT_DATA " #btrfs inspect dump-tree -t 22911 | grep -C 50 "374857 EXTENT_DATA " Ok, that's 29MB, so it doesn't fit on pastebin: http://marc.merlins.org/tmp/dshelf2_inspect.txt Sorry Marc. After offline communication with Qu, both of us think the filesystem is hard to repair. The filesystem is too large to debug step by step. Every check and debug cycle is too expensive, and it has already cost several days. Sadly, I am afraid that you have to recreate the filesystem and back up your data again. :( Sorry again, and thanks for your reports and patience. Su Marc
Re: So, does btrfs check lowmem take days? weeks?
On Mon, Jul 02, 2018 at 10:02:33AM +0800, Su Yue wrote: > Could you try the following dumps? They shouldn't cost much time. > > #btrfs inspect dump-tree -t 21872 | grep -C 50 "374857 > EXTENT_DATA " > > #btrfs inspect dump-tree -t 22911 | grep -C 50 "374857 > EXTENT_DATA " Ok, that's 29MB, so it doesn't fit on pastebin: http://marc.merlins.org/tmp/dshelf2_inspect.txt Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/
Re: So, does btrfs check lowmem take days? weeks?
On 07/02/2018 07:22 AM, Marc MERLIN wrote: On Thu, Jun 28, 2018 at 11:43:54PM -0700, Marc MERLIN wrote: On Fri, Jun 29, 2018 at 02:32:44PM +0800, Su Yue wrote: https://github.com/Damenly/btrfs-progs/tree/tmp1 Not sure if I understand what you meant here. Sorry for my unclear words. Simply speaking, I suggest you stop the currently running check. Then, clone the above branch, compile the binary, and run 'btrfs check --mode=lowmem $dev'. I understand, I'll build and try it. This filesystem is trash to me and will require over a week to rebuild manually if I can't repair it. Understood your anxiety; a log of check without '--repair' will help us make clear what's wrong with your filesystem. Ok, I'll run your new code without repair and report back. It will likely take over a day though. Well, it got stuck for over a day, and then I had to reboot :( saruman:/var/local/src/btrfs-progs.sy# git remote -v origin https://github.com/Damenly/btrfs-progs.git (fetch) origin https://github.com/Damenly/btrfs-progs.git (push) saruman:/var/local/src/btrfs-progs.sy# git branch master * tmp1 saruman:/var/local/src/btrfs-progs.sy# git pull Already up to date. saruman:/var/local/src/btrfs-progs.sy# make Making all in Documentation make[1]: Nothing to be done for 'all'. However, it still got stuck here: Thanks, I saw. Some clues found. Could you try the following dumps? They shouldn't cost much time. 
#btrfs inspect dump-tree -t 21872 | grep -C 50 "374857 EXTENT_DATA " #btrfs inspect dump-tree -t 22911 | grep -C 50 "374857 EXTENT_DATA " Thanks, Su gargamel:~# btrfs check --mode=lowmem -p /dev/mapper/dshelf2 Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 2, have: 3 ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 2, have: 4 ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 180, have: 181 ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 67, have: 68 ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 67, have: 115 ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 114, have: 115 ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 114, have: 143 ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 301, have: 302 ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 355, have: 433 ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 160, have: 161 ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 161, have: 240 ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 169, have: 170 ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 171, have: 251 ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 347, have: 348 ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 1, have: 1449 ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 1, have: 556 What should I try next? Thanks, Marc
Re: So, does btrfs check lowmem take days? weeks?
On Thu, Jun 28, 2018 at 11:43:54PM -0700, Marc MERLIN wrote: > On Fri, Jun 29, 2018 at 02:32:44PM +0800, Su Yue wrote: > > > > https://github.com/Damenly/btrfs-progs/tree/tmp1 > > > > > > Not sure if I understand what you meant here. > > > > > Sorry for my unclear words. > > Simply speaking, I suggest you stop the currently running check. > > Then, clone the above branch, compile the binary, and run > > 'btrfs check --mode=lowmem $dev'. > > I understand, I'll build and try it. > > > > This filesystem is trash to me and will require over a week to rebuild > > > manually if I can't repair it. > > > > Understood your anxiety; a log of check without '--repair' will help > > us make clear what's wrong with your filesystem. > > Ok, I'll run your new code without repair and report back. It will > likely take over a day though. Well, it got stuck for over a day, and then I had to reboot :( saruman:/var/local/src/btrfs-progs.sy# git remote -v origin https://github.com/Damenly/btrfs-progs.git (fetch) origin https://github.com/Damenly/btrfs-progs.git (push) saruman:/var/local/src/btrfs-progs.sy# git branch master * tmp1 saruman:/var/local/src/btrfs-progs.sy# git pull Already up to date. saruman:/var/local/src/btrfs-progs.sy# make Making all in Documentation make[1]: Nothing to be done for 'all'. 
However, it still got stuck here: gargamel:~# btrfs check --mode=lowmem -p /dev/mapper/dshelf2 Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 2, have: 3 ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 2, have: 4 ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 180, have: 181 ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 67, have: 68 ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 67, have: 115 ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 114, have: 115 ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 114, have: 143 ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 301, have: 302 ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 355, have: 433 ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 160, have: 161 ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 161, have: 240 ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 169, have: 170 ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 171, have: 251 ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 347, have: 348 ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 1, have: 1449 ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 1, have: 556 What should I try next? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Sat, Jun 30, 2018 at 10:49:07PM +0800, Qu Wenruo wrote: > But the last abort looks pretty possible to be the culprit. > > Would you try to dump the extent tree? > # btrfs inspect dump-tree -t extent | grep -A50 156909494272 Sure, there you go: item 25 key (156909494272 EXTENT_ITEM 55320576) itemoff 14943 itemsize 24 refs 19715 gen 31575 flags DATA item 26 key (156909494272 EXTENT_DATA_REF 571620086735451015) itemoff 14915 itemsize 28 extent data backref root 21641 objectid 374857 offset 235175936 count 1452 item 27 key (156909494272 EXTENT_DATA_REF 1765833482087969671) itemoff 14887 itemsize 28 extent data backref root 23094 objectid 374857 offset 235175936 count 1442 item 28 key (156909494272 EXTENT_DATA_REF 1807626434455810951) itemoff 14859 itemsize 28 extent data backref root 21503 objectid 374857 offset 235175936 count 1454 item 29 key (156909494272 EXTENT_DATA_REF 1879818091602916231) itemoff 14831 itemsize 28 extent data backref root 21462 objectid 374857 offset 235175936 count 1454 item 30 key (156909494272 EXTENT_DATA_REF 3610854505775117191) itemoff 14803 itemsize 28 extent data backref root 23134 objectid 374857 offset 235175936 count 1442 item 31 key (156909494272 EXTENT_DATA_REF 3754675454231458695) itemoff 14775 itemsize 28 extent data backref root 23052 objectid 374857 offset 235175936 count 1442 item 32 key (156909494272 EXTENT_DATA_REF 5060494667839714183) itemoff 14747 itemsize 28 extent data backref root 23174 objectid 374857 offset 235175936 count 1440 item 33 key (156909494272 EXTENT_DATA_REF 5476627808561673095) itemoff 14719 itemsize 28 extent data backref root 22911 objectid 374857 offset 235175936 count 1 item 34 key (156909494272 EXTENT_DATA_REF 6378484416458011527) itemoff 14691 itemsize 28 extent data backref root 23012 objectid 374857 offset 235175936 count 1442 item 35 key (156909494272 EXTENT_DATA_REF 7338474132555182983) itemoff 14663 itemsize 28 extent data backref root 21872 objectid 374857 offset 235175936 count 1 item 36 key 
(156909494272 EXTENT_DATA_REF 7516565391717970823) itemoff 14635 itemsize 28 extent data backref root 21826 objectid 374857 offset 235175936 count 1452 item 37 key (156909494272 SHARED_DATA_REF 14871537025024) itemoff 14631 itemsize 4 shared data backref count 10 item 38 key (156909494272 SHARED_DATA_REF 14871617568768) itemoff 14627 itemsize 4 shared data backref count 73 item 39 key (156909494272 SHARED_DATA_REF 14871619846144) itemoff 14623 itemsize 4 shared data backref count 59 item 40 key (156909494272 SHARED_DATA_REF 14871623270400) itemoff 14619 itemsize 4 shared data backref count 68 item 41 key (156909494272 SHARED_DATA_REF 14871623532544) itemoff 14615 itemsize 4 shared data backref count 70 item 42 key (156909494272 SHARED_DATA_REF 14871626383360) itemoff 14611 itemsize 4 shared data backref count 76 item 43 key (156909494272 SHARED_DATA_REF 14871635132416) itemoff 14607 itemsize 4 shared data backref count 60 item 44 key (156909494272 SHARED_DATA_REF 14871649533952) itemoff 14603 itemsize 4 shared data backref count 79 item 45 key (156909494272 SHARED_DATA_REF 14871862378496) itemoff 14599 itemsize 4 shared data backref count 70 item 46 key (156909494272 SHARED_DATA_REF 14909667098624) itemoff 14595 itemsize 4 shared data backref count 72 item 47 key (156909494272 SHARED_DATA_REF 14909669720064) itemoff 14591 itemsize 4 shared data backref count 58 item 48 key (156909494272 SHARED_DATA_REF 14909734567936) itemoff 14587 itemsize 4 shared data backref count 73 item 49 key (156909494272 SHARED_DATA_REF 14909920477184) itemoff 14583 itemsize 4 shared data backref count 79 item 50 key (156909494272 SHARED_DATA_REF 14942279335936) itemoff 14579 itemsize 4 shared data backref count 79 item 51 key (156909494272 SHARED_DATA_REF 14942304862208) itemoff 14575 itemsize 4 shared data backref count 72 item 52 key (156909494272 SHARED_DATA_REF 14942348378112) itemoff 14571 itemsize 4 shared data backref count 67 item 53 key (156909494272 SHARED_DATA_REF 
14942366138368) itemoff 14567 itemsize 4 shared data backref count 51 item 54 key (156909494272 SHARED_DATA_REF 14942384799744) itemoff 14563 itemsize 4 shared data backref count 64 item 55 key (156909494272 SHARED_DATA_REF 14978234613760)
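A quick way to sanity-check a dump like the one above is to sum the per-backref "count N" fields and compare them with the "refs N" total on the EXTENT_ITEM line. A rough awk sketch (it only handles the line shapes shown in this dump, and the piped usage below is illustrative):

```shell
# Sum the per-backref "count N" fields from a `btrfs inspect dump-tree`
# excerpt on stdin and print them next to the EXTENT_ITEM "refs N" total.
# A mismatch between the two is the same kind of inconsistency btrfs check
# reports as "referencer count mismatch".
sum_backrefs() {
    awk '
    /refs [0-9]+ gen/            { total = $2 }  # EXTENT_ITEM body line
    /backref/ && /count [0-9]+$/ { sum += $NF }  # EXTENT_DATA_REF / SHARED_DATA_REF
    END { printf "refs=%d sum=%d\n", total, sum }
    '
}

# Usage (illustrative):
#   btrfs inspect dump-tree -t extent /dev/mapper/dshelf2 |
#       grep -A50 156909494272 | sum_backrefs
```

Note that the dump above is truncated by the `grep -A50`, so on real output the sum would only cover the backref items that made it through the filter.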
Re: So, does btrfs check lowmem take days? weeks?
On 2018年06月30日 10:44, Marc MERLIN wrote: > Well, there goes that. After about 18H: > ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, > owner: 374857, offset: 235175936) wanted: 1, have: 1452 > backref.c:466: __add_missing_keys: Assertion `ref->root_id` failed, value 0 > btrfs(+0x3a232)[0x56091704f232] > btrfs(+0x3ab46)[0x56091704fb46] > btrfs(+0x3b9f5)[0x5609170509f5] > btrfs(btrfs_find_all_roots+0x9)[0x560917050a45] > btrfs(+0x572ff)[0x56091706c2ff] > btrfs(+0x60b13)[0x560917075b13] > btrfs(cmd_check+0x2634)[0x56091707d431] > btrfs(main+0x88)[0x560917027260] > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f93aa508561] > btrfs(_start+0x2a)[0x560917026dfa] > Aborted I think that's the root cause. Some invalid extent tree backref or bad tree block blows up the backref code. All previous error messages may be garbage unless you're using Su's latest branch, as lowmem mode tends to report false alerts on referencer count mismatches. But the last abort looks pretty likely to be the culprit. Would you try to dump the extent tree? # btrfs inspect dump-tree -t extent | grep -A50 156909494272 It should help us locate the culprit and hopefully get some chance to fix it. Thanks, Qu > > That's https://github.com/Damenly/btrfs-progs.git > > Whoops, I didn't use the tmp1 branch, let me try again with that and > report back, although the problem above is still going to be there since > I think the only difference will be this, correct? > https://github.com/Damenly/btrfs-progs/commit/b5851513a12237b3e19a3e71f3ad00b966d25b3a > > Marc
Re: So, does btrfs check lowmem take days? weeks?
Well, there goes that. After about 18H: ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 1, have: 1452 backref.c:466: __add_missing_keys: Assertion `ref->root_id` failed, value 0 btrfs(+0x3a232)[0x56091704f232] btrfs(+0x3ab46)[0x56091704fb46] btrfs(+0x3b9f5)[0x5609170509f5] btrfs(btrfs_find_all_roots+0x9)[0x560917050a45] btrfs(+0x572ff)[0x56091706c2ff] btrfs(+0x60b13)[0x560917075b13] btrfs(cmd_check+0x2634)[0x56091707d431] btrfs(main+0x88)[0x560917027260] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7f93aa508561] btrfs(_start+0x2a)[0x560917026dfa] Aborted That's https://github.com/Damenly/btrfs-progs.git Whoops, I didn't use the tmp1 branch, let me try again with that and report back, although the problem above is still going to be there since I think the only difference will be this, correct? https://github.com/Damenly/btrfs-progs/commit/b5851513a12237b3e19a3e71f3ad00b966d25b3a Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
I've got about 1/2 the snapshots and less than 1/10th the data...but my btrfs check times are much shorter than either: 15 minutes and 65 minutes (lowmem). [chris@f28s ~]$ sudo btrfs fi us /mnt/first Overall: Device size: 1024.00GiB Device allocated: 774.12GiB Device unallocated: 249.87GiB Device missing: 0.00B Used: 760.48GiB Free (estimated): 256.95GiB (min: 132.01GiB) Data ratio: 1.00 Metadata ratio: 2.00 Global reserve: 512.00MiB (used: 0.00B) Data,single: Size:761.00GiB, Used:753.93GiB /dev/mapper/first 761.00GiB Metadata,DUP: Size:6.50GiB, Used:3.28GiB /dev/mapper/first 13.00GiB System,DUP: Size:64.00MiB, Used:112.00KiB /dev/mapper/first 128.00MiB Unallocated: /dev/mapper/first 249.87GiB 146 subvolumes 137 snapshots total csum bytes: 790549924 total tree bytes: 3519250432 total fs tree bytes: 2546073600 total extent tree bytes: 131350528 Original mode check takes ~15 minutes Lowmem mode takes ~65 minutes RAM: 4G CPU: Intel(R) Pentium(R) CPU N3700 @ 1.60GHz Chris Murphy
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 12:28:31AM -0700, Marc MERLIN wrote: > So, I rebooted, and will now run Su's btrfs check without repair and > report back. As expected, it will likely still take days, here's the start: gargamel:~# btrfs check --mode=lowmem -p /dev/mapper/dshelf2 Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 2, have: 4 ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 2, have: 4 ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 180, have: 240 ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 67, have: 115 ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 67, have: 115 ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 114, have: 143 ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 114, have: 143 ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 301, have: 431 ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 355, have: 433 ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 160, have: 240 ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 161, have: 240 ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 169, have: 249 ERROR: extent[147895111680, 12345344] referencer 
count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 171, have: 251 ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 347, have: 418 ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 1, have: 1449 ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 1, have: 1452 Mmmh, these look similar (but not identical) to the last run earlier in this thread: ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 3, have: 4 Created new chunk [18457780224000 1073741824] Delete backref in extent [84302495744 69632] ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 3, have: 4 Delete backref in extent [84302495744 69632] ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 181, have: 240 Delete backref in extent [125712527360 12214272] ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 68, have: 115 Delete backref in extent [125730848768 5111808] ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 68, have: 115 Delete backref in extent [125730848768 5111808] ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 115, have: 143 Delete backref in extent [125736914944 6037504] ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 115, have: 143 Delete backref in extent [125736914944 6037504] ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 302, 
have: 431 Delete backref in extent [129952120832 20242432] ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 356, have: 433 Delete backref in extent [129952120832 20242432] ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 161, have: 240 Delete backref in extent [134925357056 11829248] ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 162, have: 240 Delete backref in extent [134925357056 11829248] ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 170, have: 249 Delete backref in extent [147895111680 12345344] ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 172, have: 251 Delete backref in extent
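Since the two runs' mismatch lists above look similar but not identical, one way to see exactly which tuples changed is to normalize both logs down to just the mismatch lines and diff them. A rough sketch; the log file names in the usage comment are hypothetical:

```shell
# Extract just the referencer-count-mismatch tuples from a btrfs check log,
# dropping the ERROR: prefix and any surrounding noise, so two runs can be
# compared line by line with diff.
mismatches() {
    grep -o 'extent\[[^]]*\] referencer count mismatch (root: [0-9]*, owner: [0-9]*, offset: [0-9]*) wanted: [0-9]*, have: [0-9]*' "$1"
}

# Usage (illustrative):
#   diff <(mismatches check_run1.log) <(mismatches check_run2.log)
```

Lines that appear only on one side of the diff are the extents whose wanted/have numbers moved between runs.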
Re: So, does btrfs check lowmem take days? weeks?
Hi, On 29/06/2018 09:22, Marc MERLIN wrote: > On Fri, Jun 29, 2018 at 12:09:54PM +0500, Roman Mamedov wrote: >> On Thu, 28 Jun 2018 23:59:03 -0700 >> Marc MERLIN wrote: >> >>> I don't waste a week recreating the many btrfs send/receive relationships. >> Consider not using send/receive, and switching to regular rsync instead. >> Send/receive is very limiting and cumbersome, including because of what you >> described. And it doesn't gain you much over an incremental rsync. As for > Err, sorry but I cannot agree with you here, at all :) > > btrfs send/receive is pretty much the only reason I use btrfs. > rsync takes hours on big filesystems scanning every single inode on both > sides and then seeing what changed, and only then sends the differences > It's super inefficient. > btrfs send knows in seconds what needs to be sent, and works on it right > away. I've not yet tried send/receive, but I feel the pain of rsyncing millions of files (I had to use lsyncd to limit the problem to the time the origin servers reboot, which is a relatively rare event), so this thread piqued my attention. Looking at the whole thread, I wonder if you could get a more manageable solution by splitting the filesystem. If, instead of a single BTRFS filesystem, you used LVM volumes (maybe with thin provisioning and monitoring of the volume group's free space) with one BTRFS filesystem per server you back up, you would have fewer snapshots per filesystem and would isolate problems in case of corruption. If you eventually decide to start from scratch again, this might help a lot in your case. Lionel -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
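Lionel's layout suggestion can be sketched concretely. Everything below is hypothetical (volume group `vg0`, pool name, sizes, and the server list are made-up examples), and the `run` wrapper only prints each command rather than executing it, so this is a dry-run illustration of the shape of the setup, not a ready-to-use script:

```shell
#!/bin/sh
# Sketch: one thin LV (and one btrfs filesystem) per backed-up server,
# instead of a single large btrfs filesystem. All names and sizes are
# hypothetical. run() only prints the commands -- nothing is executed.
run() { printf '+ %s\n' "$*"; }

# One thin pool for the whole backup disk; thin volumes overcommit it,
# so the VG free space must be monitored as Lionel notes.
run lvcreate --type thin-pool -L 10T -n pool0 vg0

for host in web1 db1 mail1; do
    run lvcreate --type thin -V 2T --thinpool vg0/pool0 -n "backup-$host"
    run mkfs.btrfs "/dev/vg0/backup-$host"
done
```

The point of the split is damage containment: a corrupted extent tree on one per-host filesystem leaves the other hosts' backups mountable, and each `btrfs check` run only has to walk that host's snapshots.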
Re: So, does btrfs check lowmem take days? weeks?
On Fri, 29 Jun 2018 00:22:10 -0700 Marc MERLIN wrote: > On Fri, Jun 29, 2018 at 12:09:54PM +0500, Roman Mamedov wrote: > > On Thu, 28 Jun 2018 23:59:03 -0700 > > Marc MERLIN wrote: > > > > > I don't waste a week recreating the many btrfs send/receive relationships. > > > > Consider not using send/receive, and switching to regular rsync instead. > > Send/receive is very limiting and cumbersome, including because of what you > > described. And it doesn't gain you much over an incremental rsync. As for > > Err, sorry but I cannot agree with you here, at all :) > > btrfs send/receive is pretty much the only reason I use btrfs. > rsync takes hours on big filesystems scanning every single inode on both > sides and then seeing what changed, and only then sends the differences I use it for backing up root filesystems of about 20 hosts, and for syncing large multi-terabyte media collections -- it's fast enough in both. Admittedly neither of those cases has millions of subdirs or files where scanning may take a long time. And in the former case it's also all from and to SSDs. Maybe your use case is different where it doesn't work as well. But perhaps then general day-to-day performance is not great either, so I'd suggest looking into SSD-based LVM caching, it really works wonders with Btrfs. -- With respect, Roman
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 03:20:42PM +0800, Qu Wenruo wrote: > If certain btrfs specific operations are involved, it's definitely not OK: > 1) Balance > 2) Quota > 3) Btrfs check Ok, I understand. I'll try to balance almost never then. My problems did indeed start because I ran balance and it got stuck 2 days with 0 progress. That still seems like a bug though. I'm ok with slow, but stuck for 2 days with only 270 snapshots or so means there is a bug, or the algorithm is so expensive that 270 snapshots could cause it to take days or weeks to proceed? > > It's a backup server, it only contains data from other machines. > > If the filesystem cannot be recovered to a working state, I will need > > over a week to restart the many btrfs send commands from many servers. > > This is why anything other than --repair is useless to me, I don't need > > the data back, it's still on the original machines, I need the > > filesystem to work again so that I don't waste a week recreating the > > many btrfs send/receive relationships. > > Now totally understand why you need to repair the fs. I also understand that my use case is atypical :) But I guess this also means that using btrfs for a lot of send/receive on a backup server is not going to work well unfortunately :-/ Now I'm wondering if I'm the only person even doing this. > > Does the pastebin help and is 270 snapshots ok enough? > > The super dump doesn't show anything wrong. > > So the problem may be in the super large extent tree. > > In this case, plain check result with Su's patch would help more, other > than the not so interesting super dump.
First I tried to mount with skip_balance after the partial repair, and it hung a long time:

[445635.716318] BTRFS info (device dm-2): disk space caching is enabled
[445635.736229] BTRFS info (device dm-2): has skinny extents
[445636.101999] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[445825.053205] BTRFS info (device dm-2): enabling ssd optimizations
[446511.006588] BTRFS info (device dm-2): disk space caching is enabled
[446511.026737] BTRFS info (device dm-2): has skinny extents
[446511.325470] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[446699.593501] BTRFS info (device dm-2): enabling ssd optimizations
[446964.077045] INFO: task btrfs-transacti:9211 blocked for more than 120 seconds.
[446964.099802] Not tainted 4.17.2-amd64-preempt-sysrq-20180818 #3
[446964.120004] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

So, I rebooted, and will now run Su's btrfs check without repair and report back. Thanks both for your help. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 12:09:54PM +0500, Roman Mamedov wrote: > On Thu, 28 Jun 2018 23:59:03 -0700 > Marc MERLIN wrote: > > > I don't waste a week recreating the many btrfs send/receive relationships. > > Consider not using send/receive, and switching to regular rsync instead. > Send/receive is very limiting and cumbersome, including because of what you > described. And it doesn't gain you much over an incremental rsync. As for Err, sorry but I cannot agree with you here, at all :) btrfs send/receive is pretty much the only reason I use btrfs. rsync takes hours on big filesystems scanning every single inode on both sides and then seeing what changed, and only then sends the differences. It's super inefficient. btrfs send knows in seconds what needs to be sent, and works on it right away. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
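For readers who haven't used it, the incremental cycle Marc is describing looks roughly like this. The paths and snapshot names are made up for illustration, and the `show` wrapper prints each command instead of executing it, so this is a safe dry-run sketch:

```shell
#!/bin/sh
# Sketch of an incremental btrfs send/receive cycle. Paths and snapshot
# names are hypothetical. show() only prints the commands -- a dry run.
show() { printf '+ %s\n' "$*"; }

src=/data            # subvolume being backed up
dst=/mnt/backup      # receiving btrfs filesystem on the backup server

# First transfer: full send of an initial read-only snapshot.
show btrfs subvolume snapshot -r "$src" "$src/snap.0"
show "btrfs send $src/snap.0 | btrfs receive $dst"

# Later transfers: -p names the parent snapshot that already exists on
# both sides, so only the delta is computed and sent -- no full rescan.
show btrfs subvolume snapshot -r "$src" "$src/snap.1"
show "btrfs send -p $src/snap.0 $src/snap.1 | btrfs receive $dst"
```

The `-p` parent is why the delta is known "in seconds": it comes from comparing two snapshot trees, not from stat()ing every inode the way rsync must. The flip side, as this thread shows, is that every retained snapshot on the receiver adds references to the extent tree there.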
Re: So, does btrfs check lowmem take days? weeks?
On 2018年06月29日 14:59, Marc MERLIN wrote: > On Fri, Jun 29, 2018 at 02:29:10PM +0800, Qu Wenruo wrote: >>> If --repair doesn't work, check is useless to me sadly. >> >> Not exactly. >> Although it's time consuming, I have manually patched several users fs, >> which normally ends pretty well. > > Ok I understand now. > >>> Agreed, I doubt I have over or much over 100 snapshots though (but I >>> can't check right now). >>> Sadly I'm not allowed to mount even read only while check is running: >>> gargamel:~# mount -o ro /dev/mapper/dshelf2 /mnt/mnt2 >>> mount: /dev/mapper/dshelf2 already mounted or /mnt/mnt2 busy > > Ok, so I just checked now, 270 snapshots, but not because I'm crazy, > because I use btrfs send a lot :) > >> This looks like super block corruption? >> >> What about "btrfs inspect dump-super -fFa /dev/mapper/dshelf2"? > > Sure, there you go: https://pastebin.com/uF1pHTsg > >> And what about "skip_balance" mount option? > > I have this in my fstab :) > >> Another problem is, with so many snapshots, balance is also hugely >> slowed, thus I'm not 100% sure if it's really a hang. > > I sent another thread about this last week, balance got hung after 2 > days of doing nothing and just moving a single chunk. > > Ok, I was able to remount the filesystem read only. I was wrong, I have > 270 snapshots: > gargamel:/mnt/mnt# btrfs subvolume list . | grep -c 'path backup/' > 74 > gargamel:/mnt/mnt# btrfs subvolume list . | grep -c 'path backup-btrfssend/' > 196 > > It's a backup server, I use btrfs send for many machines and for each btrfs > send, I keep history, maybe 10 or so backups. So it adds up in the end. > > Is btrfs unable to deal with this well enough? It depends. For certain and rare cases, if the only operations to the filesystem are non-btrfs specific operations (POSIX file operations), then you're fine. 
(Maybe you can go thousands of snapshots before any obvious performance degradation) If certain btrfs specific operations are involved, it's definitely not OK: 1) Balance 2) Quota 3) Btrfs check > >> If for that usage, btrfs-restore would fit your use case more, >> Unfortunately it needs extra disk space and isn't good at restoring >> subvolume/snapshots. >> (Although it's much faster than repairing the possible corrupted extent >> tree) > > It's a backup server, it only contains data from other machines. > If the filesystem cannot be recovered to a working state, I will need > over a week to restart the many btrfs send commands from many servers. > This is why anything other than --repair is useless to me, I don't need > the data back, it's still on the original machines, I need the > filesystem to work again so that I don't waste a week recreating the > many btrfs send/receive relationships. Now totally understand why you need to repair the fs. > >>> Is that possible at all? >> >> At least for file recovery (fs tree repair), we have such behavior. >> >> However, the problem you hit (and a lot of users hit) is all about >> extent tree repair, which doesn't even go to file recovery. >> >> All the hassle is in the extent tree, and for extent tree, it's just good >> or bad. Any corruption in extent tree may lead to later bugs. >> The only way to avoid extent tree problems is to mount the fs RO. >> >> So, I'm afraid it is at least impossible for recent years. > > Understood, thanks for answering. > > Does the pastebin help and is 270 snapshots ok enough? The super dump doesn't show anything wrong. So the problem may be in the super large extent tree. In this case, plain check result with Su's patch would help more, other than the not so interesting super dump. Thanks, Qu > > Thanks, > Marc
Re: So, does btrfs check lowmem take days? weeks?
On Thu, 28 Jun 2018 23:59:03 -0700 Marc MERLIN wrote: > I don't waste a week recreating the many btrfs send/receive relationships. Consider not using send/receive, and switching to regular rsync instead. Send/receive is very limiting and cumbersome, including because of what you described. And it doesn't gain you much over an incremental rsync. As for snapshots on the backup server, you can either automate making one as soon as a backup has finished, or simply make them once/twice a day, during a period when no backups are ongoing. -- With respect, Roman
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 02:29:10PM +0800, Qu Wenruo wrote: > > If --repair doesn't work, check is useless to me sadly. > > Not exactly. > Although it's time consuming, I have manually patched several users fs, > which normally ends pretty well. Ok I understand now. > > Agreed, I doubt I have over or much over 100 snapshots though (but I > > can't check right now). > > Sadly I'm not allowed to mount even read only while check is running: > > gargamel:~# mount -o ro /dev/mapper/dshelf2 /mnt/mnt2 > > mount: /dev/mapper/dshelf2 already mounted or /mnt/mnt2 busy Ok, so I just checked now, 270 snapshots, but not because I'm crazy, because I use btrfs send a lot :) > This looks like super block corruption? > > What about "btrfs inspect dump-super -fFa /dev/mapper/dshelf2"? Sure, there you go: https://pastebin.com/uF1pHTsg > And what about "skip_balance" mount option? I have this in my fstab :) > Another problem is, with so many snapshots, balance is also hugely > slowed, thus I'm not 100% sure if it's really a hang. I sent another thread about this last week, balance got hung after 2 days of doing nothing and just moving a single chunk. Ok, I was able to remount the filesystem read only. I was wrong, I have 270 snapshots: gargamel:/mnt/mnt# btrfs subvolume list . | grep -c 'path backup/' 74 gargamel:/mnt/mnt# btrfs subvolume list . | grep -c 'path backup-btrfssend/' 196 It's a backup server, I use btrfs send for many machines and for each btrs send, I keep history, maybe 10 or so backups. So it adds up in the end. Is btrfs unable to deal with this well enough? > If for that usage, btrfs-restore would fit your use case more, > Unfortunately it needs extra disk space and isn't good at restoring > subvolume/snapshots. > (Although it's much faster than repairing the possible corrupted extent > tree) It's a backup server, it only contains data from other machines. 
If the filesystem cannot be recovered to a working state, I will need over a week to restart the many btrfs send commands from many servers. This is why anything other than --repair is useless to me, I don't need the data back, it's still on the original machines, I need the filesystem to work again so that I don't waste a week recreating the many btrfs send/receive relationships. > > Is that possible at all? > > At least for file recovery (fs tree repair), we have such behavior. > > However, the problem you hit (and a lot of users hit) is all about > extent tree repair, which doesn't even go to file recovery. > > All the hassle is in the extent tree, and for extent tree, it's just good > or bad. Any corruption in extent tree may lead to later bugs. > The only way to avoid extent tree problems is to mount the fs RO. > > So, I'm afraid it is at least impossible for recent years. Understood, thanks for answering. Does the pastebin help and is 270 snapshots ok enough? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 02:32:44PM +0800, Su Yue wrote: > > > https://github.com/Damenly/btrfs-progs/tree/tmp1 > > > > Not sure if I understand what you meant, here. > > > Sorry for my unclear words. > Simply speaking, I suggest you stop the currently running check. > Then, clone the above branch, compile the binary, then run > 'btrfs check --mode=lowmem $dev'. I understand, I'll build and try it. > > This filesystem is trash to me and will require over a week to rebuild > > manually if I can't repair it. > > Understood your anxiety, a log of check without '--repair' will help > us to make clear what's wrong with your filesystem. Ok, I'll run your new code without repair and report back. It will likely take over a day though. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
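The build-and-check sequence Marc agrees to run would look roughly like the following. The configure options are assumptions (build steps vary between btrfs-progs versions), the device path is the one from this thread, and the `run` wrapper only prints each command, so nothing here actually builds or touches a filesystem:

```shell
#!/bin/sh
# Rough sketch of building Su's patched branch and running a read-only
# lowmem check. Configure options are assumptions; run() only prints
# the commands -- a dry run, not a tested build recipe.
run() { printf '+ %s\n' "$*"; }

run git clone -b tmp1 https://github.com/Damenly/btrfs-progs.git
run cd btrfs-progs
run ./autogen.sh
run ./configure --disable-documentation
run make

# Read-only first, as Su asks; --repair only after the log is reviewed.
run ./btrfs check --mode=lowmem /dev/mapper/dshelf2
```

Running the freshly built `./btrfs` from the source tree (rather than the distro's `/usr/bin/btrfs`) matters here, since the whole point is to pick up the unreleased fix on the `tmp1` branch.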
Re: So, does btrfs check lowmem take days? weeks?
On 2018年06月29日 14:06, Marc MERLIN wrote: > On Fri, Jun 29, 2018 at 01:48:17PM +0800, Qu Wenruo wrote: >> Just normal btrfs check, and post the output. >> If normal check eats up all your memory, btrfs check --mode=lowmem. > > Does check without --repair eat less RAM? Unfortunately, no. > >> --repair should be considered as the last method. > > If --repair doesn't work, check is useless to me sadly. Not exactly. Although it's time consuming, I have manually patched several users fs, which normally ends pretty well. If it's not a wide-spread problem but some small fatal one, it may be fixed. > I know that for > FS analysis and bug reporting, you want to have the FS without changing > it to something maybe worse, but for my use, if it can't be mounted and > can't be fixed, then it gets deleted which is even worse than check > doing the wrong thing. > >>> The last two ERROR lines took over a day to get generated, so I'm not sure >>> if it's still working, but just slowly. >> >> OK, that explains something. >> >> One extent is referred hundreds times, no wonder it will take a long time. >> >> Just one tip here, there are really too many snapshots/reflinked files. >> It's highly recommended to keep the number of snapshots to a reasonable >> number (lower two digits). >> Although btrfs snapshot is super fast, it puts a lot of pressure on its >> extent tree, so there is no free lunch here. > > Agreed, I doubt I have over or much over 100 snapshots though (but I > can't check right now). > Sadly I'm not allowed to mount even read only while check is running: > gargamel:~# mount -o ro /dev/mapper/dshelf2 /mnt/mnt2 > mount: /dev/mapper/dshelf2 already mounted or /mnt/mnt2 busy > >>> I see. Is there any reasonably easy way to check on this running process? >> >> GDB attach would be good. >> Interrupt and check the inode number if it's checking fs tree. >> Check the extent bytenr number if it's checking extent tree. 
>> >> But considering how many snapshots there are, it's really hard to determine. >> >> In this case, the super large extent tree is causing a lot of problems, >> maybe it's a good idea to allow btrfs check to skip extent tree check? > I only see --init-extent-tree in the man page, which option did you have > in mind? That feature is just in my mind, not even implemented yet. > >>> Then again, maybe it already fixed enough that I can mount my filesystem >>> again. >> >> This needs the initial btrfs check report and the kernel messages how it >> fails to mount. > > mount command hangs, kernel does not show anything special outside of disk > access hanging.

> Jun 23 17:23:26 gargamel kernel: [ 341.802696] BTRFS warning (device dm-2): 'recovery' is deprecated, use 'usebackuproot' instead
> Jun 23 17:23:26 gargamel kernel: [ 341.828743] BTRFS info (device dm-2): trying to use backup root at mount time
> Jun 23 17:23:26 gargamel kernel: [ 341.850180] BTRFS info (device dm-2): disk space caching is enabled
> Jun 23 17:23:26 gargamel kernel: [ 341.869014] BTRFS info (device dm-2): has skinny extents
> Jun 23 17:23:26 gargamel kernel: [ 342.206289] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> Jun 23 17:26:26 gargamel kernel: [ 521.571392] BTRFS info (device dm-2): enabling ssd optimizations
> Jun 23 17:55:58 gargamel kernel: [ 2293.914867] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
> Jun 23 17:56:22 gargamel kernel: [ 2317.718406] BTRFS info (device dm-2): disk space caching is enabled
> Jun 23 17:56:22 gargamel kernel: [ 2317.737277] BTRFS info (device dm-2): has skinny extents
> Jun 23 17:56:22 gargamel kernel: [ 2318.069461] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
> Jun 23 17:59:22 gargamel kernel: [ 2498.256167] BTRFS info (device dm-2): enabling ssd optimizations
> Jun 23 18:05:23 gargamel kernel: [ 2859.107057] BTRFS info (device dm-2): disk space caching is enabled
> Jun 23 18:05:23 gargamel kernel: [ 2859.125883] BTRFS info (device dm-2): has skinny extents
> Jun 23 18:05:24 gargamel kernel: [ 2859.448018] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0

This looks like super block corruption? What about "btrfs inspect dump-super -fFa /dev/mapper/dshelf2"? And what about "skip_balance" mount option? Another problem is, with so many snapshots, balance is also hugely slowed, thus I'm not 100% sure if it's really a hang.

> Jun 23 18:08:23 gargamel kernel: [ 3039.023305] BTRFS info (device dm-2): enabling ssd optimizations
> Jun 23 18:13:41 gargamel kernel: [ 3356.626037] perf: interrupt took too long (3143 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
> Jun 23 18:17:23 gargamel kernel: [ 3578.937225] Process accounting resumed
> Jun 23 18:33:47 gargamel kernel: [ 4563.356252] JFS: nTxBlock = 8192,
Re: So, does btrfs check lowmem take days? weeks?
On 06/29/2018 02:10 PM, Marc MERLIN wrote: On Fri, Jun 29, 2018 at 02:02:19PM +0800, Su Yue wrote: I have figured out the bug is lowmem check can't deal with shared tree block in reloc tree. The fix is simple, you can try the following repo: https://github.com/Damenly/btrfs-progs/tree/tmp1 Not sure if I understand what you meant, here. Sorry for my unclear words. Simply speaking, I suggest you stop the currently running check. Then, clone the above branch, compile the binary, then run 'btrfs check --mode=lowmem $dev'. Please run lowmem check without "--repair" first to be sure whether your filesystem is fine. The filesystem is not fine, it caused btrfs balance to hang, whether balance actually broke it further or caused the breakage, I can't say. Then mount hangs, even with recovery, unless I use ro. This filesystem is trash to me and will require over a week to rebuild manually if I can't repair it. Understood your anxiety, a log of check without '--repair' will help us to make clear what's wrong with your filesystem. Thanks, Su Running check without repair for likely several days just to know that my filesystem is not clean (I already know this) isn't useful :) Or am I missing something? Though the bug and phenomenon are clear enough, before sending my patch, I have to make a test image. I have spent a week studying btrfs balance but it seems a little hard for me. thanks for having a look, either way. Marc
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 02:02:19PM +0800, Su Yue wrote: > I have figured out the bug is lowmem check can't deal with shared tree block > in reloc tree. The fix is simple, you can try the following repo: > > https://github.com/Damenly/btrfs-progs/tree/tmp1 Not sure if I understand what you meant, here. > Please run lowmem check without "--repair" first to be sure whether > your filesystem is fine. The filesystem is not fine, it caused btrfs balance to hang, whether balance actually broke it further or caused the breakage, I can't say. Then mount hangs, even with recovery, unless I use ro. This filesystem is trash to me and will require over a week to rebuild manually if I can't repair it. Running check without repair for likely several days just to know that my filesystem is not clean (I already know this) isn't useful :) Or am I missing something? > Though the bug and phenomenon are clear enough, before sending my patch, > I have to make a test image. I have spent a week studying btrfs balance > but it seems a little hard for me. thanks for having a look, either way. Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 01:48:17PM +0800, Qu Wenruo wrote: > Just normal btrfs check, and post the output. > If normal check eats up all your memory, btrfs check --mode=lowmem. Does check without --repair eat less RAM? > --repair should be considered as the last method. If --repair doesn't work, check is useless to me sadly. I know that for FS analysis and bug reporting, you want to have the FS without changing it to something maybe worse, but for my use, if it can't be mounted and can't be fixed, then it gets deleted which is even worse than check doing the wrong thing. > > The last two ERROR lines took over a day to get generated, so I'm not sure > > if it's still working, but just slowly. > > OK, that explains something. > > One extent is referred hundreds times, no wonder it will take a long time. > > Just one tip here, there are really too many snapshots/reflinked files. > It's highly recommended to keep the number of snapshots to a reasonable > number (lower two digits). > Although btrfs snapshot is super fast, it puts a lot of pressure on its > extent tree, so there is no free lunch here. Agreed, I doubt I have over or much over 100 snapshots though (but I can't check right now). Sadly I'm not allowed to mount even read only while check is running: gargamel:~# mount -o ro /dev/mapper/dshelf2 /mnt/mnt2 mount: /dev/mapper/dshelf2 already mounted or /mnt/mnt2 busy > > I see. Is there any reasonably easy way to check on this running process? > > GDB attach would be good. > Interrupt and check the inode number if it's checking fs tree. > Check the extent bytenr number if it's checking extent tree. > > But considering how many snapshots there are, it's really hard to determine. > > In this case, the super large extent tree is causing a lot of problem, > maybe it's a good idea to allow btrfs check to skip extent tree check? I only see --init-extent-tree in the man page, which option did you have in mind? 
> > Then again, maybe it already fixed enough that I can mount my filesystem > > again. > > This needs the initial btrfs check report and the kernel messages how it > fails to mount.

mount command hangs, kernel does not show anything special outside of disk access hanging.

Jun 23 17:23:26 gargamel kernel: [ 341.802696] BTRFS warning (device dm-2): 'recovery' is deprecated, use 'usebackuproot' instead
Jun 23 17:23:26 gargamel kernel: [ 341.828743] BTRFS info (device dm-2): trying to use backup root at mount time
Jun 23 17:23:26 gargamel kernel: [ 341.850180] BTRFS info (device dm-2): disk space caching is enabled
Jun 23 17:23:26 gargamel kernel: [ 341.869014] BTRFS info (device dm-2): has skinny extents
Jun 23 17:23:26 gargamel kernel: [ 342.206289] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jun 23 17:26:26 gargamel kernel: [ 521.571392] BTRFS info (device dm-2): enabling ssd optimizations
Jun 23 17:55:58 gargamel kernel: [ 2293.914867] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
Jun 23 17:56:22 gargamel kernel: [ 2317.718406] BTRFS info (device dm-2): disk space caching is enabled
Jun 23 17:56:22 gargamel kernel: [ 2317.737277] BTRFS info (device dm-2): has skinny extents
Jun 23 17:56:22 gargamel kernel: [ 2318.069461] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jun 23 17:59:22 gargamel kernel: [ 2498.256167] BTRFS info (device dm-2): enabling ssd optimizations
Jun 23 18:05:23 gargamel kernel: [ 2859.107057] BTRFS info (device dm-2): disk space caching is enabled
Jun 23 18:05:23 gargamel kernel: [ 2859.125883] BTRFS info (device dm-2): has skinny extents
Jun 23 18:05:24 gargamel kernel: [ 2859.448018] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
Jun 23 18:08:23 gargamel kernel: [ 3039.023305] BTRFS info (device dm-2): enabling ssd optimizations
Jun 23 18:13:41 gargamel kernel: [ 3356.626037] perf: interrupt took too long (3143 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
Jun 23 18:17:23 gargamel kernel: [ 3578.937225] Process accounting resumed
Jun 23 18:33:47 gargamel kernel: [ 4563.356252] JFS: nTxBlock = 8192, nTxLock = 65536
Jun 23 18:33:48 gargamel kernel: [ 4563.446715] ntfs: driver 2.1.32 [Flags: R/W MODULE].
Jun 23 18:42:20 gargamel kernel: [ 5075.995254] INFO: task sync:20253 blocked for more than 120 seconds.
Jun 23 18:42:20 gargamel kernel: [ 5076.015729] Not tainted 4.17.2-amd64-preempt-sysrq-20180817 #1
Jun 23 18:42:20 gargamel kernel: [ 5076.036141] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 18:42:20 gargamel kernel: [ 5076.060637] sync D 0 20253 15327 0x20020080
Jun 23 18:42:20 gargamel kernel: [ 5076.078032] Call Trace:
Jun 23 18:42:20 gargamel kernel: [ 5076.086366] ? __schedule+0x53e/0x59b
Jun 23 18:42:20 gargamel kernel: [ 5076.098311] schedule+0x7f/0x98
Re: So, does btrfs check lowmem take days? weeks?
On 06/29/2018 01:28 PM, Marc MERLIN wrote: On Fri, Jun 29, 2018 at 01:07:20PM +0800, Qu Wenruo wrote: lowmem repair seems to be going still, but it's been days and -p seems to do absolutely nothing. I'm afraid you hit a bug in lowmem repair code. By all means, --repair shouldn't really be used unless you're pretty sure the problem is something btrfs check can handle. That's also why --repair is still marked as dangerous. Especially when it's combined with experimental lowmem mode. Understood, but btrfs got corrupted (by itself or not, I don't know) I cannot mount the filesystem read/write I cannot btrfs check --repair it since that code will kill my machine What do I have left? My filesystem is "only" 10TB or so, albeit with a lot of files. Unless you have tons of snapshots and reflinked (deduped) files, it shouldn't take so long. I may have a fair amount. gargamel:~# btrfs check --mode=lowmem --repair -p /dev/mapper/dshelf2 enabling repair mode WARNING: low-memory mode repair support is only partial Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d Fixed 0 roots. 
ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 3, have: 4 Created new chunk [18457780224000 1073741824] Delete backref in extent [84302495744 69632] ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 3, have: 4 Delete backref in extent [84302495744 69632] ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 181, have: 240 Delete backref in extent [125712527360 12214272] ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 68, have: 115 Delete backref in extent [125730848768 5111808] ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 68, have: 115 Delete backref in extent [125730848768 5111808] ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 115, have: 143 Delete backref in extent [125736914944 6037504] ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 115, have: 143 Delete backref in extent [125736914944 6037504] ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 302, have: 431 Delete backref in extent [129952120832 20242432] ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 356, have: 433 Delete backref in extent [129952120832 20242432] ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 161, have: 240 Delete backref in extent [134925357056 11829248] ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 162, have: 240 Delete backref 
in extent [134925357056 11829248] ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 170, have: 249 Delete backref in extent [147895111680 12345344] ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 172, have: 251 Delete backref in extent [147895111680 12345344] ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 348, have: 418 Delete backref in extent [150850146304 17522688] ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 555, have: 1449 Deleted root 2 item[156909494272, 178, 5476627808561673095] ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 556, have: 1452 Deleted root 2 item[156909494272, 178, 7338474132555182983] ERROR: file extent[374857 235184128] root 21872 owner 21872 backref lost Add one extent data backref [156909494272 55320576] ERROR: file extent[374857 235184128] root 22911 owner 22911 backref lost Add one extent data backref [156909494272 55320576] My bad. It's almost certainly a bug in the extent checking of lowmem mode, which was reported by Chris too. The extent check was wrong, so the repair did the wrong things. I have figured out that the bug is lowmem check can't deal with shared tree blocks in a reloc tree. The fix is simple; you can try the following repo: https://github.com/Damenly/btrfs-progs/tree/tmp1 Please run lowmem check without --repair first, to be sure whether your filesystem is fine. Though the bug and phenomenon are clear enough, before sending my patch I have to make a test image. I have spent a week studying btrfs balance but it seems a little
Re: So, does btrfs check lowmem take days? weeks?
On 2018-06-29 13:28, Marc MERLIN wrote: > On Fri, Jun 29, 2018 at 01:07:20PM +0800, Qu Wenruo wrote: >>> lowmem repair seems to be going still, but it's been days and -p seems >>> to do absolutely nothing. >> >> I'm afraid you hit a bug in lowmem repair code. >> By all means, --repair shouldn't really be used unless you're pretty >> sure the problem is something btrfs check can handle. >> >> That's also why --repair is still marked as dangerous. >> Especially when it's combined with experimental lowmem mode. > > Understood, but btrfs got corrupted (by itself or not, I don't know) > I cannot mount the filesystem read/write > I cannot btrfs check --repair it since that code will kill my machine > What do I have left? Just normal btrfs check, and post the output. If normal check eats up all your memory, btrfs check --mode=lowmem. --repair should be considered the last resort. > >>> My filesystem is "only" 10TB or so, albeit with a lot of files. >> >> Unless you have tons of snapshots and reflinked (deduped) files, it >> shouldn't take so long. > > I may have a fair amount. > gargamel:~# btrfs check --mode=lowmem --repair -p /dev/mapper/dshelf2 > enabling repair mode > WARNING: low-memory mode repair support is only partial > Checking filesystem on /dev/mapper/dshelf2 > UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d > Fixed 0 roots. 
> ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, > owner: 374857, offset: 3407872) wanted: 3, have: 4 > Created new chunk [18457780224000 1073741824] > Delete backref in extent [84302495744 69632] > ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, > owner: 374857, offset: 3407872) wanted: 3, have: 4 > Delete backref in extent [84302495744 69632] > ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, > owner: 374857, offset: 114540544) wanted: 181, have: 240 > Delete backref in extent [125712527360 12214272] > ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, > owner: 374857, offset: 126754816) wanted: 68, have: 115 > Delete backref in extent [125730848768 5111808] > ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, > owner: 374857, offset: 126754816) wanted: 68, have: 115 > Delete backref in extent [125730848768 5111808] > ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, > owner: 374857, offset: 131866624) wanted: 115, have: 143 > Delete backref in extent [125736914944 6037504] > ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, > owner: 374857, offset: 131866624) wanted: 115, have: 143 > Delete backref in extent [125736914944 6037504] > ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, > owner: 374857, offset: 148234240) wanted: 302, have: 431 > Delete backref in extent [129952120832 20242432] > ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, > owner: 374857, offset: 148234240) wanted: 356, have: 433 > Delete backref in extent [129952120832 20242432] > ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, > owner: 374857, offset: 180371456) wanted: 161, have: 240 > Delete backref in extent [134925357056 11829248] > ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, > owner: 
374857, offset: 180371456) wanted: 162, have: 240 > Delete backref in extent [134925357056 11829248] > ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, > owner: 374857, offset: 192200704) wanted: 170, have: 249 > Delete backref in extent [147895111680 12345344] > ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, > owner: 374857, offset: 192200704) wanted: 172, have: 251 > Delete backref in extent [147895111680 12345344] > ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, > owner: 374857, offset: 217653248) wanted: 348, have: 418 > Delete backref in extent [150850146304 17522688] > ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, > owner: 374857, offset: 235175936) wanted: 555, have: 1449 > Deleted root 2 item[156909494272, 178, 5476627808561673095] > ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, > owner: 374857, offset: 235175936) wanted: 556, have: 1452 > Deleted root 2 item[156909494272, 178, 7338474132555182983] > ERROR: file extent[374857 235184128] root 21872 owner 21872 backref lost > Add one extent data backref [156909494272 55320576] > ERROR: file extent[374857 235184128] root 22911 owner 22911 backref lost > Add one extent data backref [156909494272 55320576] > > The last two ERROR lines took over a day to get generated, so I'm not sure if > it's still working, but just slowly. OK, that explains something. One extent is referred to hundreds of times; no wonder it will take a long time. Just one tip here: there are really too many
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 01:35:06PM +0800, Su Yue wrote: > > It's hard to estimate, especially when every cross check involves a lot > > of disk IO. > > > > But at least, we could add such an indicator to show we're doing something. > > Maybe we can account all roots in the root tree first, before checking a > tree, report i/num_roots. So users can see whether the check is doing > something meaningful or is stuck in a dead loop. Sounds reasonable. Do you want to submit something to git master for btrfs-progs, so I can pull it and just run my btrfs check again? In the meantime, how sane does the output I just posted look? Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
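[Editor's note] The i/num_roots idea Su describes above — account all the roots first, then print a running counter while each one is checked — can be sketched roughly like this. This is a toy model in Python for illustration only; the real change would be C code inside btrfs-progs' fs-root checking loop, and every name below is invented:

```python
def check_fs_roots(root_ids, check_one, report=print):
    """Check every fs root, reporting 'i/num_roots' progress so a slow
    but advancing run is distinguishable from a dead loop."""
    num_roots = len(root_ids)            # account all roots up front
    errors = 0
    for i, root in enumerate(root_ids, 1):
        report(f"checking root {i}/{num_roots} (id {root})")
        errors += check_one(root)        # the IO-heavy per-root cross checks
    return errors

# Toy usage: three "roots", one of which reports an error.
progress = []
errors = check_fs_roots(
    [5, 21872, 22911],
    check_one=lambda r: 1 if r == 22911 else 0,
    report=progress.append,
)
```

Even this crude form answers Marc's question: the counter tells you whether you are looking at days or at a loop that never advances.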
Re: So, does btrfs check lowmem take days? weeks?
On 06/29/2018 01:07 PM, Qu Wenruo wrote: On 2018-06-29 12:27, Marc MERLIN wrote: Regular btrfs check --repair has a nice progress option. It wasn't perfect, but it showed something. But then it also takes all your memory quicker than the linux kernel can defend itself and reliably completely kills my 32GB server quicker than it can OOM anything. lowmem repair seems to be going still, but it's been days and -p seems to do absolutely nothing. I'm afraid you hit a bug in lowmem repair code. By all means, --repair shouldn't really be used unless you're pretty sure the problem is something btrfs check can handle. That's also why --repair is still marked as dangerous. Especially when it's combined with experimental lowmem mode. My filesystem is "only" 10TB or so, albeit with a lot of files. Unless you have tons of snapshots and reflinked (deduped) files, it shouldn't take so long. 2 things that come to mind 1) can lowmem have some progress working so that I know if I'm looking at days, weeks, or even months before it will be done? It's hard to estimate, especially when every cross check involves a lot of disk IO. But at least, we could add such an indicator to show we're doing something. Maybe we can account all roots in the root tree first, then report i/num_roots while checking each tree. So users can see whether the check is doing something meaningful or is stuck in a dead loop. Thanks, Su 2) non lowmem is more efficient obviously when it doesn't completely crash your machine, but could lowmem be given an amount of memory to use for caching, or maybe use some heuristics based on RAM free so that it's not so excruciatingly slow? IIRC a recent commit has added the ability. a5ce5d219822 ("btrfs-progs: extent-cache: actually cache extent buffers") That's already included in btrfs-progs v4.13.2. So it should be a dead loop which lowmem repair code can't handle. 
Thanks, Qu Thanks, Marc
Re: So, does btrfs check lowmem take days? weeks?
On Fri, Jun 29, 2018 at 01:07:20PM +0800, Qu Wenruo wrote: > > lowmem repair seems to be going still, but it's been days and -p seems > > to do absolutely nothing. > > I'm afraid you hit a bug in lowmem repair code. > By all means, --repair shouldn't really be used unless you're pretty > sure the problem is something btrfs check can handle. > > That's also why --repair is still marked as dangerous. > Especially when it's combined with experimental lowmem mode. Understood, but btrfs got corrupted (by itself or not, I don't know) I cannot mount the filesystem read/write I cannot btrfs check --repair it since that code will kill my machine What do I have left? > > My filesystem is "only" 10TB or so, albeit with a lot of files. > > Unless you have tons of snapshots and reflinked (deduped) files, it > shouldn't take so long. I may have a fair amount. gargamel:~# btrfs check --mode=lowmem --repair -p /dev/mapper/dshelf2 enabling repair mode WARNING: low-memory mode repair support is only partial Checking filesystem on /dev/mapper/dshelf2 UUID: 0f1a0c9f-4e54-4fa7-8736-fd50818ff73d Fixed 0 roots. 
ERROR: extent[84302495744, 69632] referencer count mismatch (root: 21872, owner: 374857, offset: 3407872) wanted: 3, have: 4 Created new chunk [18457780224000 1073741824] Delete backref in extent [84302495744 69632] ERROR: extent[84302495744, 69632] referencer count mismatch (root: 22911, owner: 374857, offset: 3407872) wanted: 3, have: 4 Delete backref in extent [84302495744 69632] ERROR: extent[125712527360, 12214272] referencer count mismatch (root: 21872, owner: 374857, offset: 114540544) wanted: 181, have: 240 Delete backref in extent [125712527360 12214272] ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 21872, owner: 374857, offset: 126754816) wanted: 68, have: 115 Delete backref in extent [125730848768 5111808] ERROR: extent[125730848768, 5111808] referencer count mismatch (root: 22911, owner: 374857, offset: 126754816) wanted: 68, have: 115 Delete backref in extent [125730848768 5111808] ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 21872, owner: 374857, offset: 131866624) wanted: 115, have: 143 Delete backref in extent [125736914944 6037504] ERROR: extent[125736914944, 6037504] referencer count mismatch (root: 22911, owner: 374857, offset: 131866624) wanted: 115, have: 143 Delete backref in extent [125736914944 6037504] ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 21872, owner: 374857, offset: 148234240) wanted: 302, have: 431 Delete backref in extent [129952120832 20242432] ERROR: extent[129952120832, 20242432] referencer count mismatch (root: 22911, owner: 374857, offset: 148234240) wanted: 356, have: 433 Delete backref in extent [129952120832 20242432] ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 21872, owner: 374857, offset: 180371456) wanted: 161, have: 240 Delete backref in extent [134925357056 11829248] ERROR: extent[134925357056, 11829248] referencer count mismatch (root: 22911, owner: 374857, offset: 180371456) wanted: 162, have: 240 Delete backref 
in extent [134925357056 11829248] ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 21872, owner: 374857, offset: 192200704) wanted: 170, have: 249 Delete backref in extent [147895111680 12345344] ERROR: extent[147895111680, 12345344] referencer count mismatch (root: 22911, owner: 374857, offset: 192200704) wanted: 172, have: 251 Delete backref in extent [147895111680 12345344] ERROR: extent[150850146304, 17522688] referencer count mismatch (root: 21872, owner: 374857, offset: 217653248) wanted: 348, have: 418 Delete backref in extent [150850146304 17522688] ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 22911, owner: 374857, offset: 235175936) wanted: 555, have: 1449 Deleted root 2 item[156909494272, 178, 5476627808561673095] ERROR: extent[156909494272, 55320576] referencer count mismatch (root: 21872, owner: 374857, offset: 235175936) wanted: 556, have: 1452 Deleted root 2 item[156909494272, 178, 7338474132555182983] ERROR: file extent[374857 235184128] root 21872 owner 21872 backref lost Add one extent data backref [156909494272 55320576] ERROR: file extent[374857 235184128] root 22911 owner 22911 backref lost Add one extent data backref [156909494272 55320576] The last two ERROR lines took over a day to get generated, so I'm not sure if it's still working, but just slowly. For what it's worth non lowmem check used to take 12 to 24H on that filesystem back when it still worked. > > 2 things that come to mind > > 1) can lowmem have some progress working so that I know if I'm looking > > at days, weeks, or even months before it will be done? > > It's hard to estimate, especially when every cross check involves a lot > of disk IO. > But at least, we could add such indicator to show we're doing something. Yes, anything to show that I should still wait is still good :) > > 2) non lowmem
Re: So, does btrfs check lowmem take days? weeks?
On 2018-06-29 12:27, Marc MERLIN wrote: > Regular btrfs check --repair has a nice progress option. It wasn't > perfect, but it showed something. > > But then it also takes all your memory quicker than the linux kernel can > defend itself and reliably completely kills my 32GB server quicker than > it can OOM anything. > > lowmem repair seems to be going still, but it's been days and -p seems > to do absolutely nothing. I'm afraid you hit a bug in lowmem repair code. By all means, --repair shouldn't really be used unless you're pretty sure the problem is something btrfs check can handle. That's also why --repair is still marked as dangerous. Especially when it's combined with experimental lowmem mode. > > My filesystem is "only" 10TB or so, albeit with a lot of files. Unless you have tons of snapshots and reflinked (deduped) files, it shouldn't take so long. > > 2 things that come to mind > 1) can lowmem have some progress working so that I know if I'm looking > at days, weeks, or even months before it will be done? It's hard to estimate, especially when every cross check involves a lot of disk IO. But at least, we could add such an indicator to show we're doing something. > > 2) non lowmem is more efficient obviously when it doesn't completely > crash your machine, but could lowmem be given an amount of memory to use > for caching, or maybe use some heuristics based on RAM free so that it's > not so excruciatingly slow? IIRC a recent commit has added the ability. a5ce5d219822 ("btrfs-progs: extent-cache: actually cache extent buffers") That's already included in btrfs-progs v4.13.2. So it should be a dead loop which lowmem repair code can't handle. Thanks, Qu > > Thanks, > Marc
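[Editor's note] The commit Qu cites, a5ce5d219822 ("btrfs-progs: extent-cache: actually cache extent buffers"), means repeated reads of the same tree block can be served from memory instead of going back to disk each time a backref is cross-checked. A toy bounded-LRU model of that idea is below; the class and method names are invented for illustration and are not the btrfs-progs API:

```python
from collections import OrderedDict

class ExtentBufferCache:
    """Bounded LRU cache keyed by block start (bytenr): repeated
    cross-checks of the same tree block avoid re-reading the disk."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk
        self.buffers = OrderedDict()       # bytenr -> buffer, LRU order
        self.disk_reads = 0

    def get(self, bytenr):
        if bytenr in self.buffers:
            self.buffers.move_to_end(bytenr)   # cache hit: refresh LRU slot
            return self.buffers[bytenr]
        self.disk_reads += 1                   # cache miss: go to disk
        buf = self.read_from_disk(bytenr)
        self.buffers[bytenr] = buf
        if len(self.buffers) > self.capacity:
            self.buffers.popitem(last=False)   # evict least recently used
        return buf

# Toy usage: block 100 is re-read from cache once; block 200 gets
# evicted by 300 and must be read from disk again.
cache = ExtentBufferCache(capacity=2, read_from_disk=lambda b: f"block@{b}")
for bytenr in (100, 200, 100, 300, 200):
    cache.get(bytenr)
```

The capacity bound is the point of Marc's question: a cache like this trades a fixed amount of memory for fewer repeated reads, instead of the unbounded growth that OOMs the non-lowmem check.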
So, does btrfs check lowmem take days? weeks?
Regular btrfs check --repair has a nice progress option. It wasn't perfect, but it showed something. But then it also takes all your memory quicker than the linux kernel can defend itself and reliably completely kills my 32GB server quicker than it can OOM anything. lowmem repair seems to be going still, but it's been days and -p seems to do absolutely nothing. My filesystem is "only" 10TB or so, albeit with a lot of files. 2 things that come to mind 1) can lowmem have some progress working so that I know if I'm looking at days, weeks, or even months before it will be done? 2) non lowmem is more efficient obviously when it doesn't completely crash your machine, but could lowmem be given an amount of memory to use for caching, or maybe use some heuristics based on RAM free so that it's not so excruciatingly slow? Thanks, Marc