Possible deadlock when writing
I started having a host freeze randomly when running a 4.18 kernel. The host was stable when running 4.17.12. At first, it appeared that only IO was frozen, since I could run common commands that were likely cached in RAM and did not touch storage. Anything that did touch storage would freeze and I could not ctrl-c it.

I noticed today, when it happened with kernel 4.19.2, that backups were still running and that the backup app could still read from the backup snapshot subvol. It's possible that the backups can still proceed because the accesses are all read-only and the snapshot was mounted with noatime, so the backup process never triggers a write.

There are never any errors output to the console when this happens and nothing is logged. When I first encountered this back in Sept. I managed to record a few sysrq dumps and attached them to a Red Hat ticket. See links below.

https://bugzilla.redhat.com/show_bug.cgi?id=1627288
https://bugzilla.redhat.com/attachment.cgi?id=1482177

I do have several VMs running that have their image files nocow'd. Interestingly, all the VMs except one seem to be able to write just fine. The one that can't has frozen completely and is the one that regularly generates the most IO.

Any ideas on how to debug this further?

--Larkin
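For anyone trying to reproduce this, the sysrq dumps mentioned above can be captured with something along these lines; the log destination is just a placeholder, and a serial console or large kernel log buffer helps when the dump is long:

# echo 1 > /proc/sys/kernel/sysrq        # make sure sysrq functions are enabled
# echo w > /proc/sysrq-trigger           # dump tasks stuck in uninterruptible (D) sleep
# echo t > /proc/sysrq-trigger           # dump the state of all tasks
# dmesg > /root/sysrq-dump.txt           # or capture via journalctl -k / the serial console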
Re: Scrub aborts due to corrupt leaf
On 10/10/2018 10:51 PM, Chris Murphy wrote:

On Wed, Oct 10, 2018 at 8:12 PM, Larkin Lowrey wrote:

On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:

On 10/10/2018 07:44 PM, Chris Murphy wrote:

I'm pretty sure you have to umount, and then clear the space_cache with 'btrfs check --clear-space-cache=v1' and then do a one time mount with -o space_cache=v2.

The --clear-space-cache=v1 is optional, but recommended, if you are someone who does not like to keep accumulated cruft. The v2 mount (rw mount!!!) does not remove the v1 cache. If you just mount with v2, the v1 data keeps being there, doing nothing any more.

Theoretically I have the v2 space_cache enabled. After a clean umount...

# mount -onospace_cache /backups
[ 391.243175] BTRFS info (device dm-3): disabling free space tree
[ 391.249213] BTRFS error (device dm-3): cannot disable free space tree
[ 391.255884] BTRFS error (device dm-3): open_ctree failed

"free space tree" is the v2 space cache, and once enabled it cannot be disabled with the nospace_cache mount option. If you want to run with nospace_cache you'll need to clear it.

# mount -ospace_cache=v1 /backups/
mount: /backups: wrong fs type, bad option, bad superblock on /dev/mapper/Cached-Backups, missing codepage or helper program, or other error
[ 983.501874] BTRFS info (device dm-3): enabling disk space caching
[ 983.508052] BTRFS error (device dm-3): cannot disable free space tree
[ 983.514633] BTRFS error (device dm-3): open_ctree failed

You cannot go back and forth between v1 and v2. Once v2 is enabled, it's always used regardless of any mount option. You'll need to use btrfs check to clear the v2 cache if you want to use the v1 cache.

# btrfs check --clear-space-cache v1 /dev/Cached/Backups
Opening filesystem to check...
couldn't open RDWR because of unsupported option features (3).
ERROR: cannot open file system

You're missing the '=' symbol for the clear option, that's why it fails.

# btrfs check --clear-space-cache=v2 /dev/Cached/Backups
Opening filesystem to check...
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Clear free space cache v2
Segmentation fault (core dumped)

[ 109.686188] btrfs[2429]: segfault at 68 ip 555ff6394b1c sp 7ffcc4733ab0 error 4 in btrfs[555ff637c000+ca000]
[ 109.696732] Code: ff e8 68 ed ff ff 8b 4c 24 58 4d 8b 8f c7 01 00 00 4c 89 fe 85 c0 0f 44 44 24 40 45 31 c0 89 44 24 40 48 8b 84 24 90 00 00 00 <8b> 40 68 49 29 87 d0 00 00 00 6a 00 55 48 8b 54 24 18 48 8b 7c 24

That's btrfs-progs v4.17.1 on 4.18.12-200.fc28.x86_64.

I appreciate the help and advice from everyone who has contributed to this thread. At this point, unless there is something for the project to gain from tracking down this trouble, I'm just going to nuke the fs and start over.

--Larkin
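For the record, the sequence Chris describes would look roughly like this, assuming a btrfs-progs build where --clear-space-cache=v2 does not crash; the device and mount point are the ones from this thread:

# umount /backups
# btrfs check --clear-space-cache=v2 /dev/Cached/Backups
# mount -o space_cache=v1 /backups      # or nospace_cache, once the free space tree has been removed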
Re: Scrub aborts due to corrupt leaf
On 10/10/2018 7:55 PM, Hans van Kranenburg wrote:

On 10/10/2018 07:44 PM, Chris Murphy wrote:

I'm pretty sure you have to umount, and then clear the space_cache with 'btrfs check --clear-space-cache=v1' and then do a one time mount with -o space_cache=v2.

The --clear-space-cache=v1 is optional, but recommended, if you are someone who does not like to keep accumulated cruft. The v2 mount (rw mount!!!) does not remove the v1 cache. If you just mount with v2, the v1 data keeps being there, doing nothing any more.

Theoretically I have the v2 space_cache enabled. After a clean umount...

# mount -onospace_cache /backups
[ 391.243175] BTRFS info (device dm-3): disabling free space tree
[ 391.249213] BTRFS error (device dm-3): cannot disable free space tree
[ 391.255884] BTRFS error (device dm-3): open_ctree failed

# mount -ospace_cache=v1 /backups/
mount: /backups: wrong fs type, bad option, bad superblock on /dev/mapper/Cached-Backups, missing codepage or helper program, or other error
[ 983.501874] BTRFS info (device dm-3): enabling disk space caching
[ 983.508052] BTRFS error (device dm-3): cannot disable free space tree
[ 983.514633] BTRFS error (device dm-3): open_ctree failed

# btrfs check --clear-space-cache v1 /dev/Cached/Backups
Opening filesystem to check...
couldn't open RDWR because of unsupported option features (3).
ERROR: cannot open file system

# btrfs --version
btrfs-progs v4.17.1

# mount /backups/
[ 1036.840637] BTRFS info (device dm-3): using free space tree
[ 1036.846272] BTRFS info (device dm-3): has skinny extents
[ 1036.999456] BTRFS info (device dm-3): bdev /dev/mapper/Cached-Backups errs: wr 0, rd 0, flush 0, corrupt 666, gen 25
[ 1043.025076] BTRFS info (device dm-3): enabling ssd optimizations

Backups will run tonight and will beat on the FS. Perhaps if something interesting happens I'll have more log data.

--Larkin
Re: Scrub aborts due to corrupt leaf
On 10/10/2018 2:20 PM, Holger Hoffstätte wrote:

On 10/10/18 19:25, Larkin Lowrey wrote:

On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:

On 10/10/18 17:44, Larkin Lowrey wrote:

(..)

About once a week, or so, I'm running into the above situation where the FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I have to hard reboot the system to recover. There are no error indications except for the following, which occurs well before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of free space
BTRFS warning (device dm-3): failed to load free space cache for block group 78691883286528, rebuilding it now

Do I have any options other than nuking the FS and starting over?

Unmount cleanly & mount again with -o space_cache=v2.

It froze while unmounting. The attached zip is a stack dump captured via 'echo t > /proc/sysrq-trigger'. A second attempt after a hard reboot worked.

Trace says the free space cache writeout failed midway while the scsi device was resetting itself, and then went rrrghh. Probably managed to hit different blocks on the second attempt. So chances are your controller, disk or something else is broken, dying, or both. When things have settled and you have verified that r/o mounting works and is stable, try rescuing the data (when necessary) before scrubbing, dm-device-checking or whatever you have set up.

Interesting, because I do not see any indications of any other errors. The fs is backed by an mdraid array and the raid checks always pass with no mismatches, edac-util doesn't report any ECC errors, smartd doesn't report any SMART errors, and I never see any raid controller errors. I have the console connected through serial to a logging console server, so if there were errors reported I would have seen them.

--Larkin
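The hardware checks described in the last paragraph can also be run on demand; a rough sketch, with the md device and disk names as placeholders:

# cat /sys/block/md0/md/mismatch_cnt           # result of the last md consistency check
# echo check > /sys/block/md0/md/sync_action   # kick off a fresh raid check
# edac-util --report=full                      # ECC error counters
# smartctl -a /dev/sda                         # per-disk SMART health and error log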
Re: Scrub aborts due to corrupt leaf
On 10/10/2018 12:04 PM, Holger Hoffstätte wrote:

On 10/10/18 17:44, Larkin Lowrey wrote:

(..)

About once a week, or so, I'm running into the above situation where the FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I have to hard reboot the system to recover. There are no error indications except for the following, which occurs well before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of free space
BTRFS warning (device dm-3): failed to load free space cache for block group 78691883286528, rebuilding it now

Do I have any options other than nuking the FS and starting over?

Unmount cleanly & mount again with -o space_cache=v2.

It froze while unmounting. The attached zip is a stack dump captured via 'echo t > /proc/sysrq-trigger'. A second attempt after a hard reboot worked.

--Larkin

<>
Re: Scrub aborts due to corrupt leaf
On 9/11/2018 11:23 AM, Larkin Lowrey wrote:

On 8/29/2018 1:32 AM, Qu Wenruo wrote:

On 2018/8/28 9:56 PM, Chris Murphy wrote:

On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote:

On 2018/8/28 9:29 PM, Larkin Lowrey wrote:

On 8/27/2018 10:12 PM, Larkin Lowrey wrote:

On 8/27/2018 12:46 AM, Qu Wenruo wrote:

The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway.

So it should not be the memory problem. BTW, what's the current generation of the fs?

# btrfs inspect dump-super | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the corruption happened.

generation 358392
chunk_root_generation 357256
cache_generation 358392
uuid_tree_generation 358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been more than a year. I am running 'btrfs check --init-csum-tree' now. Hopefully that clears everything up.

No such luck:

Creating a new CRC tree
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
csum result is 0 for block 2412149436416
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28

It's ENOSPC, meaning btrfs can't find enough space for the new csum tree blocks.

Seems bogus, there's >4TiB unallocated.

What a shame. Btrfs won't try to allocate a new chunk if we're allocating new tree blocks for metadata trees (extent, csum, etc). One quick (and dirty) way to avoid such limitation is to use the following patch <>

No luck.

# ./btrfs check --init-csum-tree /dev/Cached/Backups
Creating a new CRC tree
Opening filesystem to check...
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
Segmentation fault (core dumped)

btrfs[16575]: segfault at 7ffc4f74ef60 ip 0040d4c3 sp 7ffc4f74ef50 error 6 in btrfs[40+bf000]

# ./btrfs --version
btrfs-progs v4.17.1

I cloned btrfs-progs from git and applied your patch.

BTW, I've been having tons of trouble with two hosts after updating from kernel 4.17.12 to 4.17.14 and beyond. The fs will become unresponsive and all processes will end up stuck waiting on io. The system will end up totally idle but unable to perform any io on the filesystem. So far things have been stable after reverting back to 4.17.12. It looks like there was a btrfs change in 4.17.13. Could that be related to this csum tree corruption?

About once a week, or so, I'm running into the above situation where the FS seems to deadlock. All IO to the FS blocks, there is no IO activity at all. I have to hard reboot the system to recover. There are no error indications except for the following, which occurs well before the FS freezes up:

BTRFS warning (device dm-3): block group 78691883286528 has wrong amount of free space
BTRFS warning (device dm-3): failed to load free space cache for block group 78691883286528, rebuilding it now

Do I have any options other than nuking the FS and starting over?

--Larkin
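As an aside, when a v1 free space cache block group is reported with the wrong amount of free space, the cache can be thrown away and rebuilt with a one-time mount option; a sketch, assuming the filesystem is still using the v1 cache:

# umount /backups
# mount -o clear_cache /backups     # invalidates the v1 free space cache; it is rebuilt as block groups are used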
Re: Scrub aborts due to corrupt leaf
On 8/29/2018 1:32 AM, Qu Wenruo wrote:

On 2018/8/28 9:56 PM, Chris Murphy wrote:

On Tue, Aug 28, 2018 at 7:42 AM, Qu Wenruo wrote:

On 2018/8/28 9:29 PM, Larkin Lowrey wrote:

On 8/27/2018 10:12 PM, Larkin Lowrey wrote:

On 8/27/2018 12:46 AM, Qu Wenruo wrote:

The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway.

So it should not be the memory problem. BTW, what's the current generation of the fs?

# btrfs inspect dump-super | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the corruption happened.

generation 358392
chunk_root_generation 357256
cache_generation 358392
uuid_tree_generation 358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been more than a year. I am running 'btrfs check --init-csum-tree' now. Hopefully that clears everything up.

No such luck:

Creating a new CRC tree
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
csum result is 0 for block 2412149436416
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28

It's ENOSPC, meaning btrfs can't find enough space for the new csum tree blocks.

Seems bogus, there's >4TiB unallocated.

What a shame. Btrfs won't try to allocate a new chunk if we're allocating new tree blocks for metadata trees (extent, csum, etc). One quick (and dirty) way to avoid such limitation is to use the following patch <>

No luck.

# ./btrfs check --init-csum-tree /dev/Cached/Backups
Creating a new CRC tree
Opening filesystem to check...
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
Segmentation fault (core dumped)

btrfs[16575]: segfault at 7ffc4f74ef60 ip 0040d4c3 sp 7ffc4f74ef50 error 6 in btrfs[40+bf000]

# ./btrfs --version
btrfs-progs v4.17.1

I cloned btrfs-progs from git and applied your patch.

BTW, I've been having tons of trouble with two hosts after updating from kernel 4.17.12 to 4.17.14 and beyond. The fs will become unresponsive and all processes will end up stuck waiting on io. The system will end up totally idle but unable to perform any io on the filesystem. So far things have been stable after reverting back to 4.17.12. It looks like there was a btrfs change in 4.17.13. Could that be related to this csum tree corruption?

--Larkin
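A quick way to confirm the ">4TiB unallocated" observation before retrying --init-csum-tree is to look at the allocation summary; a sketch using the mount point from this thread:

# btrfs filesystem usage /backups     # device size, allocated, and unallocated space
# btrfs filesystem df /backups        # per-profile data/metadata/system usage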
Re: Scrub aborts due to corrupt leaf
On 8/27/2018 10:12 PM, Larkin Lowrey wrote:

On 8/27/2018 12:46 AM, Qu Wenruo wrote:

The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway.

So it should not be the memory problem. BTW, what's the current generation of the fs?

# btrfs inspect dump-super | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the corruption happened.

generation 358392
chunk_root_generation 357256
cache_generation 358392
uuid_tree_generation 358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been more than a year. I am running 'btrfs check --init-csum-tree' now. Hopefully that clears everything up.

No such luck:

Creating a new CRC tree
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Reinitialize checksum tree
csum result is 0 for block 2412149436416
extent-tree.c:2764: alloc_tree_block: BUG_ON `ret` triggered, value -28
btrfs(+0x1da16)[0x55cc43796a16]
btrfs(btrfs_alloc_free_block+0x207)[0x55cc4379c177]
btrfs(+0x1602f)[0x55cc4378f02f]
btrfs(btrfs_search_slot+0xed2)[0x55cc43790be2]
btrfs(btrfs_csum_file_block+0x48f)[0x55cc437a213f]
btrfs(+0x55cef)[0x55cc437cecef]
btrfs(cmd_check+0xd49)[0x55cc437ddbc9]
btrfs(main+0x81)[0x55cc4378b4d1]
/lib64/libc.so.6(__libc_start_main+0xeb)[0x7f4717e6324b]
btrfs(_start+0x2a)[0x55cc4378b5ea]
Aborted (core dumped)

--Larkin
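If the abort leaves a core dump behind (on Fedora this is usually handled by systemd-coredump), a symbolized backtrace can be pulled out of it; a sketch, assuming debuginfo for btrfs-progs is installed:

# coredumpctl list btrfs     # find the most recent btrfs crash
# coredumpctl gdb btrfs      # open the core in gdb
(gdb) bt full                # full backtrace with local variables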
Re: Scrub aborts due to corrupt leaf
On 8/27/2018 12:46 AM, Qu Wenruo wrote:

The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway.

So it should not be the memory problem. BTW, what's the current generation of the fs?

# btrfs inspect dump-super | grep generation

The corrupted leaf has generation 2862, I'm not sure how recently the corruption happened.

generation 358392
chunk_root_generation 357256
cache_generation 358392
uuid_tree_generation 358392
dev_item.generation 0

I don't recall the last time I ran a scrub but I doubt it has been more than a year. I am running 'btrfs check --init-csum-tree' now. Hopefully that clears everything up.

Thank you for your help and advice,

--Larkin
Re: Scrub aborts due to corrupt leaf
On 8/26/2018 8:16 PM, Qu Wenruo wrote: Corrupted tree block bytenr matches with the number reported by kernel. You could provide the tree block dump for bytenr 7687860535296, and maybe we could find out what's going wrong and fix it manually. # btrfs ins dump-tree -b 7687860535296 Thank you for your reply. # btrfs ins dump-tree -b 7687860535296 /dev/Cached/Backups btrfs-progs v4.15.1 leaf free space ret -2002721201, leaf data size 16283, used 2002737484 nritems 319 leaf 7687860535296 items 319 free space -2002721201 generation 2862 owner 7 leaf 7687860535296 flags 0x1(WRITTEN) backref revision 1 fs uuid acff5096-1128-4b24-a15e-4ba04261edc3 chunk uuid 0d2fdb5d-00c0-41b3-b2ed-39a5e3bf98aa item 0 key (18446744073650847734 EXTENT_CSUM 8487178285056) itemoff 13211 itemsize 3072 range start 8487178285056 end 8487181430784 length 3145728 item 1 key (18446744073650880502 EXTENT_CSUM 8487174090752) itemoff 10139 itemsize 3072 range start 8487174090752 end 8487177236480 length 3145728 item 2 key (18446744073650913270 EXTENT_CSUM 8487167782912) itemoff 3251 itemsize 6888 range start 8487167782912 end 8487174836224 length 7053312 item 3 key (18446744073651011574 EXTENT_CSUM 8487166103552) itemoff 187 itemsize 3064 range start 8487166103552 end 8487169241088 length 3137536 item 4 key (58523648 UNKNOWN.0 4115587072) itemoff 0 itemsize 0 item 5 key (58523648 UNKNOWN.0 4115058688) itemoff 0 itemsize 0 item 6 key (58392576 UNKNOWN.0 4115050496) itemoff 0 itemsize 0 item 7 key (58392576 UNKNOWN.0 9160800976331685888) itemoff 1325803612 itemsize 1549669347 item 8 key (15706350841398176100 UNKNOWN.160 9836230374950416562) itemoff -507102832 itemsize -1565142843 item 9 key (16420776794030147775 UNKNOWN.139 1413404178631177347) itemoff 319666572 itemsize -2033238481 item 10 key (12490357187492557094 UNKNOWN.100 8703020161114007581) itemoff 1698374107 itemsize 427239449 item 11 key (10238910558655956878 UNKNOWN.145 13172984620675614213) itemoff -1386707845 itemsize -2094889124 item 12 key (14429452134272870167 UNKNOWN.47 5095274587264087555) itemoff -385621303 itemsize -1014793681 item 13 key (12392706351935785292 TREE_BLOCK_REF 17075682359779944300) itemoff 467435242 itemsize -1974352848 tree block backref item 14 key (9030638330689148475 UNKNOWN.146 16510052416438219760) itemoff -1329727247 itemsize -989772882 item 15 key (2557232588403612193 UNKNOWN.89 11359249297629415033) itemoff -1393664382 itemsize -222178533 item 16 key (16832668804185527807 UNKNOWN.190 12813564574805698827) itemoff -824350641 itemsize 113587270 item 17 key (17721977661761488041 UNKNOWN.133 65181195353232031) itemoff 1165455420 itemsize -11248999 item 18 key (17041494636387836535 UNKNOWN.146 659630272632027956) itemoff 1646352770 itemsize 188954807 item 19 key (4813797791329885851 UNKNOWN.147 2988230942665281926) itemoff 2034137186 itemsize 429359084 item 20 key (11925872190557602809 UNKNOWN.28 10017979389672184473) itemoff 198274722 itemsize 1654501802 item 21 key (18089916911465221293 UNKNOWN.215 130744227189807288) itemoff -938569572 itemsize -322594079 item 22 key (17582525817082834821 UNKNOWN.133 14298100207216235213) itemoff 997305640 itemsize 380205383 item 23 key (2509730330338250179 ORPHAN_ITEM 8415032273173690331) itemoff 1213495256 itemsize -1813460706 orphan item item 24 key (17657358590741059587 UNKNOWN.5 4198714773705203243) itemoff -690501330 itemsize -237182892 item 25 key (14784171376049469241 UNKNOWN.139 15453005915765327150) itemoff 1543890422 itemsize 2093403168 item 26 key (8296048569161577100 UNKNOWN.58 
12559616442258240580) itemoff 927535366 itemsize -620630864
item 27 key (14738413134752477244 SHARED_BLOCK_REF 90867799437527556) itemoff -629160915 itemsize 1418942359
shared block backref
item 28 key (17386064595326971933 SHARED_BLOCK_REF 1813311842215708701) itemoff 1401681450 itemsize -2016124808
shared block backref
item 29 key (12068018374989506977 UNKNOWN.160 1560146733122974605) itemoff -1145774613 itemsize -490403576
item 30 key (5611751644962296316 QGROUP_LIMIT 19245/207762978715732) itemoff -433607332 itemsize -854595036
Segmentation fault (core dumped)

Can I simply rebuild the csum tree (btrfs check --init-csum-tree)? The entire contents of the fs are back-up files that are hashed, so I can verify that the files are correct.

Please note that this corruption could be caused by bad ram or some old kernel bug. It's recommended to run a memtest if possible.

The system uses ECC memory and edac-util has not reported any errors. However, I will run a memtest anyway.

Thank you,

--Larkin
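Since the data set is fully hashed, one way to make sure --init-csum-tree would not paper over real data damage is to re-verify the backups against the stored hashes first; a sketch, where the manifest file name is hypothetical:

$ cd /backups
$ sha1sum -c backup-manifest.sha1 | grep -v ': OK$'    # print only files that fail verification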
Scrub aborts due to corrupt leaf
When I do a scrub it aborts about 10% of the way in due to:

corrupt leaf: root=7 block=7687860535296 slot=0, invalid key objectid for csum item, have 18446744073650847734 expect 18446744073709551606

The filesystem in question stores my backups and I have verified all of the backups, so I know all files that are supposed to be there are there and their hashes match. Backups run normally and everything seems to work fine; it's just the scrub that doesn't.

I tried:

# btrfs check --repair /dev/Cached/Backups
enabling repair mode
Checking filesystem on /dev/Cached/Backups
UUID: acff5096-1128-4b24-a15e-4ba04261edc3
Fixed 0 roots.
checking extents
leaf free space ret -2002721201, leaf data size 16283, used 2002737484 nritems 319
leaf free space ret -2002721201, leaf data size 16283, used 2002737484 nritems 319
leaf free space incorrect 7687860535296 -2002721201
bad block 7687860535296
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
block group 34028518375424 has wrong amount of free space
failed to load free space cache for block group 34028518375424
checking fs roots
root 5 inode 6784890 errors 1000, some csum missing
checking csums
there are no extents for csum range 6447630387207159216-6447630390115868080
csum exists for 6447630387207159216-6447630390115868080 but there is no extent record
there are no extents for csum range 763548178418734000-763548181428650928
csum exists for 763548178418734000-763548181428650928 but there is no extent record
there are no extents for csum range 10574442573086800664-10574442573732416280
csum exists for 10574442573086800664-10574442573732416280 but there is no extent record
ERROR: errors found in csum tree
found 73238589853696 bytes used, error(s) found
total csum bytes: 8117840900
total tree bytes: 34106834944
total fs tree bytes: 23289413632
total extent tree bytes: 1659682816
btree space waste bytes: 6020692848
file data blocks allocated: 73136347418624 referenced 73135917441024

Nothing changes; when I run the above command again the output is identical. I had been using space_cache v2 but reverted to nospace_cache to run the above.

Is there any way to clean this up?

kernel 4.17.14-202.fc28.x86_64
btrfs-progs v4.15.1

Label: none uuid: acff5096-1128-4b24-a15e-4ba04261edc3
Total devices 1 FS bytes used 66.61TiB
devid 1 size 72.77TiB used 68.03TiB path /dev/mapper/Cached-Backups

Data, single: total=67.80TiB, used=66.52TiB
System, DUP: total=40.00MiB, used=7.41MiB
Metadata, DUP: total=98.50GiB, used=95.21GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

BTRFS info (device dm-3): disk space caching is enabled
BTRFS info (device dm-3): has skinny extents
BTRFS info (device dm-3): bdev /dev/mapper/Cached-Backups errs: wr 0, rd 0, flush 0, corrupt 666, gen 25
BTRFS info (device dm-3): enabling ssd optimizations
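For reference, the scrub and the per-device error counters (the "corrupt 666" shown above) can be inspected like this; the mount point is the one from this thread:

# btrfs scrub start -Bd /backups    # -B stays in the foreground, -d reports per-device statistics
# btrfs scrub status /backups       # progress and error counts for a background scrub
# btrfs device stats /backups       # cumulative read/write/corruption/generation error counters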
Unmountable fs. No root for superblock generation
I am unable to mount one of my filesystems. The superblock thinks the latest generation is 2220927 but I can't seem to find a root with that number. I can find 2220926 and 2220928, but not 2220927. Is there anything that I can do to recover this FS?

# btrfs check /dev/Cached/Backups
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
Csum didn't match
Couldn't setup extent tree
Couldn't open file system

# btrfs-find-root -g 2220927 /dev/Cached/Backups
Couldn't setup extent tree
Couldn't setup device tree
Superblock thinks the generation is 2220927
Superblock thinks the level is 2
Found tree root at 159057884577792 gen 2220927 level 2
Well block 101489031790592(gen: 2220928 level: 2) seems good, but generation/level doesn't match, want gen: 2220927 level: 2

# btrfs check --tree-root 159057884577792 /dev/Cached/Backups
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 15284E33 wanted C8C5B54E
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
checksum verify failed on 159057884594176 found 472037C9 wanted 9ACDCCB4
Csum didn't match
Couldn't setup extent tree
Couldn't open file system

# btrfs check --tree-root 101489031790592 /dev/Cached/Backups
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
parent transid verify failed on 101489031790592 wanted 2220927 found 2220928
Ignoring transid failure
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
parent transid verify failed on 159057595138048 wanted 2220927 found 2220920
Ignoring transid failure
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
parent transid verify failed on 158652658122752 wanted 2220927 found 2220911
Ignoring transid failure
Checking filesystem on /dev/Cached/Backups
UUID: 1b213dfd-6486-47d8-8459-bc5825882023
checking extents
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
parent transid verify failed on 116329711550464 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
parent transid verify failed on 116325928206336 wanted 2220928 found 2220921
Ignoring transid failure
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
parent transid verify failed on 116329892970496 wanted 2220928 found 2220921
parent transid verify failed on 116329892970496 wanted 2220928
found 2220921 Ignoring transid failure parent transid verify failed on 116325929943040 wanted 2220928 found 2220921 parent transid verify failed on 116325929943040 wanted 2220928 found 2220921 parent transid verify failed on 116325929943040 wanted 2220928 found 2220921 parent transid verify failed on 116325929943040 wanted 2220928 found 2220921 Ignoring transid failure parent transid verify failed on 116325932679168 wanted 2220928 found 2220921 parent transid verify failed on 116325932679168 wanted 2220928 found 2220921 parent transid verify failed on 116325932679168 wanted 2220928 found 2220921 parent transid verify failed on 116325932679168 wanted 2220928 found 2220921 Ignoring transid failure parent transid verify failed on 116010673373184 wanted 2220928 found 2220921 parent transid verify failed on 116010673373184 wanted 2220928 found 2220921 parent transid verify failed on 116010673373184 wanted 2220928 found 2220921 parent transid verify failed on 116010673373184 wanted 2220928 found 2220921 Ignoring transid failure parent transid verify failed on 116329479405568 wanted 2220928 found 2220921 parent transid verify failed on 116329479405568 wanted 2220928 found 2220921 parent transid verify failed on 116329479405568 wanted 2220928 found 2220921 parent transid verify failed on 116329479405568 wanted 2220928 found 2220921 Ignoring transid failure parent transid verify failed on 116480660914176 wanted
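If the newest tree root is unreadable, it is sometimes possible to mount read-only from an older root, or to copy data out with btrfs restore pointed at one of the roots btrfs-find-root reported; a rough sketch using the gen 2220928 root found above (newer kernels use -o usebackuproot, older ones -o recovery, and the target directory is a placeholder):

# mkdir -p /mnt/recovery
# mount -o ro,usebackuproot /dev/Cached/Backups /mnt        # older kernels: -o ro,recovery
# btrfs restore -v -t 101489031790592 /dev/Cached/Backups /mnt/recovery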
Re: Heavy nocow'd VM image fragmentation
On 10/24/2014 10:28 PM, Duncan wrote:

Robert White posted on Fri, 24 Oct 2014 19:41:32 -0700 as excerpted:

On 10/24/2014 04:49 AM, Marc MERLIN wrote:

On Thu, Oct 23, 2014 at 06:04:43PM -0500, Larkin Lowrey wrote:

I have a 240GB VirtualBox vdi image that is showing heavy fragmentation (filefrag). The file was created in a dir that was chattr +C'd, the file was created via fallocate and the contents of the original image were copied into the file via dd. I verified that the image was +C.

To be honest, I have the same problem, and it's vexing:

If I understand correctly, when you take a snapshot the file goes into what I call 1COW mode.

Yes, but the OP said he hadn't snapshotted since creating the file, and MM's a regular who actually wrote much of the wiki documentation on raid56 modes, so he better know about the snapshotting problem too. So that can't be it. There's apparently a bug in some recent code, and it's not honoring the NOCOW even in normal operation, when it should be. (FWIW I'm not running any VMs or large DBs here, so I don't have nocow set on anything and can and do use autodefrag on all my btrfs. So I can't say one way or the other, personally.)

Correct, there were no snapshots during VM usage when the fragmentation occurred.

One unusual property of my setup is I have my fs on top of bcache. More specifically, the stack is md raid6 - bcache - lvm - btrfs. When the fs mounts it has the mount option 'ssd' due to the fact that bcache sets /sys/block/bcache0/queue/rotational to 0. Is there any reason why either the 'ssd' mount option or being backed by bcache could be responsible?

--Larkin
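For what it's worth, the 'ssd' heuristic can be checked and overridden without touching bcache; the mount point here is a placeholder:

# cat /sys/block/bcache0/queue/rotational     # bcache advertises 0, so btrfs auto-enables 'ssd'
# mount -o remount,nossd /mnt/vmstore         # or add nossd to the fstab entry to test without the ssd allocator behavior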
Heavy nocow'd VM image fragmentation
I have a 240GB VirtualBox vdi image that is showing heavy fragmentation (filefrag). The file was created in a dir that was chattr +C'd, the file was created via fallocate, and the contents of the original image were copied into the file via dd. I verified that the image was +C.

After initial creation there were about 2800 fragments, according to filefrag. That doesn't surprise me because this image took up about 60% of the free space. After an hour of light use the filefrag count was the same. But, after a day of heavy use, the count is now well over 600,000.

There were no snapshots during the period of use. The fs does not have compression enabled. These usual suspects don't apply in my case.

The process I used to copy the image to a noCOW image was:

fallocate -n -l $(stat --format %s old.vdi) new.vdi
dd if=old.vdi of=new.vdi conv=notrunc oflag=append bs=1M

Performance does seem much worse in the VM, but could it be that the image isn't actually severely fragmented and I'm just misunderstanding the output from filefrag? Is there a problem with how I copied over the old image file?

--Larkin
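A couple of quick checks that might help distinguish real fragmentation from a filefrag misreading; the file name is the one from the post, and note that the +C flag only takes effect if it was set while the file was still empty (or inherited from the directory, as was done here):

$ lsattr new.vdi          # the 'C' attribute should be listed for the image file
$ filefrag -v new.vdi     # per-extent listing; the summary line gives the extent count and whether extents are contiguous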
btrfsck check infinite loop
I ran 'btrfs check --repair --init-extent-tree' and appear to be in an infinite loop. It performed heavy IO for about 1.5 hours, then the IO stopped and the CPU stayed at 100%. It's been like that for more than 12 hours now.

I made a hardware change last week that resulted in unstable RAM, so I suspect some corrupt data was written to disk. I tried mounting with -orecovery,clear_cache,nospace_cache but I would get a panic shortly thereafter. I tried 'btrfs check --repair' but also got a panic. I finally tried 'btrfs check --repair --init-extent-tree' and hit an assertion failed error with btrfs-progs 3.16. After noticing some promising commits, I built from the integration repo (kdave), re-ran (v3.16.1) and got further (2hrs) but then got stuck in this infinite loop.

Here's the backtrace of where it is now and has been for hours:

#0 0x00438f01 in free_some_buffers (tree=0xda3078) at extent_io.c:553
#1 __alloc_extent_buffer (blocksize=4096, bytenr=optimized out, tree=0xda3078) at extent_io.c:592
#2 alloc_extent_buffer (tree=0xda3078, bytenr=optimized out, blocksize=4096) at extent_io.c:671
#3 0x0042be29 in btrfs_find_create_tree_block (root=root@entry=0xda34a0, bytenr=optimized out, blocksize=optimized out) at disk-io.c:133
#4 0x0042d683 in read_tree_block (root=0xda34a0, bytenr=optimized out, blocksize=optimized out, parent_transid=161580) at disk-io.c:260
#5 0x00427c58 in read_node_slot (root=root@entry=0xda34a0, parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634
#6 0x00428558 in push_leaf_right (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, data_size=data_size@entry=67, empty=empty@entry=0) at ctree.c:1608
#7 0x00428e4c in split_leaf (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0, path=path@entry=0xde317a0, data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977
#8 0x0042aa54 in btrfs_search_slot (trans=0xe709b0, root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0, p=p@entry=0xde317a0, ins_len=ins_len@entry=67, cow=cow@entry=1) at ctree.c:1120
#9 0x0042af51 in btrfs_insert_empty_items (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, cpu_key=cpu_key@entry=0x7fff24da24b0, data_size=data_size@entry=0x7fff24da24a0, nr=nr@entry=1) at ctree.c:2412
#10 0x004175f6 in btrfs_insert_empty_item (data_size=42, key=0x7fff24da24b0, path=0xde317a0, root=0xda34a0, trans=0xe709b0) at ctree.h:2312
#11 record_extent (flags=0, allocated=optimized out, back=0x95cb3d90, rec=0x95cb3cc0, path=0xde317a0, info=0xda3010, trans=0xe709b0) at cmds-check.c:4438
#12 fixup_extent_refs (trans=trans@entry=0xe709b0, info=optimized out, extent_cache=extent_cache@entry=0x7fff24da2970, rec=rec@entry=0x95cb3cc0) at cmds-check.c:5287
#13 0x0041ac01 in check_extent_refs (extent_cache=0x7fff24da2970, root=optimized out, trans=optimized out) at cmds-check.c:5511
#14 check_chunks_and_extents (root=root@entry=0xfa7c70) at cmds-check.c:5978
#15 0x0041bdd9 in cmd_check (argc=optimized out, argv=optimized out) at cmds-check.c:6723
#16 0x00404481 in main (argc=4, argv=0x7fff24da2fe0) at btrfs.c:247

I checked node, node->next, node->next->next, node->next->prev, etc. and saw no obvious loop, at least not in the immediate vicinity of node. The value of node is different each time I check it.
I'll periodically see the following backtrace: #0 __list_del (next=0x1326fe820, prev=0xda3088) at list.h:113 #1 list_move_tail (head=0xda3088, list=0x1514b40f0) at list.h:183 #2 free_some_buffers (tree=0xda3078) at extent_io.c:560 #3 __alloc_extent_buffer (blocksize=4096, bytenr=optimized out, tree=0xda3078) at extent_io.c:592 #4 alloc_extent_buffer (tree=0xda3078, bytenr=optimized out, blocksize=4096) at extent_io.c:671 #5 0x0042be29 in btrfs_find_create_tree_block (root=root@entry=0xda34a0, bytenr=optimized out, blocksize=optimized out) at disk-io.c:133 #6 0x0042d683 in read_tree_block (root=0xda34a0, bytenr=optimized out, blocksize=optimized out, parent_transid=161580) at disk-io.c:260 #7 0x00427c58 in read_node_slot (root=root@entry=0xda34a0, parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634 #8 0x00428558 in push_leaf_right (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, data_size=data_size@entry=67, empty=empty@entry=0) at ctree.c:1608 #9 0x00428e4c in split_leaf (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0, path=path@entry=0xde317a0, data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977 #10 0x0042aa54 in btrfs_search_slot (trans=0xe709b0, root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0, p=p@entry=0xde317a0, ins_len=ins_len@entry=67, cow=cow@entry=1) at ctree.c:1120 #11 0x0042af51 in
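The backtraces above look like they were sampled from the live process; for reference, that kind of sampling can be done roughly as follows (the frame number depends on where the process happens to be stopped, and attaching pauses the process until you detach):

# gdb -p $(pidof btrfs)
(gdb) bt                        # where is it right now?
(gdb) frame 2                   # pick a frame where 'tree' is in scope, e.g. free_some_buffers
(gdb) print tree->cache_size
(gdb) detach
(gdb) quit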
Re: btrfsck check infinite loop
I noticed the following:

(gdb) print nrscan
$19 = 1680726970
(gdb) print tree->cache_size
$20 = 1073741824
(gdb) print cache_hard_max
$21 = 1073741824

It appears that cache_size cannot shrink below cache_hard_max, so we never end up breaking out of the loop. The FS in question is 30TB with ~26TB in use. Perhaps cache_hard_max (1GB) is too small for this size of FS? I just bumped it to 2GB and am re-running to see if that helps.

--Larkin

On 9/24/2014 9:27 AM, Larkin Lowrey wrote:

I ran 'btrfs check --repair --init-extent-tree' and appear to be in an infinite loop. It performed heavy IO for about 1.5 hours, then the IO stopped and the CPU stayed at 100%. It's been like that for more than 12 hours now.

I made a hardware change last week that resulted in unstable RAM, so I suspect some corrupt data was written to disk. I tried mounting with -orecovery,clear_cache,nospace_cache but I would get a panic shortly thereafter. I tried 'btrfs check --repair' but also got a panic. I finally tried 'btrfs check --repair --init-extent-tree' and hit an assertion failed error with btrfs-progs 3.16. After noticing some promising commits, I built from the integration repo (kdave), re-ran (v3.16.1) and got further (2hrs) but then got stuck in this infinite loop.

Here's the backtrace of where it is now and has been for hours:

#0 0x00438f01 in free_some_buffers (tree=0xda3078) at extent_io.c:553
#1 __alloc_extent_buffer (blocksize=4096, bytenr=optimized out, tree=0xda3078) at extent_io.c:592
#2 alloc_extent_buffer (tree=0xda3078, bytenr=optimized out, blocksize=4096) at extent_io.c:671
#3 0x0042be29 in btrfs_find_create_tree_block (root=root@entry=0xda34a0, bytenr=optimized out, blocksize=optimized out) at disk-io.c:133
#4 0x0042d683 in read_tree_block (root=0xda34a0, bytenr=optimized out, blocksize=optimized out, parent_transid=161580) at disk-io.c:260
#5 0x00427c58 in read_node_slot (root=root@entry=0xda34a0, parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634
#6 0x00428558 in push_leaf_right (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, data_size=data_size@entry=67, empty=empty@entry=0) at ctree.c:1608
#7 0x00428e4c in split_leaf (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, ins_key=ins_key@entry=0x7fff24da24b0, path=path@entry=0xde317a0, data_size=data_size@entry=67, extend=extend@entry=0) at ctree.c:1977
#8 0x0042aa54 in btrfs_search_slot (trans=0xe709b0, root=root@entry=0xda34a0, key=key@entry=0x7fff24da24b0, p=p@entry=0xde317a0, ins_len=ins_len@entry=67, cow=cow@entry=1) at ctree.c:1120
#9 0x0042af51 in btrfs_insert_empty_items (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path@entry=0xde317a0, cpu_key=cpu_key@entry=0x7fff24da24b0, data_size=data_size@entry=0x7fff24da24a0, nr=nr@entry=1) at ctree.c:2412
#10 0x004175f6 in btrfs_insert_empty_item (data_size=42, key=0x7fff24da24b0, path=0xde317a0, root=0xda34a0, trans=0xe709b0) at ctree.h:2312
#11 record_extent (flags=0, allocated=optimized out, back=0x95cb3d90, rec=0x95cb3cc0, path=0xde317a0, info=0xda3010, trans=0xe709b0) at cmds-check.c:4438
#12 fixup_extent_refs (trans=trans@entry=0xe709b0, info=optimized out, extent_cache=extent_cache@entry=0x7fff24da2970, rec=rec@entry=0x95cb3cc0) at cmds-check.c:5287
#13 0x0041ac01 in check_extent_refs (extent_cache=0x7fff24da2970, root=optimized out, trans=optimized out) at cmds-check.c:5511
#14 check_chunks_and_extents (root=root@entry=0xfa7c70) at cmds-check.c:5978
#15 0x0041bdd9 in cmd_check (argc=optimized out, argv=optimized out) at
cmds-check.c:6723 #16 0x00404481 in main (argc=4, argv=0x7fff24da2fe0) at btrfs.c:247 I checked node, node-next, node-next-next, node-next-prev, etc. and saw no obvious loop, at least not in the immediate vicinity of node. The value of node is different each time I check it. I'll periodically see the following backtrace: #0 __list_del (next=0x1326fe820, prev=0xda3088) at list.h:113 #1 list_move_tail (head=0xda3088, list=0x1514b40f0) at list.h:183 #2 free_some_buffers (tree=0xda3078) at extent_io.c:560 #3 __alloc_extent_buffer (blocksize=4096, bytenr=optimized out, tree=0xda3078) at extent_io.c:592 #4 alloc_extent_buffer (tree=0xda3078, bytenr=optimized out, blocksize=4096) at extent_io.c:671 #5 0x0042be29 in btrfs_find_create_tree_block (root=root@entry=0xda34a0, bytenr=optimized out, blocksize=optimized out) at disk-io.c:133 #6 0x0042d683 in read_tree_block (root=0xda34a0, bytenr=optimized out, blocksize=optimized out, parent_transid=161580) at disk-io.c:260 #7 0x00427c58 in read_node_slot (root=root@entry=0xda34a0, parent=parent@entry=0x165ab88c0, slot=slot@entry=43) at ctree.c:634 #8 0x00428558 in push_leaf_right (trans=trans@entry=0xe709b0, root=root@entry=0xda34a0, path=path
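For reference, rebuilding btrfs-progs from the integration repo with a local change, as described above, goes roughly like this; the repo URL and patch file name are assumptions, and depending on the progs version you may need ./autogen.sh && ./configure before make:

$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
$ cd btrfs-progs
$ git am /path/to/fix.patch        # or: patch -p1 < /path/to/fix.patch
$ make
# ./btrfs check --repair --init-extent-tree /dev/<device>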
Re: btrfs on bcache
I've been running two backup servers, with 25T and 20T of data, using btrfs on bcache (writeback) for about 7 months. I periodically run btrfs scrubs and backup verifies (SHA1 hashes) and have never had a corruption issue. My use of btrfs is simple, though, with no subvolumes and no btrfs-level raid. My bcache backing devices are LVM volumes that span multiple md raid6 arrays. So, either the bug has been fixed or my configuration is not susceptible.

I'm running kernel 3.15.5-200.fc20.x86_64.

--Larkin

On 7/30/2014 5:04 PM, dptr...@arcor.de wrote:

Concerning http://thread.gmane.org/gmane.comp.file-systems.btrfs/31018, does this bug still exist?

Kernel 3.14
B: 2x HDD 1 TB
C: 1x SSD 256 GB

# make-bcache -B /dev/sda /dev/sdb -C /dev/sdc --cache_replacement_policy=lru
# mkfs.btrfs -d raid1 -m raid1 -L BTRFS_RAID /dev/bcache0 /dev/bcache1

I still have no incomplete page write messages in dmesg | grep btrfs and the checksums of some manually reviewed files are okay. Who has more experience with this?

Thanks,
- dp
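When layering btrfs on bcache in writeback mode, the cache state and mode can be sanity-checked from sysfs before blaming either layer; the device name is a placeholder:

# cat /sys/block/bcache0/bcache/state         # expect "clean" or "dirty", not "inconsistent"
# cat /sys/block/bcache0/bcache/cache_mode    # the active mode is shown in brackets, e.g. [writeback]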