Re: [PATCH v4 00/20] Btrfs-progs offline scrub
> For anyone who wants to try it, it can be fetched from my repo:
> https://github.com/adam900710/btrfs-progs/tree/offline_scrub

While running the single-corruption script I receive messages like
"REPARIED: corrupted data with good P/Q, repaired", along with the
script's own message, "Parity stripe check passed."

Complete output is here:
https://github.com/Lakshmipathi/btrfs_offline_scrub/blob/master/logs-june18/single_corruption_misc-tests-results.txt

Multiple corruption:

While corrupting contiguous blocks I received:

>Filename=file256k.txt Total Stripes=4 Data Stripe to be corrupted=2,3
.. ..//debugfs-tree output ends
>ERROR: full stripe 145358848 CORRUPTED: too many read error or corrupted devices
>ERROR: full stripe 145358848: tolerance: 1, missing: 0, read error: 0, csum error: 2

While corrupting non-contiguous blocks:

>Filename=file512k.txt Total Stripes=8 Data Stripe to be corrupted=1,3,5
.. ..//debugfs-tree output ends
>full stripe 145227776 REPARIED: corrupted data with good P/Q, repaired
>full stripe 145358848 REPARIED: corrupted data with good P/Q, repaired
>full stripe 145489920 REPARIED: corrupted data with good P/Q, repaired

Output for other combinations, such as:

>Filename=file768k.txt Total Stripes=12 Data Stripe to be corrupted=10,8,6,4
>Filename=file1m.txt Total Stripes=16 Data Stripe to be corrupted=14,12,10,8,7,6
>Filename=file2m.txt Total Stripes=32 Data Stripe to be corrupted=23,22,21,20,19,18,16
>Filename=file4m.txt Total Stripes=64 Data Stripe to be corrupted=34,33,32,31,30,20,18,15,10,8,5
>Filename=file8m.txt Total Stripes=128 Data Stripe to be corrupted=100,90,80,70,60,50,40,30,20,10

can be found here:
https://github.com/Lakshmipathi/btrfs_offline_scrub/blob/master/logs-june18/multiple_corruptions_misc-tests-results.txt

The outputs look fine, but more testing is required with different file types :-)

Thanks.

Cheers,
Lakshmipathi.G
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
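
As a reading aid for the results above, here is a minimal Python sketch (not code from btrfs-progs) of how the logged verdicts line up with the corrupted stripe numbers. It rests on several assumptions: that the script's data stripe numbers are 0-based, that each full stripe holds two 64K data stripes (inferred from the 131072-byte spacing of the full stripe addresses in the log), that the tolerance is 1 as printed in the ERROR line, and that both test files start at full stripe 145227776. The function name and the `base` parameter are hypothetical.

```python
from collections import Counter

STRIPE_LEN = 64 * 1024   # RAID56 stripe length (64K)
DATA_STRIPES = 2         # assumption: inferred from the 128K address spacing
TOLERANCE = 1            # as printed in the "tolerance: 1" ERROR line


def classify_full_stripes(corrupted, base=0):
    """Group 0-based corrupted data stripe numbers by full stripe.

    Returns {full_stripe_start: (csum_errors, verdict)}. `base` is the
    logical start of the first full stripe (hypothetical parameter).
    """
    errors = Counter(n // DATA_STRIPES for n in corrupted)
    result = {}
    for index, count in sorted(errors.items()):
        start = base + index * DATA_STRIPES * STRIPE_LEN
        # A full stripe is repairable only while errors stay within tolerance.
        verdict = "REPAIRED" if count <= TOLERANCE else "CORRUPTED"
        result[start] = (count, verdict)
    return result


if __name__ == "__main__":
    base = 145227776  # first full stripe seen in the file512k.txt log
    print(classify_full_stripes([2, 3], base))
    print(classify_full_stripes([1, 3, 5], base))
```

Under these assumptions, stripes 2 and 3 fall into the same full stripe (two csum errors, beyond the tolerance), while stripes 1, 3 and 5 each land in a different full stripe with one error apiece, which agrees with the verdicts logged by the tool.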
Re: [PATCH v4 00/20] Btrfs-progs offline scrub
On Tue, May 30, 2017 at 08:54:32PM +0200, David Sterba wrote:
> On Thu, May 25, 2017 at 02:21:45PM +0800, Qu Wenruo wrote:
> > For anyone who wants to try it, it can be fetched from my repo:
> > https://github.com/adam900710/btrfs-progs/tree/offline_scrub
>
> > Qu Wenruo (20):
> >   btrfs-progs: raid56: Introduce raid56 header for later recovery usage
> >   btrfs-progs: raid56: Introduce tables for RAID6 recovery
> >   btrfs-progs: raid56: Allow raid6 to recover 2 data stripes
> >   btrfs-progs: raid56: Allow raid6 to recover data and p

Patches 1-5 applied, with small fixups here and there that took me more
time than I wanted. I'm still not sure whether I should bother you with
the coding style and grammar fixes. Maybe I should, as they become too
distracting, particularly in the rest of the patch series.

> >   btrfs-progs: Introduce wrapper to recover raid56 data
> >   btrfs-progs: Introduce new btrfs_map_block function which returns more
> >     unified result.
> >   btrfs-progs: Allow __btrfs_map_block_v2 to remove unrelated stripes
> >   btrfs-progs: csum: Introduce function to read out data csums
>
> I'm about to start merging these patches, in parts. First patches 1-8,
> as they're independent and not intrusive.
>
> >   btrfs-progs: scrub: Introduce structures to support offline scrub for
> >     RAID56
> >   btrfs-progs: scrub: Introduce functions to scrub mirror based tree
> >     block
> >   btrfs-progs: scrub: Introduce functions to scrub mirror based data
> >     blocks
> >   btrfs-progs: scrub: Introduce function to scrub one mirror-based
> >     extent
> >   btrfs-progs: scrub: Introduce function to scrub one data stripe
> >   btrfs-progs: scrub: Introduce function to verify parities
> >   btrfs-progs: extent-tree: Introduce function to check if there is any
> >     extent in given range.
> >   btrfs-progs: scrub: Introduce function to recover data parity
> >   btrfs-progs: scrub: Introduce helper to write a full stripe
> >   btrfs-progs: scrub: Introduce a function to scrub one full stripe
> >   btrfs-progs: scrub: Introduce function to check a whole block group
> >   btrfs-progs: scrub: Introduce offline scrub function
Re: [PATCH v4 00/20] Btrfs-progs offline scrub
On Thu, May 25, 2017 at 02:21:45PM +0800, Qu Wenruo wrote:
> For anyone who wants to try it, it can be fetched from my repo:
> https://github.com/adam900710/btrfs-progs/tree/offline_scrub

> Qu Wenruo (20):
>   btrfs-progs: raid56: Introduce raid56 header for later recovery usage
>   btrfs-progs: raid56: Introduce tables for RAID6 recovery
>   btrfs-progs: raid56: Allow raid6 to recover 2 data stripes
>   btrfs-progs: raid56: Allow raid6 to recover data and p
>   btrfs-progs: Introduce wrapper to recover raid56 data
>   btrfs-progs: Introduce new btrfs_map_block function which returns more
>     unified result.
>   btrfs-progs: Allow __btrfs_map_block_v2 to remove unrelated stripes
>   btrfs-progs: csum: Introduce function to read out data csums

I'm about to start merging these patches, in parts. First patches 1-8,
as they're independent and not intrusive.

>   btrfs-progs: scrub: Introduce structures to support offline scrub for
>     RAID56
>   btrfs-progs: scrub: Introduce functions to scrub mirror based tree
>     block
>   btrfs-progs: scrub: Introduce functions to scrub mirror based data
>     blocks
>   btrfs-progs: scrub: Introduce function to scrub one mirror-based
>     extent
>   btrfs-progs: scrub: Introduce function to scrub one data stripe
>   btrfs-progs: scrub: Introduce function to verify parities
>   btrfs-progs: extent-tree: Introduce function to check if there is any
>     extent in given range.
>   btrfs-progs: scrub: Introduce function to recover data parity
>   btrfs-progs: scrub: Introduce helper to write a full stripe
>   btrfs-progs: scrub: Introduce a function to scrub one full stripe
>   btrfs-progs: scrub: Introduce function to check a whole block group
>   btrfs-progs: scrub: Introduce offline scrub function
Re: [PATCH v4 00/20] Btrfs-progs offline scrub
On 2017-05-29 02:21, Qu Wenruo wrote:
> At 05/27/2017 02:37 AM, Goffredo Baroncelli wrote:
>> Hi Qu,
>>
>> On 2017-05-25 08:21, Qu Wenruo wrote:
>>
>>> And since kernel scrub doesn't account for P/Q corruption, it is quite
>>> hard for us to detect errors such as the kernel screwing up P/Q when
>>> scrubbing.
>>
>> could you elaborate on the above sentence? In my test the kernel scrub
>> (4.12.0-rc2) is able to correct a wrong 'P' parity; am I missing something?
>
> That's because Liu Bo and I exposed and fixed the bug.
>
> Otherwise the result would still be the same.
>
> Thanks,
> Qu

Ok, thanks for your clarification.

BR
G.Baroncelli

>> BR
>> G.Baroncelli
>> [...]

--
gpg @keyserver.linux.it: Goffredo Baroncelli
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
Re: [PATCH v4 00/20] Btrfs-progs offline scrub
At 05/27/2017 02:37 AM, Goffredo Baroncelli wrote:
> Hi Qu,
>
> On 2017-05-25 08:21, Qu Wenruo wrote:
>
>> And since kernel scrub doesn't account for P/Q corruption, it is quite
>> hard for us to detect errors such as the kernel screwing up P/Q when
>> scrubbing.
>
> could you elaborate on the above sentence? In my test the kernel scrub
> (4.12.0-rc2) is able to correct a wrong 'P' parity; am I missing something?

That's because Liu Bo and I exposed and fixed the bug.

Otherwise the result would still be the same.

Thanks,
Qu

> BR
> G.Baroncelli
> [...]
Re: [PATCH v4 00/20] Btrfs-progs offline scrub
Hi Qu,

On 2017-05-25 08:21, Qu Wenruo wrote:

> And since kernel scrub doesn't account for P/Q corruption, it is quite
> hard for us to detect errors such as the kernel screwing up P/Q when
> scrubbing.

could you elaborate on the above sentence? In my test the kernel scrub
(4.12.0-rc2) is able to correct a wrong 'P' parity; am I missing something?

BR
G.Baroncelli

[...]

--
gpg @keyserver.linux.it: Goffredo Baroncelli
Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5
[PATCH v4 00/20] Btrfs-progs offline scrub
For anyone who wants to try it, it can be fetched from my repo:
https://github.com/adam900710/btrfs-progs/tree/offline_scrub

Several reports of kernel scrub screwing up good data stripes have been
on the ML for some time.

And since kernel scrub doesn't account for P/Q corruption, it is quite
hard for us to detect errors such as the kernel screwing up P/Q when
scrubbing.

To have something to compare kernel scrub against, we need a user-space
tool to act as a benchmark, so that the behavior of the two can be
compared.

So here is the patchset for user-space scrub, which can do:

1) All mirror/backup check for non-parity based stripes
   This means that for RAID1/DUP/RAID10 we really check all mirrors,
   rather than stopping at the first good mirror.

   The current "--check-data-csum" option should eventually be replaced
   by offline scrub, as "--check-data-csum" doesn't really check all
   mirrors: once it hits a good copy, the remaining copies are simply
   ignored.

   In the v4 update, the data check is further improved. Inspired by the
   kernel behavior, a data extent is now checked sector by sector, so it
   can handle the following corruption case:

   Data extent A contains data from 0~28K.
   And |///| = corrupted, |   | = good

             0   4k  8k  12k 16k 20k 24k 28k
   Mirror 0  |///|   |///|   |///|   |   |
   Mirror 1  |   |///|   |///|   |///|   |

   Extent A should be RECOVERABLE, whereas v3 treated data extent A as a
   whole unit and reported the above case as CORRUPTED.

2) RAID5/6 full stripe check
   It makes full use of btrfs csums (both tree and data).
   It only recovers the full stripe if all recovered data matches its
   csum.

   NOTE: Due to the lack of good bitmap facilities, RAID56 sector by
   sector repair will be quite complex, especially when NODATASUM is
   involved.

   So the current RAID56 scrub doesn't support vertical sector recovery
   yet.

   Data extent A contains data from 0~64K.
   And |///| = corrupted, |   | = good

                  0   8K  16K 24K 32K 40K 48K 56K 64K
   Data stripe 0  |///|   |///|   |///|   |///|   |
   Data stripe 1  |   |///|   |///|   |///|   |///|
   Parity         |   |   |   |   |   |   |   |   |

   The kernel will recover it, while the current scrub will report it as
   CORRUPTED.

3) Repair
   In this v4 update, repair is finally added.

This patchset also introduces a new btrfs_map_block() function, which is
more flexible than the current btrfs_map_block() and has a unified
interface for all profiles, rather than just an extra array for RAID56.

Check the 6th and 7th patches for details.

They are already used in RAID5/6 scrub, but can be used for other
profiles too.

The to-do list has been shortened, since repair was added in the v4
update.

1) Test cases
   Need to make the infrastructure able to handle multi-device first.

2) Make btrfsck able to handle RAID5 with a missing device
   Currently it doesn't even open a RAID5 btrfs with a missing device,
   even though scrub should be able to handle it.

3) RAID56 vertical sector repair
   I consider this case minor compared to RAID1 vertical sector repair:
   for RAID1 an extent can be as large as 128M, while for RAID56 one
   stripe is always 64K, much smaller than the RAID1 case, which makes
   such corruption less likely.

   I prefer to add this function after the patchset gets merged, as no
   one really likes getting 20 mails every time I update the patchset.

For those who want to review the patchset, there is a slide showing the
basic function relationships. I hope it will reduce the time needed to
understand what the patchset is doing.
https://docs.google.com/presentation/d/1tAU3lUVaRUXooSjhFaDUeyW3wauHDSg9H-AiLBOSuIM/edit?usp=sharing

Qu Wenruo (20):
  btrfs-progs: raid56: Introduce raid56 header for later recovery usage
  btrfs-progs: raid56: Introduce tables for RAID6 recovery
  btrfs-progs: raid56: Allow raid6 to recover 2 data stripes
  btrfs-progs: raid56: Allow raid6 to recover data and p
  btrfs-progs: Introduce wrapper to recover raid56 data
  btrfs-progs: Introduce new btrfs_map_block function which returns more
    unified result.
  btrfs-progs: Allow __btrfs_map_block_v2 to remove unrelated stripes
  btrfs-progs: csum: Introduce function to read out data csums
  btrfs-progs: scrub: Introduce structures to support offline scrub for
    RAID56
  btrfs-progs: scrub: Introduce functions to scrub mirror based tree
    block
  btrfs-progs: scrub: Introduce functions to scrub mirror based data
    blocks
  btrfs-progs: scrub: Introduce function to scrub one mirror-based
    extent
  btrfs-progs: scrub: Introduce function to scrub one data stripe
  btrfs-progs: scrub: Introduce function to verify parities
  btrfs-progs: extent-tree: Introduce function to check if there is any
    extent in given range.
  btrfs-progs: scrub: Introduce function to recover data parity
  btrfs-progs: scrub: Introduce helper to write a full stripe
  btrfs-progs: scrub: Introduce a function to scrub one full stripe
  btrfs-progs: scrub: Introduce function to check a whole block group
  btrfs-progs: scrub: Introduce
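
The two corruption diagrams in the cover letter above (mirror-based extent A, and the RAID5 full stripe) can be modelled with a small Python sketch. This is only an illustration of the described behavior, not code from the patchset: every function name is hypothetical, each copy is reduced to a list of booleans (True = that sector fails its csum), and a RAID5 tolerance of 1 is assumed.

```python
def parse(row):
    """Turn a diagram row such as '|///|   |' into [True, False]."""
    return [cell == "///" for cell in row.strip("|").split("|")]


def mirrors_recoverable(mirrors):
    """Sector-by-sector check: every sector needs at least one good copy."""
    return all(not all(column) for column in zip(*mirrors))


def mirrors_recoverable_whole_extent(mirrors):
    """Whole-extent check: one mirror must be good for the entire extent."""
    return any(not any(mirror) for mirror in mirrors)


def raid56_recoverable_vertical(stripes, tolerance=1):
    """Per-column check: bad sectors per column must not exceed tolerance."""
    return all(sum(column) <= tolerance for column in zip(*stripes))


def raid56_recoverable_full_stripe(stripes, tolerance=1):
    """Full-stripe check: a stripe with any bad sector counts as one error."""
    return sum(any(stripe) for stripe in stripes) <= tolerance


if __name__ == "__main__":
    mirrors = [
        parse("|///|   |///|   |///|   |   |"),  # Mirror 0
        parse("|   |///|   |///|   |///|   |"),  # Mirror 1
    ]
    print(mirrors_recoverable(mirrors))
    print(mirrors_recoverable_whole_extent(mirrors))

    stripes = [
        parse("|///|   |///|   |///|   |///|   |"),  # Data stripe 0
        parse("|   |///|   |///|   |///|   |///|"),  # Data stripe 1
        parse("|   |   |   |   |   |   |   |   |"),  # Parity
    ]
    print(raid56_recoverable_vertical(stripes))
    print(raid56_recoverable_full_stripe(stripes))
```

In this model the mirror case passes the sector-by-sector check but fails the whole-extent check, and the RAID5 case passes the per-column check but fails the full-stripe check, which mirrors the RECOVERABLE versus CORRUPTED outcomes described in items 1) and 2).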