On Wed, Sep 21, 2016 at 8:08 PM, Qu Wenruo <quwen...@cn.fujitsu.com> wrote:
>
> At 09/21/2016 11:13 PM, Chris Murphy wrote:
>> I understand some things should go in fsck for comparison. But in this
>> case I don't see how it can help. Parity is not checksummed. The only
>> way to know if it's wrong is to read all of the data strips, compute
>> parity, and compare the in-memory parity from the current read to the
>> on-disk parity.
>
> That's what we plan to do.
> And I don't see the necessity of csumming the parity.
> Why csum a csum again?

parity != csum
http://www.spinics.net/lists/linux-btrfs/msg56602.html

>> There is already an offline scrub in btrfs check which doesn't repair,
>> but I also don't know if it checks parity.
>>
>> --check-data-csum
>>     verify checksums of data blocks
>
> Just as you expected, it doesn't check parity.
> Even for RAID1/DUP, it won't check the backup if it succeeded in reading
> the first stripe.

Both copies are not scrubbed? Oh hell...

[chris@f24s ~]$ sudo btrfs scrub status /brick2
scrub status for 7fea4120-9581-43cb-ab07-6631757c0b55
        scrub started at Tue Sep 20 12:16:18 2016 and finished after 01:46:58
        total bytes scrubbed: 955.93GiB with 0 errors

How can this possibly be correct in saying 956GiB was scrubbed if it has
not checked both copies? That message says *all* the data, both copies,
were scrubbed. Are you saying that message is wrong? That it only scrubbed
half that amount?

[chris@f24s ~]$ sudo btrfs fi df /brick2
Data, RAID1: total=478.00GiB, used=477.11GiB
System, RAID1: total=32.00MiB, used=96.00KiB
Metadata, RAID1: total=2.00GiB, used=877.59MiB
GlobalReserve, single: total=304.00MiB, used=0.00B

While that scrub was happening, both drives were being accessed at 100%
throughput.

> The current implementation doesn't really care whether it's the data or
> the copy that is corrupted; as long as the data can be read out, there is
> no problem.
> The same thing applies to tree blocks.
>
> So the ability to check every stripe/copy is still quite needed for that
> option.
>
> And that's what I'm planning to enhance: make --check-data-csum
> equivalent to the kernel scrub.

OK, thanks.
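The parity check described above — read all the data strips, recompute parity, and compare it to the parity strip read from disk — can be sketched for the single-parity (RAID5-like) case. This is an illustrative Python sketch only, not btrfs code: it ignores stripe rotation, strip layout, and assumes the data strips were already verified against their csums, so a parity mismatch implicates the parity strip itself.

```python
# Illustrative sketch only (NOT btrfs code): offline verification of a
# single-parity stripe, where parity is the bytewise XOR of all data strips.
from functools import reduce


def xor_strips(strips):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))


def parity_matches(data_strips, on_disk_parity):
    """Recompute parity from the data strips and compare it to the
    parity strip as read from disk."""
    return xor_strips(data_strips) == on_disk_parity


# Example: three data strips plus one parity strip.
data = [bytes([1, 2, 3]), bytes([4, 5, 6]), bytes([7, 7, 7])]
good_parity = xor_strips(data)
print(parity_matches(data, good_parity))        # parity is consistent
print(parity_matches(data, b"\x00\x00\x00"))    # stale/corrupt parity detected
```

Because the data strips carry csums but the parity does not, this recompute-and-compare pass is the only way an offline tool can decide whether the on-disk parity is stale.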
>> This expects that the filesystem is otherwise OK, so this is basically
>> an offline scrub, but it does not repair data from spare copies.
>
> Repair can be implemented, but maybe just by rewriting the same data into
> the same place.
> If that's a bad block, then it can't repair any further unless we can
> relocate the extent to another place.

Any device that's out of reserve sectors and can no longer remap LBAs on
its own is a drive that needs to be decommissioned. It's a new feature in
just the last year or so that mdadm has a bad-blocks map so it can do what
the drive won't do, but I'm personally not a fan of keeping malfunctioning
drives in a RAID.

>> Is it possible to put parities into their own tree? They'd be
>> checksummed there.
>
> Personally speaking, this seems like quite a bad idea to me.
> I prefer to keep different logical layers separated in their own code,
> not mixed together.
>
> Block-level things belong at the block level (RAID/chunk); logical things
> belong at the logical level (tree blocks).

OK.

> The current btrfs csum design is already much, much better than pure RAID.
> Just think of RAID1: when one copy is corrupted, which copy is the
> correct one?

Yes.

-- 
Chris Murphy
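The RAID1 point above — that a csum lets you decide which mirror is the correct one, where plain RAID cannot — can be sketched as follows. This is a hypothetical illustration, not btrfs code: btrfs's default checksum is crc32c, while this sketch uses plain crc32 from Python's standard library as a stand-in, and `pick_good_copy` is an invented helper name.

```python
# Illustrative sketch only (NOT btrfs code): with a stored checksum, a
# mismatched RAID1 mirror pair is unambiguous -- pick the copy whose
# checksum matches. Plain RAID1 has no way to break the tie.
# Note: btrfs actually uses crc32c; zlib.crc32 here is just a stand-in.
import zlib


def pick_good_copy(copies, expected_crc):
    """Return (index, data) of the first copy whose crc32 matches the
    stored checksum, or None if every copy is corrupt."""
    for i, data in enumerate(copies):
        if zlib.crc32(data) == expected_crc:
            return i, data
    return None  # all copies bad: unrecoverable without parity/backup


stored = zlib.crc32(b"good data")
print(pick_good_copy([b"bit-rotted", b"good data"], stored))  # second copy wins
```

Without the checksum, a mirror mismatch only tells you *that* the copies differ, not *which* one to trust, which is the weakness of pure RAID1 that the quoted paragraph is pointing at.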