Re: [BUG] Btrfs scrub sometime recalculate wrong parity in raid5

Qu Wenruo Wed, 21 Sep 2016 19:10:15 -0700


At 09/21/2016 11:13 PM, Chris Murphy wrote:

On Wed, Sep 21, 2016 at 3:15 AM, Qu Wenruo wrote:



At 09/21/2016 03:35 PM, Tomasz Torcz wrote:


On Wed, Sep 21, 2016 at 03:28:25PM +0800, Qu Wenruo wrote:


Hi,

For this well-known bug, is anyone fixing it?

Nothing is more frustrating than finding out that someone has already
worked on it after spending days digging.

BTW, since kernel scrub is somewhat broken for raid5/6, I'd like to
implement btrfsck scrub support, so that at least we can use btrfsck to
fix bad stripes before the kernel is fixed.



  Why wouldn't you fix in-kernel code?  Why implement duplicate
functionality when you can fix the root cause?

We'll fix in-kernel code.

The fsck one is not a duplicate; we need a better reference to compare
kernel behavior against.

Just like the qgroup fix in btrfsck: if the kernel can't handle something
well, we do need to fix the kernel, but a good offline fixer won't hurt.
(Btrfs-progs is much easier to implement in and gets a fast review/merge
cycle, and it can help us find a better solution before screwing the kernel
up again.)


I understand some things should go in fsck for comparison. But in this
case I don't see how it can help. Parity is not checksummed. The only
way to know if it's wrong is to read all of the data strips, compute
parity, and compare in-memory parity from current read to on-disk
parity.
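
The check described above (read every data strip, recompute parity, compare
with what is on disk) can be sketched roughly as follows. This is only a
user-space illustration of RAID5-style XOR parity, not btrfs code; the names
xor_parity and parity_matches and the tiny 4-byte strips are made up for the
example.

```python
from functools import reduce


def xor_parity(strips):
    """XOR equal-length data strips together, byte by byte (RAID5-style)."""
    if not strips:
        raise ValueError("need at least one data strip")
    length = len(strips[0])
    if any(len(s) != length for s in strips):
        raise ValueError("all strips must have the same length")
    # zip(*strips) yields one tuple of byte values per byte offset.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*strips))


def parity_matches(data_strips, on_disk_parity):
    """Recompute parity from the data strips and compare with the stored one."""
    return xor_parity(data_strips) == on_disk_parity


# Hypothetical stripe: three data strips plus one parity strip.
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xff\x00\xff\x00"]
good_parity = xor_parity(data)
# Flip one bit to simulate parity that was recalculated wrongly.
bad_parity = bytes([good_parity[0] ^ 0x01]) + good_parity[1:]

print(parity_matches(data, good_parity))
print(parity_matches(data, bad_parity))
```

Note that the comparison only says that data and parity disagree. Deciding
which side is wrong still needs the data checksums.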


That's what we plan to do.
And I don't see the need to csum the parity.
Why csum a csum again?

It takes a long time, and at least scrub is online, where
btrfsck scrub is not.

At least btrfsck scrub will work and be easier to implement, while kernel scrub doesn't.

The more important thing is that we can forget all about the complicated concurrency of online scrub and focus on the implementation itself in user space.

That is easier to implement and easier to maintain.

There is already an offline scrub in btrfs
check which doesn't repair, but also I don't know if it checks parity.

       --check-data-csum
           verify checksums of data blocks


Just as you expected, it doesn't check parity.

Even for RAID1/DUP, it won't check the backup if it succeeded in reading the first stripe.

The current implementation doesn't really care whether it's the data or the copy that is corrupted; as long as any copy can be read out, there is no problem.

The same thing applies to tree blocks.

So the ability to check every stripe/copy is still quite needed for that option.

And that's what I'm planning to enhance: make --check-data-csum equivalent to kernel scrub.
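
The difference between "stop at the first readable copy" and "check every
copy" can be sketched like this. It is a simplified user-space model, not the
btrfs-progs implementation: the helper names are hypothetical, and zlib.crc32
is used only as a stand-in checksum (btrfs defaults to crc32c, which is a
different polynomial).

```python
import zlib


def checksum(block):
    # Stand-in checksum for the example; NOT the crc32c that btrfs uses.
    return zlib.crc32(block)


def first_good_copy(copies, expected_csum):
    """Return the index of the first copy that verifies, or None.

    This models a check that is satisfied as soon as one copy reads fine
    and never looks at the remaining copies.
    """
    for index, block in enumerate(copies):
        if checksum(block) == expected_csum:
            return index
    return None


def bad_copies(copies, expected_csum):
    """Return the indices of every copy whose checksum does not match.

    This models a scrub-like check that verifies all copies.
    """
    return [i for i, block in enumerate(copies) if checksum(block) != expected_csum]


original = b"btrfs data block"
expected = checksum(original)
# Hypothetical RAID1/DUP extent: copy 0 is intact, copy 1 has one byte damaged.
copies = [original, b"btrfs dat\x00 block"]

print(first_good_copy(copies, expected))
print(bad_copies(copies, expected))
```

With these inputs the first function is satisfied by copy 0 and never notices
that copy 1 is damaged, while the second one reports it.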


           This expects that the filesystem is otherwise OK, so this is
           basically an offline scrub but does not repair data from
           spare copies.

Repair can be implemented, but it may just rewrite the same data into the same place. If that's a bad block, then it can't repair any further unless we can relocate the extent to another place.


Is it possible to put parities into their own tree? They'd be
checksummed there.


Personally speaking, this is quite a bad idea to me.
I prefer to separate different logical layers into their own code,
not mix them together.

Block-level things belong at the block level (RAID/chunk), logical things at the logical level (tree blocks).


The current btrfs csum design is already much, much better than pure RAID.

Just think of RAID1: when one copy is corrupted, which copy is the correct one?


Thanks,
Qu

Somehow I think the long term approach is that
partial stripe writes, which apparently are overwrites and not CoW,
need to go away. In particular I wonder what the metadata raid56 write
pattern is, if this usually means a lot of full stripe CoW writes, or
if there are many small metadata RMW changes that make them partial
stripe writes and not CoW and thus not safe.
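
To illustrate why a partial stripe (RMW) write is a concern, here is a small
sketch of the generic XOR parity update for a single-strip overwrite:
new parity = old parity XOR old data XOR new data. It is a textbook RAID5
model with made-up 2-byte strips and hypothetical helper names, not the btrfs
raid56 code.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("length mismatch")
    return bytes(x ^ y for x, y in zip(a, b))


def rmw_parity(old_parity, old_data, new_data):
    """Update parity for a one-strip overwrite without reading other strips."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


def full_parity(strips):
    """Recompute parity from every data strip in the stripe."""
    parity = bytes(len(strips[0]))
    for strip in strips:
        parity = xor_bytes(parity, strip)
    return parity


# Hypothetical stripe with three data strips.
strips = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = full_parity(strips)

# Overwrite the middle strip in place (a partial stripe write).
new_strip = b"\xaa\x55"
new_parity = rmw_parity(parity, strips[1], new_strip)
strips_after = [strips[0], new_strip, strips[2]]

# If both the data and the parity reach disk, the stripe is consistent.
print(full_parity(strips_after) == new_parity)
# If only the data reaches disk, the old parity no longer matches.
print(full_parity(strips_after) == parity)
```

The second comparison models the unsafe window: the data strip was overwritten
but the parity update was lost, so data and parity disagree.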


