At 09/21/2016 11:13 PM, Chris Murphy wrote:
On Wed, Sep 21, 2016 at 3:15 AM, Qu Wenruo <> wrote:

At 09/21/2016 03:35 PM, Tomasz Torcz wrote:

On Wed, Sep 21, 2016 at 03:28:25PM +0800, Qu Wenruo wrote:


Is anyone fixing this well-known bug?

Nothing is more frustrating than finding that someone has already worked on it
after spending days digging.

BTW, since kernel scrub is somewhat broken for raid5/6, I'd like to add
btrfsck scrub support, so at least we can use btrfsck to fix bad stripes
before the kernel fix.

  Why wouldn't you fix the in-kernel code?  Why implement duplicate
functionality when you can fix the root cause?

We'll fix the in-kernel code.

The fsck one is not a duplicate; we need a better reference to compare
kernel behavior against.

Just like the qgroup fix in btrfsck: if the kernel can't handle something well, we
do need to fix the kernel, but a good offline fixer won't hurt.
(Btrfs-progs is much easier to develop for and has a fast review/merge cycle,
and it can help us find a better solution before screwing the kernel up again.)

I understand some things should go in fsck for comparison. But in this
case I don't see how it can help. Parity is not checksummed. The only
way to know if it's wrong is to read all of the data strips, compute
parity, and compare in-memory parity from the current read to on-disk
parity.

That's what we plan to do.
And I don't see the need to csum the parity.
Why csum a csum again?
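
To make the check discussed above concrete (read all data strips, recompute parity, compare with the on-disk parity), here is a minimal sketch. It assumes simple RAID5-style XOR parity over equally sized strips held in memory; the function names are hypothetical and this is not code from btrfs-progs or the kernel.

```python
def xor_parity(strips):
    """Return the byte-wise XOR of equally sized data strips (RAID5-style P parity)."""
    if not strips:
        raise ValueError("need at least one data strip")
    length = len(strips[0])
    if any(len(s) != length for s in strips):
        raise ValueError("all strips must have the same length")
    parity = bytearray(length)
    for strip in strips:
        for i, byte in enumerate(strip):
            parity[i] ^= byte
    return bytes(parity)


def parity_matches(data_strips, on_disk_parity):
    """Recompute parity from the data strips and compare it with the stored parity."""
    return xor_parity(data_strips) == on_disk_parity
```

A mismatch alone only says that the stripe is inconsistent. Deciding whether the data or the parity is wrong still relies on the data checksums.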

It takes a long time, and at least scrub is online, whereas
btrfsck scrub is not.

At least btrfsck scrub will work and is easier to implement, while kernel scrub doesn't work.

More importantly, we can forget all about the complicated concurrency of online scrub and focus on the implementation itself in user space,
which is easier to implement and easier to maintain.

There is already an offline scrub in btrfs
check, which doesn't repair, and I don't know whether it checks parity.

           verify checksums of data blocks

Just as you expected, it doesn't check parity.
Even for RAID1/DUP, it won't check the backup copy if it succeeds in reading the first stripe.

The current implementation doesn't really care whether it's the data or the copy that is corrupted; as long as some copy can be read out, it sees no problem.
The same thing applies to tree blocks.

So the ability to check every stripe/copy is still very much needed for that option.
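
The difference between "stop at the first copy that reads fine" and "check every copy" can be sketched as follows. This is only an illustration: it uses `zlib.crc32` as a stand-in checksum (btrfs uses crc32c, which is a different polynomial), the copies are plain in-memory byte strings, and the function names are hypothetical.

```python
import zlib


def first_good_copy(copies, expected_csum):
    """Return the index of the first copy whose checksum matches, or None.

    Later copies are never examined once one copy verifies, which mirrors
    the behavior described above.
    """
    for index, data in enumerate(copies):
        if zlib.crc32(data) == expected_csum:
            return index
    return None


def bad_copies(copies, expected_csum):
    """Verify every copy and return the indices of those that fail the checksum."""
    return [
        index
        for index, data in enumerate(copies)
        if zlib.crc32(data) != expected_csum
    ]
```

With one good and one corrupted mirror, `first_good_copy` reports success from the first mirror, while `bad_copies` still flags the second one.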

And that's what I'm planning to enhance: make --check-data-csum equivalent to kernel scrub.

           This expects that the filesystem is otherwise OK, so this
           is basically an offline scrub but does not repair data
           from spare copies.

Repair can be implemented, but it may just rewrite the same data to the same place. If that's a bad block, it can't repair any further unless we can relocate the extent elsewhere.

Is it possible to put parities into their own tree? They'd be
checksummed there.

Personally speaking, this seems like quite a bad idea to me.
I prefer to keep different logical layers in their own code,
not mixed together.

Block-level things belong at the block level (RAID/chunk), logical things at the logical level (tree blocks).

The current btrfs csum design is already much, much better than pure RAID.
Just think of RAID1: if one copy is corrupted, which copy is the correct one?


Somehow I think the long term approach is that
partial stripe writes, which apparently are overwrites and not CoW,
need to go away. In particular I wonder what the metadata raid56 write
pattern is, if this usually means a lot of full stripe CoW writes, or
if there are many small metadata RMW changes that makes them partial
stripe writes and not CoW and thus not safe.
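
Why an interrupted partial stripe write is unsafe can be shown with a small simulation. This is a generic RAID5-style illustration with two data strips and XOR parity held in memory, not btrfs's actual write path; all names are hypothetical.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equally sized byte strings."""
    if len(a) != len(b):
        raise ValueError("strips must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def rebuild_after_interrupted_rmw(d0_old, d0_new, d1):
    """Simulate an in-place update of d0 that crashes before parity is rewritten.

    Returns what d1 would be reconstructed as if its device were lost afterwards.
    """
    stale_parity = xor_bytes(d0_old, d1)  # parity still matches the old d0
    on_disk_d0 = d0_new                   # the new data did reach the disk
    # Reconstruct d1 from the surviving strip and the stale parity.
    return xor_bytes(on_disk_d0, stale_parity)
```

In this sketch `d1` was never written to, yet it reconstructs incorrectly whenever `d0_new` differs from `d0_old`. A full stripe CoW write avoids this because the old stripe stays intact until the new one is complete.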
