2019-04-08 15:09, Qu Wenruo:
Unfortunately, I didn't receive the last mail from Nik.
So I'm using the content from lore.kernel.org.
[122293.101782] BTRFS critical (device md0): corrupt leaf: root=1
block=1894009225216 slot=82, bad key order, prev (2034321682432 168
262144) current (2034318143488 168 962560)
Root=1 means it's the root tree; 168 means EXTENT_ITEM, which should only
occur in the extent tree, not in the root tree.
This means either the leaf owner is wrong or some tree blocks got totally
screwed up.
This is not easy to fix, if it can be fixed at all.
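For context, a minimal sketch (not the actual kernel code) of why that
check fires: keys in a leaf must be sorted by (objectid, type, offset),
and in the log above the objectid goes backwards between the previous
slot and slot 82. The numeric values match the kernel headers
(objectid 1 = root tree, key type 168 = EXTENT_ITEM); the struct and
comparison here are simplified stand-ins.

#include <stdint.h>
#include <stdio.h>

/* Simplified CPU-order key; on disk the fields are little-endian. */
struct key {
    uint64_t objectid;
    uint8_t  type;      /* 168 == EXTENT_ITEM */
    uint64_t offset;
};

/* Lexicographic (objectid, type, offset) comparison, the ordering the
 * leaf key check enforces. */
static int key_cmp(const struct key *a, const struct key *b)
{
    if (a->objectid != b->objectid)
        return a->objectid < b->objectid ? -1 : 1;
    if (a->type != b->type)
        return a->type < b->type ? -1 : 1;
    if (a->offset != b->offset)
        return a->offset < b->offset ? -1 : 1;
    return 0;
}

int main(void)
{
    struct key prev = { 2034321682432ULL, 168, 262144 };
    struct key curr = { 2034318143488ULL, 168, 962560 };

    /* prev > curr because its objectid is larger -> "bad key order" */
    if (key_cmp(&prev, &curr) >= 0)
        printf("bad key order\n");
    return 0;
}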
Would you please try this kernel branch and mount the filesystem with
"rescue=skip_bg,ro"?
https://github.com/adam900710/linux/tree/rescue_options
I think that's the last resort. Before that, you could try
btrfs-restore, which is purely user-space and should be easier to set up
than a custom kernel.
Thanks,
Qu
Tried "btrfs restore -vsxmi ..." (it did not work before my first
email), it is processing for at least 6 hours until now. It seems that
despite many error messages files are getting restored. As soon as it
finishes will check what is the result and give feedback. Will also test
the mentioned kernel branch.
Kind regards,
Nik.
--
On 2019/4/8 2:45 AM, Chris Murphy wrote:
On Sun, Apr 7, 2019 at 1:42 AM Nik. <bt...@avgustinov.eu> wrote:
2019-04-07 01:18, Qu Wenruo:
You have 2 bits flipped just in one tree block!
If the data-tree structures alone have so many bits flipped, how many
flipped bits are to be expected in the data itself? What should a normal
btrfs user do in order to prevent such disasters?
I think the corruption in your case was inferred by Btrfs only from bad
key ordering, not a csum failure for the leaf? I can't tell for sure
from the error, but I don't see a csum complaint.
I'd expect a RAM-caused corruption to affect the metadata leaf data
before the csum computation, and therefore no csum failure on a
subsequent read. Whereas if the corruption is storage-stack related,
we'd see a csum error on a subsequent read.
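To make that ordering concrete, here is a toy sketch (a stand-in
checksum, not the btrfs csum code): if the bytes are corrupted in RAM
before the csum is computed, the stored csum matches the corrupted data
and a later read verifies cleanly; if the corruption happens after
checksumming, somewhere in the storage stack, verification fails.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* FNV-1a as a stand-in checksum; any checksum shows the same effect. */
static uint32_t toy_csum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 2166136261u;
    for (size_t i = 0; i < len; i++)
        sum = (sum ^ buf[i]) * 16777619u;
    return sum;
}

int main(void)
{
    /* Case 1: bit flip in RAM first, csum computed afterwards. */
    uint8_t leaf1[64] = "metadata leaf contents";
    leaf1[5] ^= 0x04;                                  /* corruption ...  */
    uint32_t stored1 = toy_csum(leaf1, sizeof(leaf1)); /* ... then csum   */
    printf("RAM-first corruption: csum %s on read\n",
           stored1 == toy_csum(leaf1, sizeof(leaf1)) ? "matches" : "fails");

    /* Case 2: csum computed on good data, corruption happens later. */
    uint8_t leaf2[64] = "metadata leaf contents";
    uint32_t stored2 = toy_csum(leaf2, sizeof(leaf2));
    leaf2[5] ^= 0x04;                                  /* storage-stack flip */
    printf("Storage corruption: csum %s on read\n",
           stored2 == toy_csum(leaf2, sizeof(leaf2)) ? "matches" : "fails");
    return 0;
}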
Once there's corruption in a block address, the corruption can
propagate into anything else that depends on that block address even
if there isn't another corruption event. So one event, multiple
corruptions.
And another thing: if I am getting it right, it would have been more
reliable/appropriate to let btrfs manage the five disks behind md0 with
a raid1 profile instead of binding them into a RAID5 and "giving" just a
single device to btrfs.
Not necessarily. If corruption happens early enough, it gets baked
into all copies of the metadata.