On 2021/2/11 6:17 AM, Erik Jensen wrote:
On Tue, Feb 9, 2021 at 9:47 PM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
[...]

Unfortunately I didn't get much useful info from the trace events,
as a lot of the values don't even make sense to me.

But the chunk tree dump proves to be more useful.

Firstly, the offending tree block doesn't even fall within any chunk range.

The offending tree block is 26207780683776, but the tree dump doesn't
have any range there.

The highest chunk is at 5958289850368 + 4294967296, still one digit
lower than the expected value.

I'm surprised we didn't even get any error for that, thus it may
indicate our chunk mapping is incorrect too.
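
As a rough sanity check of the comparison above, the arithmetic can be
sketched in Python. The variable names are made up for illustration; only
the three numbers come from this mail.

```python
# Numbers quoted in this mail; the variable names are illustrative only.
offending_block = 26207780683776
highest_chunk_start = 5958289850368
highest_chunk_len = 4294967296  # 4 GiB

# End of the highest chunk found in the chunk tree dump.
highest_chunk_end = highest_chunk_start + highest_chunk_len

# Distance between the end of that chunk and the offending tree block.
gap = offending_block - highest_chunk_end

print("highest chunk end:", highest_chunk_end)
print("block beyond highest chunk:", offending_block > highest_chunk_end)
print("decimal digits:", len(str(highest_chunk_end)), "vs", len(str(offending_block)))
```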

Would you please try the following diff on the 32bit system and report
back the dmesg?

The diff adds the following debug output:
- when we try to read one tree block
- when a bio is mapped to read device
- when a new chunk is added to chunk tree

Thanks,
Qu

Okay, here's the dmesg output from attempting to mount the filesystem:
https://gist.github.com/rkjnsn/914651efdca53c83199029de6bb61e20

I captured this on my 32-bit x86 VM, as it's much faster to rebuild
the kernel there than on my ARM board, and it fails with the same
error.


This is indeed much better.

The relevant lines are:

[   84.463147] read_one_chunk: chunk start=26207148048384 len=1073741824 num_stripes=2 type=0x14
[   84.463148] read_one_chunk:    stripe 0 phy=6477927415808 devid=5
[   84.463149] read_one_chunk:    stripe 1 phy=6477927415808 devid=4

Above is the chunk for the offending tree block.
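
A small sketch of why this chunk is the one that should cover the
offending tree block, using only the numbers from the log lines above.
The flag constants are the btrfs block group bits as I know them
(METADATA = 1 << 2, RAID1 = 1 << 4); the other names are illustrative.

```python
# Values taken from the read_one_chunk output above.
chunk_start = 26207148048384
chunk_len = 1073741824  # 1 GiB
chunk_type = 0x14
tree_block = 26207780683776

# Block group flag bits (btrfs on-disk format).
BLOCK_GROUP_METADATA = 1 << 2
BLOCK_GROUP_RAID1 = 1 << 4

chunk_end = chunk_start + chunk_len
in_chunk = chunk_start <= tree_block < chunk_end

# Offset of the tree block inside the chunk's logical range.
offset_in_chunk = tree_block - chunk_start

is_metadata = bool(chunk_type & BLOCK_GROUP_METADATA)
is_raid1 = bool(chunk_type & BLOCK_GROUP_RAID1)

print("tree block inside chunk:", in_chunk)
print("offset inside chunk:", offset_in_chunk)
print("metadata:", is_metadata, "raid1:", is_raid1)
```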

[   84.463724] read_extent_buffer_pages: eb->start=26207780683776 mirror=0
[   84.463731] submit_stripe_bio: rw 0 0x1000, phy=2118735708160 sector=4138155680 dev_id=3 size=16384
[   84.470793] BTRFS error (device dm-4): bad tree block start, want 26207780683776 have 3395945502747707095

But when the metadata read happens, the physical address and devid are
completely insane.

The chunk doesn't have dev 3 in it at all, but we still get the wrong
mapping.
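
To make the mismatch concrete, here is a minimal sketch comparing what
the chunk says against what was actually submitted. It assumes the usual
RAID1 layout, where each stripe holds a full copy and the physical
address is the stripe start plus the offset inside the chunk. All names
are illustrative; the numbers are from the log above.

```python
# Chunk covering the tree block (from read_one_chunk).
chunk_start = 26207148048384
stripes = [
    {"devid": 5, "phy": 6477927415808},
    {"devid": 4, "phy": 6477927415808},
]

# What was actually submitted (from submit_stripe_bio).
tree_block = 26207780683776
bio_devid = 3
bio_phy = 2118735708160
bio_sector = 4138155680

# Assumption: RAID1, so physical = stripe start + offset inside the chunk.
offset = tree_block - chunk_start
expected = [(s["devid"], s["phy"] + offset) for s in stripes]

devid_ok = bio_devid in {s["devid"] for s in stripes}
phy_ok = (bio_devid, bio_phy) in expected

# The logged sector matches phy in 512-byte units, so the bio itself is
# self-consistent; it is the mapping result that is wrong.
sector_matches_phy = bio_sector * 512 == bio_phy

print("expected (devid, phy):", expected)
print("devid in chunk:", devid_ok, "phy matches:", phy_ok)
print("sector consistent with phy:", sector_matches_phy)
```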

Furthermore, that physical address and devid belong to chunk 8614760677376,
which is a RAID5 data chunk.

So there is definitely something wrong in btrfs chunk mapping on 32bit.
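
Since the failure only shows up on 32bit, one cheap thing to check by
hand is which of the logged values would not survive being stored in a
32-bit integer. The sketch below is only arithmetic on numbers from the
log. It does not claim to show where the bug is, and both the helper
names and the 4 KiB page size are assumptions made for illustration.

```python
U32_MAX = (1 << 32) - 1

def fits_u32(value):
    """True if value can be stored in an unsigned 32-bit integer."""
    return 0 <= value <= U32_MAX

def truncate_u32(value):
    """What is left of value after truncation to 32 bits."""
    return value & U32_MAX

tree_block = 26207780683776   # logical bytenr from the log
chunk_start = 26207148048384  # chunk start from the log

# Assumption: 4 KiB pages, so a byte offset maps to page index offset >> 12.
PAGE_SHIFT = 12
page_index = tree_block >> PAGE_SHIFT

for name, value in [
    ("tree_block", tree_block),
    ("chunk_start", chunk_start),
    ("page_index", page_index),
]:
    print(name, value, "fits in u32:", fits_u32(value),
          "truncated:", truncate_u32(value))
```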

I'll craft a newer debug diff for you after I pin down what could be
wrong.

Thanks,
Qu
