On 2021/1/29 2:39 PM, Erik Jensen wrote:
On Mon, Jan 25, 2021 at 8:54 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
On Wed, Jan 20, 2021 at 1:08 AM Erik Jensen <erikjen...@rkjnsn.net> wrote:
On Wed, Jan 20, 2021 at 12:31 AM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
On 2021/1/20 4:21 PM, Qu Wenruo wrote:
On 2021/1/19 5:28 PM, Erik Jensen wrote:
On Mon, Jan 18, 2021 at 9:22 PM Erik Jensen <erikjen...@rkjnsn.net>
wrote:
On Mon, Jan 18, 2021 at 4:12 AM Erik Jensen <erikjen...@rkjnsn.net>
wrote:
The offending system is indeed ARMv7 (specifically a Marvell ARMADA®
388), but I believe the Broadcom BCM2835 in my Raspberry Pi is
actually ARMv6 (with hardware float support).
Using NBD, I have verified that I receive the same error when
attempting to mount the filesystem on my ARMv6 Raspberry Pi:
[ 3491.339572] BTRFS info (device dm-4): disk space caching is enabled
[ 3491.394584] BTRFS info (device dm-4): has skinny extents
[ 3492.385095] BTRFS error (device dm-4): bad tree block start, want
26207780683776 have 3395945502747707095
[ 3492.514071] BTRFS error (device dm-4): bad tree block start, want
26207780683776 have 3395945502747707095
[ 3492.553599] BTRFS warning (device dm-4): failed to read tree root
[ 3492.865368] BTRFS error (device dm-4): open_ctree failed
The Raspberry Pi is running Linux 5.4.83.
Okay, after some more testing, ARM seems to be irrelevant, and 32-bit
is the key factor. On a whim, I booted up an i686, 5.8.14 kernel in a
VM, attached the drives via NBD, ran cryptsetup, tried to mount, and…
I got the exact same error message.
My educated guess is that on 32-bit platforms we pass an incorrect sector
into the bio, which gives us garbage.
To prove that, you can use the bcc tools to verify it.
biosnoop can do that:
https://github.com/iovisor/bcc/blob/master/tools/biosnoop_example.txt
Just try mounting the fs with biosnoop running.
With "btrfs ins dump-tree -t chunk <dev>", we can manually calculate the
offset of each read to see if they match.
If they don't match, it would prove my assumption and give us a pretty
good clue for a fix.
Thanks,
Qu
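A rough sketch of the manual calculation described above, assuming the chunk holding the block uses the SINGLE or DUP profile, so that the mapping is a plain offset into one stripe. The `Chunk` type, the function names, and all numbers are made up for illustration; striped or parity RAID profiles need a more involved mapping, and a dm-crypt layer adds its own data offset when comparing against the underlying disk.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # block-layer sectors are 512 bytes


@dataclass
class Chunk:
    """Hypothetical holder for the few chunk-item fields this sketch needs."""
    logical_start: int   # logical address where the chunk begins
    length: int          # chunk length in bytes
    stripe_offset: int   # physical byte offset of the stripe on its device


def logical_to_physical(chunks, logical):
    """Map a logical address to a device byte offset (SINGLE/DUP assumption)."""
    for c in chunks:
        if c.logical_start <= logical < c.logical_start + c.length:
            return c.stripe_offset + (logical - c.logical_start)
    raise ValueError(f"no chunk covers logical address {logical}")


def read_matches(expected_bytes, observed_sector):
    """Compare the expected byte offset with a sector number seen in a trace."""
    return expected_bytes == observed_sector * SECTOR_SIZE


# Made-up example values, not taken from the filesystem in this thread.
chunks = [Chunk(logical_start=1 << 30, length=1 << 30, stripe_offset=5 << 30)]
expected = logical_to_physical(chunks, (1 << 30) + 16384)
print("expected byte offset:", expected)
print("matches observed sector:", read_matches(expected, expected // SECTOR_SIZE))
```

If the sector reported for the failing read does not correspond to the expected byte offset, that would point at the sector calculation on the 32-bit side.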
Is this bug happening only on this fs, or can any other btrfs also
trigger similar problems on 32-bit platforms?
Thanks,
Qu
I have only observed this error on this file system. Additionally, the
error mounting with the NAS only started after I did a `btrfs replace`
on all five 8TB drives using an x86_64 system. (Ironically, I did this
with the goal of making it faster to use the filesystem on the NAS by
re-encrypting the drives to use a cipher supported by my NAS's crypto
accelerator.)
Maybe this process of shuffling 40TB around caused some value in the
filesystem to increment to the point that a calculation using it
overflows on 32-bit systems?
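To make the overflow idea concrete, here is a small illustration of what happens when an index derived from the logged "want" address is kept in a 32-bit unsigned integer. It only shows the arithmetic; the 4 KiB page size is an assumption, and nothing here establishes which kernel calculation (if any) actually wraps. With 4 KiB pages, a 32-bit page index covers at most 2^32 * 4096 bytes = 16 TiB, and the logged address is above that.

```python
SECTOR_SIZE = 512       # block-layer sector size in bytes
PAGE_SIZE = 4096        # assumed page size
U32_MASK = 0xFFFFFFFF   # keeps only the low 32 bits, like a C u32


def truncate_u32(value: int) -> int:
    """Emulate storing a value in a 32-bit unsigned integer."""
    return value & U32_MASK


# "want" value from the dmesg output above. It is a logical address,
# used here only as a realistic magnitude.
bytenr = 26207780683776

sector = bytenr // SECTOR_SIZE
page_index = bytenr // PAGE_SIZE

print("sector     :", sector, "-> as u32:", truncate_u32(sector))
print("page index :", page_index, "-> as u32:", truncate_u32(page_index))
print("page index fits in 32 bits:", page_index <= U32_MASK)
```

Both derived values exceed 32 bits, so either would silently wrap if it passed through a 32-bit variable.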
I should be able to try biosnoop later this week, and I'll report back
with the results.
Okay, I tried running biosnoop, but I seem to be running into this
bug: https://github.com/iovisor/bcc/issues/3241 (That bug was reported
for cpudist, but I'm seeing the same error when I try to run
biosnoop.)
Anything else I can try?
Is it possible to add printks to retrieve the same data?
Sorry for the late reply, I've been busy testing the subpage patchset.
(And unfortunately not much progress.)
If bcc is not possible, you can still use ftrace events, but
unfortunately I didn't find a good enough one. (In fact, the trace events
for the block layer are pretty limited.)
You can try adding printk()s in blk_account_io_done() to
emulate what's done in the trace_req_completion() function of biosnoop.
The time delta is not important; we only need the device name, sector,
and length.
Thanks,
Qu
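Once such printk()s exist, the resulting dmesg lines can be checked against an expected byte offset with a short script. The sketch below assumes a purely hypothetical message format, `blkdbg dev=<name> sector=<n> bytes=<n>`; the prefix, field names, and sample lines are invented for illustration and are not real kernel output.

```python
import re

SECTOR_SIZE = 512  # block-layer sectors are 512 bytes

# Hypothetical printk format; adjust the pattern to whatever is actually printed.
LINE_RE = re.compile(
    r"blkdbg dev=(?P<dev>\S+) sector=(?P<sector>\d+) bytes=(?P<bytes>\d+)"
)


def parse_io_lines(text):
    """Return (device, sector, length_in_bytes) tuples found in dmesg text."""
    return [
        (m["dev"], int(m["sector"]), int(m["bytes"]))
        for m in LINE_RE.finditer(text)
    ]


def reads_covering(records, dev, byte_offset):
    """Return the records on `dev` whose byte range contains `byte_offset`."""
    hits = []
    for name, sector, length in records:
        start = sector * SECTOR_SIZE
        if name == dev and start <= byte_offset < start + length:
            hits.append((name, sector, length))
    return hits


# Made-up sample lines in the hypothetical format.
sample = """\
[  100.000001] blkdbg dev=sda sector=2048 bytes=16384
[  100.000002] blkdbg dev=sdb sector=4096 bytes=4096
"""

records = parse_io_lines(sample)
print(records)
print(reads_covering(records, "sda", 2048 * SECTOR_SIZE + 8192))
```

An empty result for the expected offset would suggest the read was issued to a different location than the chunk mapping predicts.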