On Wed, Feb 3, 2021 at 10:16 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> On Sun, Jan 31, 2021 at 9:50 PM Su Yue <l...@damenly.su> wrote:
> > On Mon 01 Feb 2021 at 10:35, Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> > > On 2021/1/29 2:39 PM, Erik Jensen wrote:
> > >> On Mon, Jan 25, 2021 at 8:54 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> > >>> On Wed, Jan 20, 2021 at 1:08 AM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> > >>>> On Wed, Jan 20, 2021 at 12:31 AM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> > >>>>> On 2021/1/20 4:21 PM, Qu Wenruo wrote:
> > >>>>>> On 2021/1/19 5:28 PM, Erik Jensen wrote:
> > >>>>>>> On Mon, Jan 18, 2021 at 9:22 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> > >>>>>>>>
> > >>>>>>>> On Mon, Jan 18, 2021 at 4:12 AM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> > >>>>>>>>>
> > >>>>>>>>> The offending system is indeed ARMv7 (specifically a Marvell ARMADA® 388),
> > >>>>>>>>> but I believe the Broadcom BCM2835 in my Raspberry Pi is actually ARMv6
> > >>>>>>>>> (with hardware float support).
> > >>>>>>>>
> > >>>>>>>> Using NBD, I have verified that I receive the same error when
> > >>>>>>>> attempting to mount the filesystem on my ARMv6 Raspberry Pi:
> > >>>>>>>> [ 3491.339572] BTRFS info (device dm-4): disk space caching is enabled
> > >>>>>>>> [ 3491.394584] BTRFS info (device dm-4): has skinny extents
> > >>>>>>>> [ 3492.385095] BTRFS error (device dm-4): bad tree block start, want 26207780683776 have 3395945502747707095
> > >>>>>>>> [ 3492.514071] BTRFS error (device dm-4): bad tree block start, want 26207780683776 have 3395945502747707095
> > >>>>>>>> [ 3492.553599] BTRFS warning (device dm-4): failed to read tree root
> > >>>>>>>> [ 3492.865368] BTRFS error (device dm-4): open_ctree failed
> > >>>>>>>>
> > >>>>>>>> The Raspberry Pi is running Linux 5.4.83.
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> Okay, after some more testing, ARM seems to be irrelevant, and 32-bit
> > >>>>>>> is the key factor. On a whim, I booted up an i686, 5.8.14 kernel in a
> > >>>>>>> VM, attached the drives via NBD, ran cryptsetup, tried to mount, and…
> > >>>>>>> I got the exact same error message.
> > >>>>>>>
> > >>>>>> My educated guess is on 32bit platforms, we passed incorrect sector into
> > >>>>>> bio, thus gave us garbage.
> > >>>>>
> > >>>>> To prove that, you can use bcc tool to verify it.
> > >>>>> biosnoop can do that:
> > >>>>> https://github.com/iovisor/bcc/blob/master/tools/biosnoop_example.txt
> > >>>>>
> > >>>>> Just try mount the fs with biosnoop running.
> > >>>>> With "btrfs ins dump-tree -t chunk <dev>", we can manually calculate the
> > >>>>> offset of each read to see if they matches.
> > >>>>> If not match, it would prove my assumption and give us a pretty good
> > >>>>> clue to fix.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Qu
> > >>>>>
> > >>>>>>
> > >>>>>> Is this bug happening only on the fs, or any other btrfs can also
> > >>>>>> trigger similar problems on 32bit platforms?
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Qu
> > >>>>
> > >>>> I have only observed this error on this file system. Additionally, the
> > >>>> error mounting with the NAS only started after I did a `btrfs replace`
> > >>>> on all five 8TB drives using an x86_64 system. (Ironically, I did this
> > >>>> with the goal of making it faster to use the filesystem on the NAS by
> > >>>> re-encrypting the drives to use a cipher supported by my NAS's crypto
> > >>>> accelerator.)
> > >>>>
> > >>>> Maybe this process of shuffling 40TB around caused some value in the
> > >>>> filesystem to increment to the point that a calculation using it
> > >>>> overflows on 32-bit systems?
> > >>>>
> > >>>> I should be able to try biosnoop later this week, and I'll report back
> > >>>> with the results.
> > >>>
> > >>> Okay, I tried running biosnoop, but I seem to be running into this
> > >>> bug: https://github.com/iovisor/bcc/issues/3241 (That bug was reported
> > >>> for cpudist, but I'm seeing the same error when I try to run biosnoop.)
> > >>>
> > >>> Anything else I can try?
> > >>
> > >> Is it possible to add printks to retrieve the same data?
> > >>
> > > Sorry for the late reply, busying testing subpage patchset. (And
> > > unfortunately no much process).
> > >
> > > If bcc is not possible, you can still use ftrace events, but
> > > unfortunately I didn't find good enough one. (In fact, the trace events
> > > for block layer is pretty limited).
> > >
> > > You can try to add printk()s in function blk_account_io_done() to
> > > emulate what's done in function trace_req_completion() of biosnoop.
> > >
> > > The time delta is not important, we only need the device name, sector
> > > and length.
> > >
> >
> > Tips: There are ftrace events called block:block_rq_issue and
> > block:block_rq_complete to fetch that information. No need to
> > add printk().
> >
> > >
> > > Thanks,
> > > Qu
> >
>
> Okay, here's the output of the trace:
> https://gist.github.com/rkjnsn/4cf606874962b5a0284249b2f2e934f5
>
> And here's the output dump-tree:
> https://gist.github.com/rkjnsn/630b558eaf90369478d670a1cb54b40f
>
> One important note is that ftrace only captured requests at the
> underlying block device (nbd, in this case), not at the device mapper
> level. The encryption header on these drives is 16 MiB, so the offset
> reported in the trace will be 16777216 bytes larger than the offset
> btrfs was actually trying to read at the time.
>
> In case it's helpful, I believe this is the mapping of which
> (encrypted) nbd device node in the trace corresponds to which
> (decrypted) filesystem device:
> 43,0 33c75e20-26f2-4328-a565-5ef3484832aa
> 43,32 9bdfdb8f-abfb-47c5-90af-d360d754a958
> 43,64 39a9463d-65f5-499b-bca8-dae6b52eb729
> 43,96 f1174dea-ea10-42f2-96b4-4589a2980684
> 43,128 e669d804-6ea2-4516-8536-1d266f88ebad

What are the chances it's something simple like a long getting used somewhere in the code that should actually be a 64-bit int?