On Wed, Feb 3, 2021 at 10:16 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> On Sun, Jan 31, 2021 at 9:50 PM Su Yue <l...@damenly.su> wrote:
> > On Mon 01 Feb 2021 at 10:35, Qu Wenruo <quwenruo.bt...@gmx.com>
> > wrote:
> > > On 2021/1/29 2:39 PM, Erik Jensen wrote:
> > >> On Mon, Jan 25, 2021 at 8:54 PM Erik Jensen
> > >> <erikjen...@rkjnsn.net> wrote:
> > >>> On Wed, Jan 20, 2021 at 1:08 AM Erik Jensen
> > >>> <erikjen...@rkjnsn.net> wrote:
> > >>>> On Wed, Jan 20, 2021 at 12:31 AM Qu Wenruo
> > >>>> <quwenruo.bt...@gmx.com> wrote:
> > >>>>> On 2021/1/20 4:21 PM, Qu Wenruo wrote:
> > >>>>>> On 2021/1/19 5:28 PM, Erik Jensen wrote:
> > >>>>>>> On Mon, Jan 18, 2021 at 9:22 PM Erik Jensen
> > >>>>>>> <erikjen...@rkjnsn.net>
> > >>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>> On Mon, Jan 18, 2021 at 4:12 AM Erik Jensen
> > >>>>>>>> <erikjen...@rkjnsn.net>
> > >>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>> The offending system is indeed ARMv7 (specifically a
> > >>>>>>>>> Marvell ARMADA®
> > >>>>>>>>> 388), but I believe the Broadcom BCM2835 in my Raspberry
> > >>>>>>>>> Pi is
> > >>>>>>>>> actually ARMv6 (with hardware float support).
> > >>>>>>>>
> > >>>>>>>> Using NBD, I have verified that I receive the same error
> > >>>>>>>> when
> > >>>>>>>> attempting to mount the filesystem on my ARMv6 Raspberry
> > >>>>>>>> Pi:
> > >>>>>>>> [ 3491.339572] BTRFS info (device dm-4): disk space
> > >>>>>>>> caching is enabled
> > >>>>>>>> [ 3491.394584] BTRFS info (device dm-4): has skinny
> > >>>>>>>> extents
> > >>>>>>>> [ 3492.385095] BTRFS error (device dm-4): bad tree block
> > >>>>>>>> start, want
> > >>>>>>>> 26207780683776 have 3395945502747707095
> > >>>>>>>> [ 3492.514071] BTRFS error (device dm-4): bad tree block
> > >>>>>>>> start, want
> > >>>>>>>> 26207780683776 have 3395945502747707095
> > >>>>>>>> [ 3492.553599] BTRFS warning (device dm-4): failed to
> > >>>>>>>> read tree root
> > >>>>>>>> [ 3492.865368] BTRFS error (device dm-4): open_ctree
> > >>>>>>>> failed
> > >>>>>>>>
> > >>>>>>>> The Raspberry Pi is running Linux 5.4.83.
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>> Okay, after some more testing, ARM seems to be irrelevant,
> > >>>>>>> and 32-bit
> > >>>>>>> is the key factor. On a whim, I booted up an i686, 5.8.14
> > >>>>>>> kernel in a
> > >>>>>>> VM, attached the drives via NBD, ran cryptsetup, tried to
> > >>>>>>> mount, and…
> > >>>>>>> I got the exact same error message.
> > >>>>>>>
> > >>>>>> My educated guess is that on 32-bit platforms we pass an
> > >>>>>> incorrect sector into
> > >>>>>> the bio, which gives us garbage.
> > >>>>>
> > >>>>> To prove that, you can use the bcc tools to verify it.
> > >>>>> biosnoop can do that:
> > >>>>> https://github.com/iovisor/bcc/blob/master/tools/biosnoop_example.txt
> > >>>>>
> > >>>>> Just try mounting the fs with biosnoop running.
> > >>>>> With "btrfs ins dump-tree -t chunk <dev>", we can manually
> > >>>>> calculate the
> > >>>>> offset of each read to see if they match.
> > >>>>> If they don't match, it would prove my assumption and give us a
> > >>>>> pretty good
> > >>>>> clue for a fix.
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Qu
> > >>>>>
> > >>>>>>
> > >>>>>> Is this bug happening only on this fs, or can any other btrfs
> > >>>>>> also
> > >>>>>> trigger similar problems on 32-bit platforms?
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Qu
> > >>>>
> > >>>> I have only observed this error on this file system.
> > >>>> Additionally, the
> > >>>> error mounting with the NAS only started after I did a `btrfs
> > >>>> replace`
> > >>>> on all five 8TB drives using an x86_64 system. (Ironically, I
> > >>>> did this
> > >>>> with the goal of making it faster to use the filesystem on
> > >>>> the NAS by
> > >>>> re-encrypting the drives to use a cipher supported by my
> > >>>> NAS's crypto
> > >>>> accelerator.)
> > >>>>
> > >>>> Maybe this process of shuffling 40TB around caused some value
> > >>>> in the
> > >>>> filesystem to increment to the point that a calculation using
> > >>>> it
> > >>>> overflows on 32-bit systems?
> > >>>>
> > >>>> I should be able to try biosnoop later this week, and I'll
> > >>>> report back
> > >>>> with the results.
> > >>>
> > >>> Okay, I tried running biosnoop, but I seem to be running into
> > >>> this
> > >>> bug: https://github.com/iovisor/bcc/issues/3241 (That bug was
> > >>> reported
> > >>> for cpudist, but I'm seeing the same error when I try to run
> > >>> biosnoop.)
> > >>>
> > >>> Anything else I can try?
> > >>
> > >> Is it possible to add printks to retrieve the same data?
> > >>
> > > Sorry for the late reply, I was busy testing the subpage patchset.
> > > (And unfortunately there is not much progress.)
> > >
> > > If bcc is not possible, you can still use ftrace events, but
> > > unfortunately I didn't find a good enough one. (In fact, the trace
> > > events
> > > for the block layer are pretty limited.)
> > >
> > > You can try to add printk()s in function blk_account_io_done()
> > > to
> > > emulate what's done in function trace_req_completion() of
> > > biosnoop.
> > >
> > > The time delta is not important, we only need the device name,
> > > sector
> > > and length.
> > >
> >
> > Tip: There are ftrace events called block:block_rq_issue and
> > block:block_rq_complete to fetch that information. No need to
> > add printk().
> >
> > >
> > > Thanks,
> > > Qu
> >
>
> Okay, here's the output of the trace:
> https://gist.github.com/rkjnsn/4cf606874962b5a0284249b2f2e934f5
>
> And here's the output dump-tree:
> https://gist.github.com/rkjnsn/630b558eaf90369478d670a1cb54b40f
>
> One important note is that ftrace only captured requests at the
> underlying block device (nbd, in this case), not at the device mapper
> level. The encryption header on these drives is 16 MiB, so the offset
> reported in the trace will be 16777216 bytes larger than the offset
> btrfs was actually trying to read at the time.
>
> In case it's helpful, I believe this is the mapping of which
> (encrypted) nbd device node in the trace corresponds to which
> (decrypted) filesystem device:
> 43,0    33c75e20-26f2-4328-a565-5ef3484832aa
> 43,32   9bdfdb8f-abfb-47c5-90af-d360d754a958
> 43,64   39a9463d-65f5-499b-bca8-dae6b52eb729
> 43,96   f1174dea-ea10-42f2-96b4-4589a2980684
> 43,128  e669d804-6ea2-4516-8536-1d266f88ebad
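
For anyone comparing the trace against the dump-tree output, here is a
rough sketch of the conversion described above. It assumes the sector
numbers in the block trace events are in 512-byte units and that the
16 MiB header is the only offset between the nbd device and the
decrypted device. The function and constant names are made up for
illustration.

```python
SECTOR_SIZE = 512                            # assumption: trace sectors are 512-byte units
ENCRYPTION_HEADER_BYTES = 16 * 1024 * 1024   # 16 MiB header, per the note above


def trace_sector_to_fs_offset(sector: int) -> int:
    """Convert a sector seen on the nbd device to a byte offset on the decrypted device."""
    return sector * SECTOR_SIZE - ENCRYPTION_HEADER_BYTES


# Hypothetical sector value, not taken from the trace.
print(trace_sector_to_fs_offset(40960))
```

The result is a physical offset on a single device. It still has to be
matched against the stripe offsets in the chunk tree to get back to a
btrfs logical address.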

What are the chances it's something simple like a long getting used
somewhere in the code that should actually be a 64-bit int?
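
To illustrate the kind of thing I mean, here is a small sketch (not a
diagnosis; I have not found such a spot in the code). It uses Python
bit masks to emulate storing a value in a C unsigned long of a given
width. The 4 KiB page size and the idea that the address gets turned
into a page index are assumptions on my part; the only real number is
the "want" value from the dmesg output.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def truncate_to_bits(value: int, bits: int) -> int:
    """Keep only the low `bits` bits, as an unsigned C integer of that width would."""
    return value & ((1 << bits) - 1)


def page_index(byte_offset: int, long_bits: int) -> int:
    """Page index of a byte offset, stored in an unsigned long of `long_bits` bits."""
    return truncate_to_bits(byte_offset >> PAGE_SHIFT, long_bits)


WANT = 26207780683776  # "want" value from the bad tree block error

idx64 = page_index(WANT, 64)  # what a 64-bit long would hold
idx32 = page_index(WANT, 32)  # what a 32-bit long would hold
print(idx64, idx32, idx64 == idx32)
```

Under those assumptions the two indexes differ, because the address
divided by the page size no longer fits in 32 bits. A 32-bit kernel
would then look up a different page than the one it meant to.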
