On Mon, Jan 25, 2021 at 8:54 PM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> On Wed, Jan 20, 2021 at 1:08 AM Erik Jensen <erikjen...@rkjnsn.net> wrote:
> > On Wed, Jan 20, 2021 at 12:31 AM Qu Wenruo <quwenruo.bt...@gmx.com> wrote:
> > > On 2021/1/20 4:21 PM, Qu Wenruo wrote:
> > > > On 2021/1/19 5:28 PM, Erik Jensen wrote:
> > > >> On Mon, Jan 18, 2021 at 9:22 PM Erik Jensen <erikjen...@rkjnsn.net>
> > > >> wrote:
> > > >>>
> > > >>> On Mon, Jan 18, 2021 at 4:12 AM Erik Jensen <erikjen...@rkjnsn.net>
> > > >>> wrote:
> > > >>>>
> > > >>>> The offending system is indeed ARMv7 (specifically a Marvell ARMADA®
> > > >>>> 388), but I believe the Broadcom BCM2835 in my Raspberry Pi is
> > > >>>> actually ARMv6 (with hardware float support).
> > > >>>
> > > >>> Using NBD, I have verified that I receive the same error when
> > > >>> attempting to mount the filesystem on my ARMv6 Raspberry Pi:
> > > >>> [ 3491.339572] BTRFS info (device dm-4): disk space caching is enabled
> > > >>> [ 3491.394584] BTRFS info (device dm-4): has skinny extents
> > > >>> [ 3492.385095] BTRFS error (device dm-4): bad tree block start, want
> > > >>> 26207780683776 have 3395945502747707095
> > > >>> [ 3492.514071] BTRFS error (device dm-4): bad tree block start, want
> > > >>> 26207780683776 have 3395945502747707095
> > > >>> [ 3492.553599] BTRFS warning (device dm-4): failed to read tree root
> > > >>> [ 3492.865368] BTRFS error (device dm-4): open_ctree failed
> > > >>>
> > > >>> The Raspberry Pi is running Linux 5.4.83.
> > > >>>
> > > >>
> > > >> Okay, after some more testing, ARM seems to be irrelevant, and 32-bit
> > > >> is the key factor. On a whim, I booted up an i686, 5.8.14 kernel in a
> > > >> VM, attached the drives via NBD, ran cryptsetup, tried to mount, and…
> > > >> I got the exact same error message.
> > > >>
> > > > My educated guess is on 32bit platforms, we passed incorrect sector into
> > > > bio, thus gave us garbage.
> > >
> > > To prove that, you can use bcc tool to verify it.
> > > biosnoop can do that:
> > > https://github.com/iovisor/bcc/blob/master/tools/biosnoop_example.txt
> > >
> > > Just try mount the fs with biosnoop running.
> > > With "btrfs ins dump-tree -t chunk <dev>", we can manually calculate the
> > > offset of each read to see if they matches.
> > > If not match, it would prove my assumption and give us a pretty good
> > > clue to fix.
> > >
> > > Thanks,
> > > Qu
> > >
> > > >
> > > > Is this bug happening only on the fs, or any other btrfs can also
> > > > trigger similar problems on 32bit platforms?
> > > >
> > > > Thanks,
> > > > Qu
> >
> > I have only observed this error on this file system. Additionally, the
> > error mounting with the NAS only started after I did a `btrfs replace`
> > on all five 8TB drives using an x86_64 system. (Ironically, I did this
> > with the goal of making it faster to use the filesystem on the NAS by
> > re-encrypting the drives to use a cipher supported by my NAS's crypto
> > accelerator.)
> >
> > Maybe this process of shuffling 40TB around caused some value in the
> > filesystem to increment to the point that a calculation using it
> > overflows on 32-bit systems?
> >
> > I should be able to try biosnoop later this week, and I'll report back
> > with the results.
>
> Okay, I tried running biosnoop, but I seem to be running into this
> bug: https://github.com/iovisor/bcc/issues/3241 (That bug was reported
> for cpudist, but I'm seeing the same error when I try to run
> biosnoop.)
>
> Anything else I can try?

Is it possible to add printks to retrieve the same data?
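
In the meantime, a rough userspace sanity check of the overflow idea from earlier in the thread may be useful. The sketch below takes the "want" value from the dmesg output and shows what happens if it is turned into a page index or a 512-byte sector number and then held in an unsigned 32-bit variable. It is only arithmetic, not kernel code: the 4 KiB page size, the helper names (truncate32, check), and above all the idea that some code path actually stores such a value in a 32-bit type are assumptions. The "want" value is also a logical address, so the physical sector of the real read (after chunk mapping) will differ.

```python
# Back-of-the-envelope check, not kernel code. All names are illustrative.
PAGE_SHIFT = 12            # assumption: 4 KiB pages
SECTOR_SHIFT = 9           # 512-byte sectors
MASK32 = (1 << 32) - 1     # low 32 bits, i.e. what an unsigned 32-bit type keeps


def truncate32(value):
    """Return the value as an unsigned 32-bit variable would hold it."""
    return value & MASK32


def check(bytenr):
    """For each unit, return (full value, 32-bit value, bits lost?)."""
    result = {}
    for name, shift in (("page_index", PAGE_SHIFT), ("sector", SECTOR_SHIFT)):
        full = bytenr >> shift
        narrow = truncate32(full)
        result[name] = (full, narrow, full != narrow)
    return result


# "want" value from the "bad tree block start" messages (a logical address).
WANT = 26207780683776

for name, (full, narrow, lost) in check(WANT).items():
    print(f"{name}: full={full} as_u32={narrow} truncated={lost}")
```

If the assumptions hold, both values come out truncated, since the address is above 16 TiB (2^44 bytes), so the guess is at least numerically plausible. It says nothing about where in the code a truncation would happen, which is what the biosnoop or printk output would have to show.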