On Thu, Sep 28, 2023 at 10:12:18AM -0400, Brian Foster wrote:
> On Wed, Sep 27, 2023 at 06:08:21PM -0400, Kent Overstreet wrote:
> > On Wed, Sep 27, 2023 at 07:23:37AM -0400, Brian Foster wrote:
> > > An fsstress task on a big endian system (s390x) quickly produces a
> > > bunch of CRC errors in the system logs. Most of these are related to
> > > the narrow CRCs path, but the fundamental problem can be reduced to
> > > a single write and re-read (after dropping caches) of a previously
> > > merged extent.
> > > 
> > > The key merge path that handles extent merges eventually calls into
> > > bch2_checksum_merge() to combine the CRCs of the associated extents.
> > > This code attempts to avoid a byte order swap by feeding the le64
> > > values into the crc32c code, but the latter casts the resulting u64
> > > value down to a u32, which truncates the high bytes where the actual
> > > crc value ends up. This results in a CRC value that does not change
> > > (since it is merged with a CRC of 0), and checksum failures ensue.
> > > 
> > > Fix the checksum merge code to swap to cpu byte order on the
> > > boundaries to the external crc code such that any value casting is
> > > handled properly.
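
For anyone following along, the truncation described above can be reproduced in userspace. This is only a minimal sketch under stated assumptions, not the bcachefs code: put_le64(), raw_load_on_be(), swap64_bytes() and self_check() are hypothetical helpers, and the big endian load is simulated with byte arithmetic so the result is the same on any host. It shows a 32-bit CRC stored as an le64, what a big endian CPU sees if it loads those bytes without swapping, and what survives the cast to u32 in each case.

```c
#include <assert.h>
#include <stdint.h>

/* Store v as 8 little-endian bytes, i.e. the on-disk le64 form. */
static void put_le64(uint8_t out[8], uint64_t v)
{
	for (int i = 0; i < 8; i++)
		out[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Value a big-endian CPU ends up with if it loads those 8 bytes as a
 * native u64 without any byte swap (simulated, so this runs anywhere).
 */
static uint64_t raw_load_on_be(const uint8_t in[8])
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | in[i];
	return v;
}

/* Plain 64-bit byte swap: what an le64 -> cpu conversion does on BE. */
static uint64_t swap64_bytes(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);
		v >>= 8;
	}
	return r;
}

/* Hypothetical self-check walking through the failure and the fix. */
static void self_check(void)
{
	uint8_t disk[8];
	const uint64_t crc = 0xdeadbeef;	/* arbitrary example value */
	uint64_t raw;

	put_le64(disk, crc);
	raw = raw_load_on_be(disk);

	/* Unswapped: the CRC sits in the high 32 bits, so the cast drops it. */
	assert(raw == 0xefbeadde00000000ULL);
	assert((uint32_t)raw == 0);

	/* Swapped to CPU order first: the cast keeps the CRC. */
	assert((uint32_t)swap64_bytes(raw) == 0xdeadbeef);
}
```

The unswapped case truncating to 0 lines up with the "merged with a CRC of 0" symptom in the description, assuming the value is passed through a u32 parameter the way the sketch does.
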
> > 
> > Thanks! Applied.
> > 
> > We really need to test creating a filesystem and then reading from it on
> > an opposite endianness machine, have you gotten a chance to do that?
> > 
> 
> I gave it a quick test by just dd'ing the disk image off my fstests
> TEST_DEV from the BE box I've been playing with and mounting it on a LE
> system. The fs mounts, but eventually complains about a backpointer
> issue after some stress I/O:
> 
>  bcachefs (loop0): error validating btree node at btree backpointers level 0/1
>    u64s 11 type btree_ptr_v2 0:5342578688:0 len 0 ver 0: seq 8574dcb72b17e918 written 486 min_key 0:3338403840:1 durability: 1 ptr: 0:10388:0 gen 6
>    node offset 486 bset u64s 1300: invalid bkey: backpointer at wrong pos
>    u64s 9 type backpointer 0:3339255808:0 len 0 ver 0: bucket=0:6369:0 btree=extents l=0 offset=0:256 len=64 pos=536913736:256:U32_MAX, shutting down
>  bcachefs (loop0): inconsistency detected - emergency read only
>  bcachefs (loop0): __bch2_btree_write_buffer_flush: insert error EIO
>  bcachefs (loop0 inum 201326618 offset 246272): write error while doing btree update: EIO
> 
> ... and fsck similarly complains about a bunch more bp and lru related
> inconsistencies. Write buffer issue, perhaps? At a glance, that seq
> value looks kind of bogus, but I haven't had a chance to dig into the
> details yet. Everything seems in order with the same image file on the
> BE box, FWIW.

bch_backpointer looks highly suspect re: endianness. If it's not fixable
we'll have to do a bch_backpointer_v2. I expect it will be fixable
though, just tricky.

The LRU btree is just a bitset btree now, so that shouldn't have
endianness issues.

So yeah, we definitely need to get automated foreign endianness testing
going - if there's going to be more of this I don't want us to be doing
this by hand, and we need to make sure issues like this get caught in
the future.

jpsollie was looking at ktest support for big endian architectures
recently and had some patches for that; I just haven't had time to look
at them. jpsollie, do you think you can post those patches to the list?
