On Tue, Sep 19, 2023 at 10:26:11AM -0400, Brian Foster wrote:
> bcachefs format on a big-endian (s390x) machine crashes down in the
> rhashtable code imported from the kernel. The reason this occurs
> lies within the rht_lock() -> bit_spin_lock() code, the latter of
> which casts bitmaps down to 32 bits to satisfy the requirements of
> the futex interface.
> 
> The specific problem here is that a 64 -> 32 bit cast doesn't refer
> to the lower 4 bytes on a big endian machine, which means setting
> bit number 0 in the 32-bit map actually corresponds to bit 32 in the
> 64-bit map. The rhashtable code specifically uses bit zero of the
> bucket pointer for exclusion and uses native bitops elsewhere (i.e.
> __rht_ptr()) to identify NULL pointers. If bit 32 of the pointer is
> set by the locking code instead of bit 0, an otherwise NULL pointer
> looks like a non-NULL value and results in a segfault.
> 
> The bit spinlock code imported from the kernel is originally intended
> to work with unsigned long. The kernel code is able to throttle the
> cpu directly when under lock contention, while the userspace
> implementation relies on the futex primitives to emulate reliable
> blocking. Since the futex interface introduces the 32-bit
> requirement, isolate the associated userspace hack to that
> particular code.
> 
> Restore the native bitmap behavior of the bit spinlock code to
> address the rhashtable problem described above. Since this is not
> compatible with the futex interface, create a futex wrapper
> specifically to convert the native bitmap type to a 32-bit virtual
> address and mask value for the purposes of waiting/waking when under
> lock contention.
> 
> Signed-off-by: Brian Foster <[email protected]>

Good find :)

Thanks, applied
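
For anyone reading along, here is a minimal sketch of the conversion the
commit message describes: taking a bit number in a native unsigned long
bitmap and producing the 32-bit word address and mask that a futex-style
wait/wake would need. This is not the code from the patch. The names
struct futex_slot, bit_to_futex_slot(), host_is_little_endian() and
slot_bit_is_set() are hypothetical, and the sketch assumes unsigned long is
either 32 or 64 bits wide. The point it illustrates is that on a big-endian
64-bit host the low-order 32 bits live in the second 32-bit word, so a plain
pointer cast lands on the wrong half.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical pair: the 32-bit word a futex would watch, plus the bit mask. */
struct futex_slot {
	uint32_t *word;
	uint32_t mask;
};

/* Runtime endianness probe; returns 1 on a little-endian host. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/*
 * Map bit `nr` of a native unsigned long bitmap to the 32-bit word that
 * holds it and the mask of that bit within the word.
 */
static struct futex_slot bit_to_futex_slot(unsigned long *bitmap, unsigned nr)
{
	unsigned long *l = bitmap + nr / BITS_PER_LONG;
	unsigned bit = nr % BITS_PER_LONG;
	unsigned words_per_long = sizeof(unsigned long) / sizeof(uint32_t);
	/* Word index counted from the least significant 32-bit word. */
	unsigned idx = bit / 32;
	struct futex_slot slot;

	/* Big endian stores the most significant word first in memory. */
	if (!host_is_little_endian())
		idx = words_per_long - 1 - idx;

	slot.word = (uint32_t *)l + idx;
	slot.mask = UINT32_C(1) << (bit % 32);
	return slot;
}

/* Read the 32-bit word via memcpy to stay clear of strict-aliasing issues. */
static int slot_bit_is_set(struct futex_slot slot)
{
	uint32_t v;

	memcpy(&v, slot.word, sizeof(v));
	return (v & slot.mask) != 0;
}
```

With a helper along these lines, the lock itself can keep using native
bitops on the unsigned long (so bit 0 stays bit 0 for __rht_ptr()), and only
the wait/wake path needs to know about the 32-bit view.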
