Thank you very much to Arnd Bergmann and Wookey for the assistance
off-list.  Arnd's explanation of the root cause and the patch required
is reproduced below.  I suspect others will also find it helpful.

On Sun, Apr 14, 2024 at 5:33 AM Arnd Bergmann <a...@arndb.de> wrote:
> This is the same problem that any application has when using
> syscall() with libc-provided data structures. In general, one
> should use libc syscall wrappers and libc data structures, but
> there is no such wrapper for futex, so one should use the kernel
> structures instead: __kernel_old_timespec for __NR_futex
> and __kernel_timespec for __NR_futex_time64.
> 
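For anyone who wants to try the "kernel structures" route described above, here is a minimal sketch of how I understand it. The names kfutex_timespec, KFUTEX_NR and kfutex_wait_ms are my own inventions, not from any library. I am also assuming that on ABIs where the headers define __NR_futex_time64 the running kernel implements it (5.1 or later, as far as I know).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <linux/time_types.h>   /* __kernel_timespec, __kernel_old_timespec */
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Pair each syscall number with the kernel structure it expects.
 * Assumption: if __NR_futex_time64 exists in the headers, the running
 * kernel supports it. */
#if defined(__NR_futex_time64)
typedef struct __kernel_timespec kfutex_timespec;
#define KFUTEX_NR __NR_futex_time64
#else
typedef struct __kernel_old_timespec kfutex_timespec;
#define KFUTEX_NR __NR_futex
#endif

/* Both kernel layouts are two equally sized fields without padding. */
static_assert(sizeof(kfutex_timespec) ==
              2 * sizeof(((kfutex_timespec *)0)->tv_sec),
              "unexpected kernel timespec layout");

/* Hypothetical helper: wait while *addr == expected, for at most
 * timeout_ms milliseconds (relative). Returns 0 when woken, otherwise
 * -1 with errno set (EAGAIN on value mismatch, ETIMEDOUT on timeout). */
static int kfutex_wait_ms(uint32_t *addr, uint32_t expected, long timeout_ms)
{
    kfutex_timespec ts;
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;
    return (int)syscall(KFUTEX_NR, addr, FUTEX_WAIT_PRIVATE, expected,
                        &ts, NULL, 0);
}
```

The point of the typedef is that the libc struct timespec never reaches the raw syscall, so the build's _TIME_BITS setting cannot change what the kernel sees.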
> The bug here is in
> https://sources.debian.org/src/capnproto/1.0.1-3/src/kj/mutex.c%2B%2B/#L40
> which contains the incorrect
> 
> #ifndef SYS_futex
> // Missing on Android/Bionic.
> #ifdef __NR_futex
> #define SYS_futex __NR_futex
> #elif defined(SYS_futex_time64)
> #define SYS_futex SYS_futex_time64
> #else
> #error "Need working SYS_futex"
> #endif
> #endif
> 
> The above only works on musl because it has no __NR_futex
> definition for 32-bit, while the kernel headers used
> by glibc contain both __NR_futex and __NR_futex_time64.
> 
> The kernel timespec definitions are intentionally done to
> be compatible with the libc ones, but you have to use the
> correct one. There are many ways to write this, depending
> how many corner cases you want to handle. This version
> should work on any Linux kernel after 5.6 and any glibc
> and musl libc:
> 
> #if defined(__USE_TIME_BITS64) && (__BITS_PER_LONG == 32)
> #define MY_FUTEX __NR_futex_time64
> #else
> #define MY_FUTEX __NR_futex
> #endif
> 
> Alternatively (avoiding the dependency on libc macros)
> you can use
> 
> #if __BITS_PER_LONG == 32
> #define MY_FUTEX ((sizeof(long) < sizeof(time_t)) ? \
>            __NR_futex_time64 : __NR_futex)
> #else
> #define MY_FUTEX __NR_futex
> #endif
> 
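To see the selection in use, here is a small sketch that wires the second variant into a futex wait taking the libc struct timespec. The helper name my_futex_wait is hypothetical, and I have only reasoned through glibc and musl on Linux, not other libcs.

```c
#define _GNU_SOURCE
#include <asm/bitsperlong.h>    /* __BITS_PER_LONG */
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Choose the syscall whose timespec layout matches libc's struct timespec. */
#if __BITS_PER_LONG == 32
#define MY_FUTEX ((sizeof(long) < sizeof(time_t)) ? \
                  __NR_futex_time64 : __NR_futex)
#else
#define MY_FUTEX __NR_futex
#endif

/* The selection assumes time_t is at least as wide as long. */
static_assert(sizeof(long) <= sizeof(time_t),
              "time_t narrower than long is not handled");

/* Hypothetical helper: wait while *addr == expected, for at most the
 * relative timeout. Returns 0 when woken, otherwise -1 with errno set. */
static int my_futex_wait(uint32_t *addr, uint32_t expected,
                         const struct timespec *timeout)
{
    return (int)syscall(MY_FUTEX, addr, FUTEX_WAIT_PRIVATE, expected,
                        timeout, NULL, 0);
}
```

With the wrong syscall number on a 32-bit time64 build, the kernel would read the 64-bit tv_sec as a pair of 32-bit fields, which is the failure described earlier in this thread.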
> > And now I'm well beyond my depth, but does it make sense that
> > timespec.tv_nsec is still 4 bytes after the t64 transition?  I get it
> > that it only needs to represent up to 10^9 nanoseconds and thus
> > fits into 32 bits, but maybe the optimization isn't worth the
> > discrepancy with 64-bit userspace?
> >
> > armhf before the t64 transition:
> >
> > Size of timespec.tz_sec: 4 byte
> > Size of timespec.tz_nsec: 4 byte
> >
> > armhf after the t64 transition:
> >
> > Size of timespec.tz_sec: 8 byte
> > Size of timespec.tz_nsec: 4 byte
> >
> > 64-bit architectures:
> >
> > Size of timespec.tz_sec: 8 byte
> > Size of timespec.tz_nsec: 8 byte
> 
> The kernel interfaces are defined with an 8 byte tv_nsec,
> same as the 64-bit ones for any data returned by the kernel:
> 
> struct __kernel_timespec {
>         __kernel_time64_t       tv_sec;                 /* seconds */
>         long long               tv_nsec;                /* nanoseconds */
> };
> 
> The libc definitions on the other hand use a POSIX and
> C99 compliant definition with a 'long tv_nsec':
> 
> struct timespec
> {
> #ifdef __USE_TIME_BITS64
>   __time64_t tv_sec;            /* Seconds.  */
> #else
>   __time_t tv_sec;              /* Seconds.  */
> #endif
> #if __WORDSIZE == 64 \
>   || (defined __SYSCALL_WORDSIZE && __SYSCALL_WORDSIZE == 64) \
>   || (__TIMESIZE == 32 && !defined __USE_TIME_BITS64)
>   __syscall_slong_t tv_nsec;    /* Nanoseconds.  */
> #else
> # if __BYTE_ORDER == __BIG_ENDIAN
>   int: 32;           /* Padding.  */
>   long int tv_nsec;  /* Nanoseconds.  */
> # else
>   long int tv_nsec;  /* Nanoseconds.  */
>   int: 32;           /* Padding.  */
> # endif
> #endif
> };
> 
> All the complexity in there is done to ensure that the two
> structures put the nanoseconds in the same bits as the kernel
> while still keeping the type of the userspace tv_nsec member
> 'long int'. The kernel ignores the top 32 bits of tv_nsec
> when called from a 32-bit process but produces an error
> when any of those bits are set when called from a 64-bit
> process.
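
If you would rather not rely on the padding arrangement at all, one option is to copy the libc structure into the kernel one member by member before making the raw syscall. This is only a sketch. The helper name to_kernel_timespec is my own, and it assumes kernel headers recent enough to ship linux/time_types.h.

```c
#include <assert.h>
#include <linux/time_types.h>   /* struct __kernel_timespec */
#include <time.h>               /* struct timespec */

/* The kernel structure is two 64-bit fields on every architecture. */
static_assert(sizeof(struct __kernel_timespec) == 16,
              "__kernel_timespec should be 16 bytes");

/* Hypothetical helper: widen a libc timespec into the kernel layout.
 * Assigning through the named members sign-extends tv_nsec, so the
 * upper 32 bits never carry whatever happened to be in libc's padding. */
static struct __kernel_timespec to_kernel_timespec(const struct timespec *ts)
{
    struct __kernel_timespec kts;
    kts.tv_sec = ts->tv_sec;
    kts.tv_nsec = ts->tv_nsec;
    return kts;
}
```

The result is suitable for __NR_futex_time64 and the other *_time64 syscalls. It costs one extra copy, in exchange for not depending on how a given libc lays out its padding.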
