Hi David,

On Sat, Oct 23, 2021 at 05:12:18PM +0100, David CARLIER wrote:
> > > diff --git a/include/haproxy/atomic.h b/include/haproxy/atomic.h
> > > index 3198b381a..29a06c57b 100644
> > > --- a/include/haproxy/atomic.h
> > > +++ b/include/haproxy/atomic.h
> > > @@ -698,7 +698,7 @@ __ha_barrier_atomic_full(void)
> > >   */
> > >  #define __ha_cpu_relax() ({ asm volatile("isb" ::: "memory"); 1; })
> > >
> > > -#if defined(__ARM_FEATURE_ATOMICS) // ARMv8.1-A atomics
> > > +#if defined(__ARM_FEATURE_ATOMICS) && !defined(__APPLE__) // ARMv8.1-A 
> > > atomics
> >
> > Hmmm that's not expected at all, what error are you getting ? We're
> > not doing anything special, so if the __ARM_FEATURE_ATOMICS macro
> > is defined, we ought to have them.
> 
> Note this works ok with gcc but not clang. But clang is the main
> compiler, gcc is secondary on this platform (less well supported).
> 
> I tried to convert to llvm asm-ism but seems applying direct
> contiguous data values (ie cmp) to registers seems unsupported and
> requires more instructions/operands thus
> it is safer to just fallback to the builtin which just does that
> anyway as you can see :
> 
> `
> 0x100087a3c <+28>: mov    x3, x5
>     0x100087a40 <+32>: casp   x2, x3, x6, x7, [x8]
>     0x100087a44 <+36>: eor    x8, x3, x5
>     0x100087a48 <+40>: eor    x9, x2, x4
>     0x100087a4c <+44>: orr    x8, x9, x8
>     0x100087a50 <+48>: cbz    x8, 0x100087a58           ; <+56> at
> atomic.h:757:2
> `

I'm not opposed to falling back to the builtin, which is easier to
maintain, my concern remains the condition to enable this, as
"defined(APPLE)" is only about a vendor an not at all about a toolchain.

If you identified that it fails to emit the assembly, it will be the
same for other vendors and the common denominator will be clang/llvm.
And given that it provides a working builtin, then I'd rather rely on
clang detection to use the builtin.

> > > -#elif defined(__SIZEOF_INT128__) && 
> > > defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) // no ARMv8.1-A atomics but 
> > > 128-bit atomics
> > > +#elif defined(__SIZEOF_INT128__) && 
> > > (defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) || defined(__APPLE__)) // 
> > > no ARMv8.1-A atomics but 128-bit atomics
> >
> > I feel uncomfortable with adding this "||" in the expression, as we ought
> > to condition this instructions to this. Are you sure this was needed ? Did
> > you check the resulting code to make sure a CASP instruction was emitted ?
> >
> > Something that would be nice is to dump all defines:
> >
> >   gcc -dM -E - < /dev/null
> >
> > (or use clang instead of gcc).
> 
> mac m1 supports rightfully the 128 bits type.
> 
> `
> #define __SIZEOF_INT128__ 16
> `

No doubt about this, I was mostly referring to all the other ones. In
fact, given that there is the CASP instruction for this, I'd rather
switch to "|| defined(__ARM_FEATURE_ATOMICS)".

> I just recall on arm64 __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16 is not even
> present and indeed when I look up on both on my raspberry and my mac
> m1 that is confirmed.

I think the *_SYNC_* only applies to the old sync_* atomic builtins, and
are probably irrelevant to the new ones, thus we should probably simplify
it this way:

   #elif defined(__SIZEOF_INT128__) && defined(__ARM_FEATURE_ATOMICS)

Willy

Reply via email to