On 2018-01-16 16:12:11 +0900, Michael Paquier wrote:
> On Fri, Feb 03, 2017 at 12:26:50AM +0000, Noah Misch wrote:
> > On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote:
> >> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile
> >> pg_atomic_uint32 *ptr,
> >>  static inline uint32
> >>  pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_)
> >>  {
> >> +    uint32 ret;
> >> +
> >>      /*
> >> -     * __fetch_and_add() emits a leading "sync" and trailing "isync",
> >> thereby
> >> -     * providing sequential consistency. This is undocumented.
> >> +     * Use __sync() before and __isync() after, like in compare-exchange
> >> +     * above.
> >>       */
> >> -    return __fetch_and_add((volatile int *)&ptr->value, add_);
> >> +    __sync();
> >> +
> >> +    ret = __fetch_and_add((volatile int *)&ptr->value, add_);
> >> +
> >> +    __isync();
> >> +
> >> +    return ret;
> >>  }
> >
> > Since this emits double syncs with older xlc, I recommend instead replacing
> > the whole thing with inline asm. As I opined in the last message of the
> > thread you linked above, the intrinsics provide little value as abstractions
> > if one checks the generated code to deduce how to use them. Now that the
> > generated code is xlc-version-dependent, the port is better off with
> > compiler-independent asm like we have for ppc in s_lock.h.
>
> Could it be cleaner to just use __xlc_ver__ to avoid double syncs on
> past versions? I think that it would make the code more understandable
> than just listing directly the instructions.

Given the quality of the intrinsics on AIX (see past commits and the
comment in the code quoted above), I think we're much better off doing
this via inline asm.

Greetings,

Andres Freund
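
For illustration only, a minimal sketch of what an inline-asm fetch-add
with a leading "sync" and trailing "isync" could look like. This is not
a proposed patch: the function name sketch_fetch_add_u32 is
hypothetical, the asm uses GCC-style syntax and a numeric local label
(whether a given xlc version accepts either is not verified here), and
the non-PowerPC branch falls back to a GCC/clang builtin purely so the
sketch compiles elsewhere.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: atomically add add_ to *ptr and return the old
 * value, with a full barrier before and an isync after the
 * lwarx/stwcx. loop.
 */
static inline uint32_t
sketch_fetch_add_u32(volatile uint32_t *ptr, int32_t add_)
{
#if defined(__GNUC__) && \
    (defined(__powerpc__) || defined(__ppc__) || defined(__PPC__))
    uint32_t tmp;   /* scratch: old value + add_ */
    uint32_t res;   /* old value, returned to the caller */

    __asm__ __volatile__(
        "   sync                \n"   /* leading full barrier */
        "1: lwarx   %1,0,%4     \n"   /* load word and reserve */
        "   add     %0,%1,%3    \n"   /* compute new value */
        "   stwcx.  %0,0,%4     \n"   /* store if reservation held */
        "   bne-    1b          \n"   /* reservation lost: retry */
        "   isync               \n"   /* trailing barrier */
        : "=&r"(tmp), "=&r"(res), "+m"(*ptr)
        : "r"(add_), "r"(ptr)
        : "memory", "cc");
    return res;
#else
    /* Assumption: portable stand-in on non-PowerPC targets. */
    return __atomic_fetch_add(ptr, (uint32_t) add_, __ATOMIC_SEQ_CST);
#endif
}
```

The point of spelling out the instructions is that the barriers are
visible in the source and do not depend on what a particular compiler
version happens to emit for __fetch_and_add().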