Hello!

On Mon, Dec 04, 2017 at 04:37:50PM +0000, debayang.qdt wrote:

> For some architectures like armv8a - newer GCC generates a full
> barrier for the __sync operations compared to the __atomics .
>
> This is seen to give some performance lag on these architectures
> when using __sync compared to the atomics apis under high
> contention.
>
> The C++ atomic ops looks good as well
> (http://mailman.nginx.org/pipermail/nginx-devel/2016-September/008805.html),
> However I would like to test it out and confirm.
>
> e.g sync_fetch_add with newer GCC:
>
>   58: f94007e0 ldr x0, [sp,#8]
>   5c: c85f7c01 ldxr x1, [x0]
>   60: 91000821 add x1, x1, #0x2
>   64: c802fc01 stlxr w2, x1, [x0]
>   68: 35ffffa2 cbnz w2, 5c <testing+0xc>
>   6c: d5033bbf dmb ish
>
> With atomics_fetch_add with SEQ_CST:
>
>   58: f94007e0 ldr x0, [sp,#8]
>   5c: c85ffc01 ldaxr x1, [x0]
>   60: 91000821 add x1, x1, #0x2
>   64: c802fc01 stlxr w2, x1, [x0]
>   68: 35ffffa2 cbnz w2, 5c <testing+0xc>

Well, this may actually mean that the __atomic and stdatomic
variants won't work for us, as they do not seem to imply a barrier
protecting other variables. While it may not be important for
many uses of ngx_atomic_fetch_add(), it is certainly important for
ngx_atomic_cmp_set() we use for shared memory mutexes, where it is
assumed to be a full barrier at least for the memory area the
mutex protects.

(Just for the record, the GCC change in question seems to be
documented at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65697.)

-- 
Maxim Dounin
http://mdounin.ru/
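
For reference, a minimal sketch of the builtins being compared and of a cmp_set-style lock guarding a plain variable. All demo_* names are hypothetical and used only for illustration; this is not nginx's actual code. It assumes GCC or a compatible compiler. The GCC documentation describes the __sync builtins as a full barrier in most cases, while the __atomic builtins follow the memory order passed to them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type and names for illustration; not nginx's definitions. */
typedef volatile uint64_t demo_atomic_t;

/* Legacy builtin; GCC documents it as a full barrier in most cases. */
static uint64_t demo_sync_fetch_add(demo_atomic_t *p, uint64_t v)
{
    return __sync_fetch_and_add(p, v);
}

/* Newer builtin; ordering is whatever memory order is requested. */
static uint64_t demo_atomic_fetch_add(demo_atomic_t *p, uint64_t v)
{
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* Compare-and-set in the spirit of ngx_atomic_cmp_set():
 * returns nonzero if *lock was equal to old and has been set to set. */
static int demo_cmp_set(demo_atomic_t *lock, uint64_t old, uint64_t set)
{
    return __sync_bool_compare_and_swap(lock, old, set);
}

static demo_atomic_t demo_lock_word;   /* 0 = free, 1 = held */
static uint64_t      demo_protected;   /* plain variable guarded by the lock */

static void demo_lock(void)
{
    while (!demo_cmp_set(&demo_lock_word, 0, 1)) {
        /* spin until the lock word changes from 0 to 1 */
    }
}

static void demo_unlock(void)
{
    int ok = demo_cmp_set(&demo_lock_word, 1, 0);
    assert(ok);    /* unlocking a lock we do not hold is a bug */
    (void) ok;     /* silence unused warning when NDEBUG is set */
}

/* Update the guarded variable; correctness across CPUs relies on
 * demo_cmp_set() ordering this access, which is the assumption at stake. */
static uint64_t demo_protected_add(uint64_t v)
{
    uint64_t result;

    demo_lock();
    demo_protected += v;
    result = demo_protected;
    demo_unlock();

    return result;
}
```

The update of demo_protected is an ordinary memory access. It is only safe across CPUs if the operations on demo_lock_word order it, which is why the barrier strength of the cmp_set primitive matters more than that of a standalone fetch_add.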
