Re: [PATCH 2 of 2] Core: add ngx_atomic_store() and ngx_atomic_load()

2016-09-16 Thread Maxim Dounin
Hello!

On Fri, Sep 16, 2016 at 02:43:49PM -0700, Piotr Sikora wrote:

> Hey Maxim,
> 
> > The "*(lock) == 0" check here is just an optimization, it only
> > ensures that the lock is likely to succed.
> 
> Yes, and use of the ngx_atomic_load() doesn't affect that.
> 
> Namely, in the micro-benchmarks I did (heavy contention - 100 threads
> trying to acquire lock, update value, release lock in a loop), there
> is no performance lose while using ngx_atomic_load() on x86_64,
> whereas removing this optimization resulted in 3x worse performance.

The question is: why ngx_atomic_load() is at all needed.  For now 
the only reason I see is "because ThreadSanitizer complains".

> > If the
> > check returns a wrong result due to non-atomic load - this won't
> > do any harm.
> 
> It's not just wrong result but a "data race", which leads to undefined
> behavior (at least according to C++11).

As far as I understand, from C11 point of view undefined behaviour 
is still here even with ngx_atomic_load(), as it's not an atomic 
operation defined by C11.

I can understand introducing load/store as a part of introducing 
C11 atomic operations support.  But I see no reason why it should 
be done separately.

On the other hand, C11 doesn't seem to require explicit use of 
load/store operations, "*(lock)" will work just fine, though 
probably will not be optimal.

Just for the record, below is a quick-and-dirty patch to introduce 
ะก11 atomic operations support (mostly untested).

# HG changeset patch
# User Maxim Dounin 
# Date 1474094645 -10800
#  Sat Sep 17 09:44:05 2016 +0300
# Node ID b79d4fd920eaa056103f68d311452b8cd6da833c
# Parent  9a4934f07bb47fbcf154ce275a04eb5dd1ba16be
C11 atomic operations.

diff --git a/auto/cc/conf b/auto/cc/conf
--- a/auto/cc/conf
+++ b/auto/cc/conf
@@ -178,6 +178,27 @@ if [ "$NGX_PLATFORM" != win32 ]; then
 fi
 
 
+ngx_feature="C11 atomic operations"
+ngx_feature_name=NGX_HAVE_C11_ATOMIC
+ngx_feature_run=yes
+ngx_feature_incs="#include "
+ngx_feature_path=
+ngx_feature_libs=
+ngx_feature_test="#ifndef ATOMIC_LONG_LOCK_FREE
+  #error atomic_long is not lock-free
+  #endif
+  atomic_long  n = ATOMIC_VAR_INIT(0);
+  long tmp = 0;
+  if (!atomic_compare_exchange_strong(&n, &tmp, 1))
+  return 1;
+  if (atomic_fetch_add(&n, 1) != 1)
+  return 1;
+  if (atomic_load(&n) != 2)
+  return 1;
+  atomic_thread_fence(memory_order_acq_rel);"
+. auto/feature
+
+
 ngx_feature="gcc builtin atomic operations"
 ngx_feature_name=NGX_HAVE_GCC_ATOMIC
 ngx_feature_run=yes
diff --git a/src/os/unix/ngx_atomic.h b/src/os/unix/ngx_atomic.h
--- a/src/os/unix/ngx_atomic.h
+++ b/src/os/unix/ngx_atomic.h
@@ -88,6 +88,44 @@ typedef uint32_tngx_
 typedef volatile ngx_atomic_uint_t  ngx_atomic_t;
 
 
+#elif (NGX_HAVE_C11_ATOMIC)
+
+/* C11 atomic operations */
+
+#include 
+
+#define NGX_HAVE_ATOMIC_OPS  1
+
+typedef longngx_atomic_int_t;
+typedef unsigned long   ngx_atomic_uint_t;
+
+#if (NGX_PTR_SIZE == 8)
+#define NGX_ATOMIC_T_LEN(sizeof("-9223372036854775808") - 1)
+#else
+#define NGX_ATOMIC_T_LEN(sizeof("-2147483648") - 1)
+#endif
+
+typedef volatile atomic_ulong   ngx_atomic_t;
+
+ngx_inline ngx_atomic_uint_t
+ngx_atomic_cmp_set(ngx_atomic_t *lock, ngx_atomic_uint_t old,
+ngx_atomic_uint_t set)
+{
+return atomic_compare_exchange_strong(lock, &old, set);
+}
+
+#define ngx_atomic_fetch_add(value, add)  \
+atomic_fetch_add(value, add)
+
+#define ngx_memory_barrier()atomic_thread_fence(memory_order_seq_cst)
+
+#if ( __i386__ || __i386 || __amd64__ || __amd64 )
+#define ngx_cpu_pause() __asm__ ("pause")
+#else
+#define ngx_cpu_pause()
+#endif
+
+
 #elif (NGX_HAVE_GCC_ATOMIC)
 
 /* GCC 4.1 builtin atomic operations */

-- 
Maxim Dounin
http://nginx.org/

___
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel

Re: [PATCH 2 of 2] Core: add ngx_atomic_store() and ngx_atomic_load()

2016-09-16 Thread Piotr Sikora
Hey Maxim,

> The "*(lock) == 0" check here is just an optimization, it only
> ensures that the lock is likely to succed.

Yes, and use of the ngx_atomic_load() doesn't affect that.

Namely, in the micro-benchmarks I did (heavy contention - 100 threads
trying to acquire lock, update value, release lock in a loop), there
is no performance lose while using ngx_atomic_load() on x86_64,
whereas removing this optimization resulted in 3x worse performance.

> If the
> check returns a wrong result due to non-atomic load - this won't
> do any harm.

It's not just wrong result but a "data race", which leads to undefined
behavior (at least according to C++11).

Best regards,
Piotr Sikora

___
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel


Re: [PATCH 2 of 2] Core: add ngx_atomic_store() and ngx_atomic_load()

2016-09-12 Thread Maxim Dounin
Hello!

On Wed, Aug 17, 2016 at 05:29:32PM -0700, Piotr Sikora wrote:

> # HG changeset patch
> # User Piotr Sikora 
> # Date 1471265532 25200
> #  Mon Aug 15 05:52:12 2016 -0700
> # Node ID 40765d8ee4dd29089b0e60ed5b6099ac624e804e
> # Parent  2f2ec92c3af93c11e195fb6d805df57518fede7c
> Core: add ngx_atomic_store() and ngx_atomic_load().
> 
> Those functions must be used to prevent data races between
> threads operating concurrently on the same variables.
> 
> No performance loss measured in microbenchmarks on x86_64.
> 
> No binary changes when compiled without __atomic intrinsics.
> 
> Found with ThreadSanitizer.
> 
> Signed-off-by: Piotr Sikora 

[...]

>  #define ngx_trylock(lock, value) 
>  \
> -(*(lock) == 0 && ngx_atomic_cmp_set(lock, 0, value))
> +(ngx_atomic_load(lock) == 0 && ngx_atomic_cmp_set(lock, 0, value))

The "*(lock) == 0" check here is just an optimization, it only 
ensures that the lock is likely to succed.  Atomicity is provided 
by the ngx_atomic_cmp_set() operation following the check.  If the 
check returns a wrong result due to non-atomic load - this won't 
do any harm.  The idea is that a quick-and-dirty non-atomic 
reading can be used to optimize things when the lock is already 
obtained by another process.  This is especially important in 
spinlocks like in ngx_shmtx_lock().

The same is believed to apply to the other places changed as well.  
If you think there are places where atomic reading is critical - 
please highlight these particular places.

-- 
Maxim Dounin
http://nginx.org/

___
nginx-devel mailing list
nginx-devel@nginx.org
http://mailman.nginx.org/mailman/listinfo/nginx-devel