On Fri, Jul 20, 2018 at 10:27:42AM +0200, Christopher Faulet wrote:
> In thread_sync_barrier, we exit when all threads have set their own bit in the
> barrier mask. It is done by comparing it to all_threads_mask. But we must not
> use a simple equality to do so, becaue all_threads_mask may change. Since 
> commit
> ba86c6c25 ("MINOR: threads: Be sure to remove threads from all_threads_mask on
> exit"), when a thread exit, its bit is removed from all_threads_mask. Instead,
> we must use a bitwise AND to test is all bits of all_threads_mask are set.

I agree on the principle, however it may still not work because
all_threads_mask is not marked volatile, so the compiler will most of
the time fetch the value into a register and will loop on it, keeping
the wrong value all the time. The case where it matters is when the
all_threads_mask is fetched and cached when entering the function,
then loses the value while barrier still has this bit set.

Thus I think that adding only this to your patch will definitely kill
this bug :

diff --git a/include/common/hathreads.h b/include/common/hathreads.h
index 5c4ceca..274f988 100644
--- a/include/common/hathreads.h
+++ b/include/common/hathreads.h
@@ -260,7 +260,7 @@ void thread_exit_sync(void);
 int  thread_no_sync(void);
 int  thread_need_sync(void);
 
-extern unsigned long all_threads_mask;
+extern volatile unsigned long all_threads_mask;
 
 #define ha_sigmask(how, set, oldset)  pthread_sigmask(how, set, oldset)
 
diff --git a/src/hathreads.c b/src/hathreads.c
index 5db3c21..aada8ed 100644
--- a/src/hathreads.c
+++ b/src/hathreads.c
@@ -31,7 +31,7 @@ void thread_sync_io_handler(int fd)
 static HA_SPINLOCK_T sync_lock;
 static int           threads_sync_pipe[2];
 static unsigned long threads_want_sync = 0;
-unsigned long all_threads_mask  = 0;
+volatile unsigned long all_threads_mask  = 0;
 
 #if defined(DEBUG_THREAD) || defined(DEBUG_FULL)
 struct lock_stat lock_stats[LOCK_LABELS];

Pieter, your feedback is welcome, as usual :-)

Don't bother respinning the patch Christopher, I can add this myself
after Pieter's test.

Cheers,
Willy

Reply via email to