On 04/23/2012 05:03 PM, Dalleau, Frederic wrote:
Hi David,

On Fri, Apr 20, 2012 at 4:44 PM, David Henningsson
<[email protected]>  wrote:
While researching a bug I came across something that might be a bug in the
pa_once logic, but this stuff is tricky, so I might also be missing
something.

Imagine this:

  * Thread 1 runs pa_once_begin, succeeds and starts running the payload (i e
the code that should only run once).
  * Thread 2 starts running pa_once_begin, but only the first row. We're now
right *before* pa_atomic_inc(&control->ref) but *after*
pa_atomic_load(&control->done).
  * Thread 1 finishes the payload, runs pa_once_done which sets control->done
and frees the mutex.
  * Thread 2 continues, pa_once_begin succeeds and the payload is now run a
second time!


After reading your mail, I made some experiments by adding a usleep() call
in Thread 1 between pa_atomic_load(&control->done) and
pa_atomic_inc(&control->ref)
and that failed once-test 100% of time.

I reverted the usleep and made another experiment using 50000 iterations in
once-test and it just failed.

Good catch !

I tried to look up implementation/algorithm suggestions, and for the ones I found [1], there was no freeing of the mutex. Without freeing, the code becomes simpler. The attached patch is a version of that. I've just tried a simple once-test (but do feel free to run it 50000 times :-) ).

But of course, now we leak a mutex. But that's what we already do with the static mutexes we use in a few places already, so maybe it doesn't matter much?

--
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic

[1] e g http://en.wikipedia.org/wiki/Double-checked_locking
>From e321796a5b6d0d2087fc59ecbe60159f9c71cdbd Mon Sep 17 00:00:00 2001
From: David Henningsson <[email protected]>
Date: Tue, 24 Apr 2012 14:44:02 +0200
Subject: [PATCH] once: Fix race in implementation

TODO: Write a nice commit message

Signed-off-by: David Henningsson <[email protected]>
---
 src/pulsecore/once.c |   54 +++++++++++++++++++++++---------------------------
 src/pulsecore/once.h |    5 ++---
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/src/pulsecore/once.c b/src/pulsecore/once.c
index 4e509e0..4be590e 100644
--- a/src/pulsecore/once.c
+++ b/src/pulsecore/once.c
@@ -28,42 +28,43 @@
 
 #include "once.h"
 
+#define PA_ONCE_STATE_INIT 0
+#define PA_ONCE_STATE_RUNNING 1
+#define PA_ONCE_STATE_DONE 2
+
 pa_bool_t pa_once_begin(pa_once *control) {
+    pa_mutex *m;
+
     pa_assert(control);
 
-    if (pa_atomic_load(&control->done))
+    if (pa_atomic_load(&control->state) == PA_ONCE_STATE_DONE)
         return FALSE;
 
-    pa_atomic_inc(&control->ref);
+    if (pa_atomic_cmpxchg(&control->state, PA_ONCE_STATE_INIT, PA_ONCE_STATE_RUNNING)) {
+        /* Ok, we're good to run */
+
+        pa_assert_se(m = pa_mutex_new(FALSE, FALSE));
+        pa_mutex_lock(m);
+        pa_atomic_ptr_store(&control->mutex, m);
+        return TRUE;
+    }
 
     /* Caveat: We have to make sure that the once func has completed
      * before returning, even if the once func is not actually
      * executed by us. Hence the awkward locking. */
 
-    for (;;) {
-        pa_mutex *m;
-
-        if ((m = pa_atomic_ptr_load(&control->mutex))) {
+    do {
+        m = pa_atomic_ptr_load(&control->mutex);
+    } while (!m);
 
-            /* The mutex is stored in locked state, hence let's just
-             * wait until it is unlocked */
-            pa_mutex_lock(m);
+    /* The mutex is stored in locked state, hence let's just
+     * wait until it is unlocked */
+    pa_mutex_lock(m);
 
-            pa_assert(pa_atomic_load(&control->done));
+    pa_assert(pa_atomic_load(&control->state) == PA_ONCE_STATE_DONE);
 
-            pa_once_end(control);
-            return FALSE;
-        }
-
-        pa_assert_se(m = pa_mutex_new(FALSE, FALSE));
-        pa_mutex_lock(m);
-
-        if (pa_atomic_ptr_cmpxchg(&control->mutex, NULL, m))
-            return TRUE;
-
-        pa_mutex_unlock(m);
-        pa_mutex_free(m);
-    }
+    pa_once_end(control);
+    return FALSE;
 }
 
 void pa_once_end(pa_once *control) {
@@ -71,15 +72,10 @@ void pa_once_end(pa_once *control) {
 
     pa_assert(control);
 
-    pa_atomic_store(&control->done, 1);
+    pa_atomic_store(&control->state, PA_ONCE_STATE_DONE);
 
     pa_assert_se(m = pa_atomic_ptr_load(&control->mutex));
     pa_mutex_unlock(m);
-
-    if (pa_atomic_dec(&control->ref) <= 1) {
-        pa_assert_se(pa_atomic_ptr_cmpxchg(&control->mutex, m, NULL));
-        pa_mutex_free(m);
-    }
 }
 
 /* Not reentrant -- how could it be? */
diff --git a/src/pulsecore/once.h b/src/pulsecore/once.h
index edc8188..eb85f75 100644
--- a/src/pulsecore/once.h
+++ b/src/pulsecore/once.h
@@ -26,14 +26,13 @@
 
 typedef struct pa_once {
     pa_atomic_ptr_t mutex;
-    pa_atomic_t ref, done;
+    pa_atomic_t state;
 } pa_once;
 
 #define PA_ONCE_INIT                                                    \
     {                                                                   \
         .mutex = PA_ATOMIC_PTR_INIT(NULL),                              \
-        .ref = PA_ATOMIC_INIT(0),                                       \
-        .done = PA_ATOMIC_INIT(0)                                       \
+        .state = PA_ATOMIC_INIT(0)                                      \
     }
 
 /* Not to be called directly, use the macros defined below instead */
-- 
1.7.9.5

_______________________________________________
pulseaudio-discuss mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/pulseaudio-discuss

Reply via email to