Re: race condition destroying condition variables

Brent W. Baccala Tue, 19 Dec 2017 11:21:22 -0800

On Tue, Dec 19, 2017 at 3:25 AM, Samuel Thibault <samuel.thiba...@gnu.org>
 wrote:

> Brent W. Baccala, on mar. 19 déc. 2017 00:08:44 -0500, wrote:
> > Looks like there's a race condition when we destroy a condition
> variable.  My
> > understanding of the expected behavior is that once all the threads have
> been
> > signaled (i.e, pthread_cond_broadcast is called), the condition variable
> can be
> > safely destroyed with pthread_cond_destroy.
>
> Err, I don't think that POSIX allows to assume that. The fact that
> pthread_cond_broadcast has returned doesn't mean that other threads have
> finished with pthread_cond_wait.
>
>
POSIX seems a little unclear on that, but C++ seems to require it.

POSIX:

"It shall be safe to destroy an initialized condition variable upon which
no threads are currently blocked." [1]

and

"The *pthread_cond_broadcast*() function shall unblock all threads
currently blocked on the specified condition variable *cond*." [2]

A little further down on [1], in an "informative" section, is the following
snippet of code:

(A) pthread_cond_broadcast(&ep->notbusy);
    pthread_mutex_unlock(&lp->lm);
(B) pthread_cond_destroy(&rp->notbusy);

...accompanied by the following comment:

"In this example, the condition variable and its list element may be freed
(line B) immediately after all threads waiting for it are awakened (line
A)..."

which is what makes sense to me.  Of course, our pthread_cond_destroy()
does nothing, but assuming that the condvar is now freed in a step (C),
this code won't work reliably in our current implementation.

I found a discussion of this on stackexchange [3], where the top answer
observes that 'The standard only says "shall unblock" and not "on
successful return from this call they will be unblocked".'

The C++ standard, however, is more explicit, in section 30.5.1:

"Only the notification to unblock the wait must happen before destruction".

cppreference.com [4] says:

"It is only safe to invoke the destructor if all threads have been
notified. It is not required that they have exited their respective wait
functions: some threads may still be waiting to reacquire the associated
lock, or may be waiting to be scheduled to run after reacquiring it."

That's the behavior I was counting on.  I'm using C++, and have found that
I can't just notify_all() a condition variable, then destroy it.

And, yes, I've got everything locked so that another thread can't jump in
there with a wait() between those two events.

On Tue, Dec 19, 2017 at 4:17 AM, Richard Braun <rbr...@sceen.net> wrote:

Besides, the threads should also all go through reacquiring the associated
> mutex, usually sitting right next to the condition variable, and usually
> both embedded in a parent object. What you're normally really interested
> in is releasing this parent object, including destroying the mutex, which
> means you also have to wait for all threads to unlock it. One common way
> to deal with this is reference counters on the parent object.

In my case, I've got a lot more condition variables than mutexes, because I
don't want threads waking up for events other than the one they're waiting
for.  One mutex for the entire pager, and a condition variable for every
outstanding lock and write.  So, for example, if a thread is waiting for a
particular write to finish, it waits on that write's condition variable,
which gets signaled and destroyed when the write finishes, while the pager
object and the mutex continue to exist.

I was thinking about wrapping such a counter as you suggest into the
condition variable structure, to ensure that the POSIX informative behavior
and the C++ behavior work reliably.

Increment the counter (atomically) when we're about to block on a condition
variable, and decrement it when we're done the post-block processing.  Then
pthread_cond_destroy() can first check to make sure that the wait queue is
empty (return EBUSY if not), then spin wait for the counter to be zero
otherwise.  I think that should ensure that we can free() the condition
variable after pthread_cond_destroy() has returned, and that, in turn, will
ensure that the C++ destructor works reliably, too.

    agape
    brent

[1]
http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_cond_destroy.html
[2]
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_cond_broadcast.html
[3]
https://stackoverflow.com/questions/7598457/when-can-a-cond-var-be-used-to-synchronize-its-own-destruction-unmapping
[4]
http://en.cppreference.com/w/cpp/thread/condition_variable/~condition_variable

Re: race condition destroying condition variables

Reply via email to