The "proper" (per my experience) way to use a condition variable in the pthreads world is a squence like:
Waiter Notifier
---------- -------------
mutex_lock
cond_wait
(pthreads atomically surrenders lock and waits)
mutex_lock
cond_signal
mutex_unlock
(pthread atomically requires the mutex and wakes up)
mutex_unlock
But if the waiter code is for instance in a library and is called with the nested mutex already locked the lock/cond_wait/unlock cycle always clears the mutex. So a cycle like:
Waiter Notifier
---------- -------------
mutex_lock
mutex_lock
cond_wait
(pthreads atomically surrenders lock and waits)
mutex_lock
cond_signal
mutex_unlock
(pthread atomically requires the mutex and wakes up)
mutex_unlock
mutex_unlock
is broken - Details follow - what happens is the mutex_lock/unlock cycle in the notifier trashes the APR portions of the mutex.
APR_DECLARE(apr_status_t) apr_thread_cond_wait(apr_thread_cond_t *cond,The owner and owner_ref fields of the APR mutex structure are left alone. The caller does a mutex_lock which works and destroys the owner/owner_ref fields:
apr_thread_mutex_t *mutex)
{
apr_status_t rv;
rv = pthread_cond_wait(cond->cond, &mutex->mutex);
#ifdef PTHREAD_SETS_ERRNO
if (rv) {
rv = errno;
}
#endif
}
APR_DECLARE(apr_status_t) apr_thread_mutex_lock(apr_thread_mutex_t *mutex)When the notifier than calls mutex_unlock() the pthread mutex is released and the mutex owner/owner_ref fields zeroed out.
{
apr_status_t rv;
if (mutex->nested) {
---->This doesn't hit because owner is the other waiting thread:
if (apr_os_thread_equal(mutex->owner, apr_os_thread_current())) {
mutex->owner_ref++;
return APR_SUCCESS;
}
--->So we succeed in "stealing" the mutex
rv = pthread_mutex_lock(&mutex->mutex);
if (rv) {
#ifdef PTHREAD_SETS_ERRNO
rv = errno;
#endif
return rv;
}
mutex->owner = apr_os_thread_current();
mutex->owner_ref = 1;
return rv;
}
else {
rv = pthread_mutex_lock(&mutex->mutex);
#ifdef PTHREAD_SETS_ERRNO
if (rv) {
rv = errno;
}
#endif
return rv;
}
}
So now the waiter thread gets the pthread mutex inside pthread_cond_wait() and wakes up. The call returns to the caller. The caller then trys to release the mutex - this call succeeds because we always release the mutex. But the mutex is now fully released - so if the caller is counting on holding the mutex with nesting he's hosed.
APR_DECLARE(apr_status_t) apr_thread_mutex_unlock(apr_thread_mutex_t *mutex)The fix seems to be to save and restore the owner/owner_ref fields around the pthread_cond_wait() call. The same thing applies to the timed version of the wait but I used the shorter version here for clarity. P.S.. - I'm not subscribed to the dev list so please CC me on replies, thanks.
{
apr_status_t status;
if (mutex->nested) {
---->This fails so it just releases the pthread_mutex (incorrectly if my original nest count was > 1)
if (apr_os_thread_equal(mutex->owner, apr_os_thread_current())) {
mutex->owner_ref--;
if (mutex->owner_ref > 0)
return APR_SUCCESS;
}
status = pthread_mutex_unlock(&mutex->mutex);
if (status) {
#ifdef PTHREAD_SETS_ERRNO
status = errno;
#endif
return status;
}
memset(&mutex->owner, 0, sizeof mutex->owner);
mutex->owner_ref = 0;
return status;
}
else {
status = pthread_mutex_unlock(&mutex->mutex);
#ifdef PTHREAD_SETS_ERRNO
if (status) {
status = errno;
}
#endif
return status;
}
}
--
Mark Gooderum
Vernier Networks, Inc.
[EMAIL PROTECTED]