Is that a race condition bug?

Tianwei Wed, 03 Feb 2010 22:43:15 -0800

Hi, httpd developers:
   I am using a tool, ThreadSanitizer
(http://code.google.com/p/data-race-test/wiki/ThreadSanitizer), to
study potential race condition bugs in httpd.
I use httpd-2.2.14, configured with --with-mpm=worker. In my first
round of testing it reported many warnings, which I am checking
manually one by one. I have analyzed one of them carefully and do not
know whether it should be classified as a benign race.
Warning description:
1. I enabled USE_ATOMICS_GENERIC, where a hash lock (a mutex selected
by hashing the target address) protects apr_atomic_casptr.
2. The bug report is:
==5876== WARNING: Possible data race during write of size 8 at 0x429C578: {{{
==5876==    T2 (locks held: {L1}):
==5876==     #0  apr_atomic_casptr
/home/tianwei/apache/httpd-2.2.14/srclib/apr/atomic/unix/mutex.c:184
==5876==     #1  ap_queue_info_set_idle
/home/tianwei/apache/httpd-2.2.14/server/mpm/worker/fdqueue.c:104
==5876==     #2  worker_thread
/home/tianwei/apache/httpd-2.2.14/server/mpm/worker/worker.c:844
==5876==     #3  dummy_worker
/home/tianwei/apache/httpd-2.2.14/srclib/apr/threadproc/unix/thread.c:142
==5876==     #4  ThreadSanitizerStartThread
/home/tianwei/valgrind/drt/tsan/ts_valgrind_intercepts.c:525
==5876==   Concurrent read(s) happened at (OR AFTER) these points:
==5876==    T22 (locks held: {}):
==5876==     #0  ap_queue_info_wait_for_idler
/home/tianwei/apache/httpd-2.2.14/server/mpm/worker/fdqueue.c:209
==5876==     #1  listener_thread
/home/tianwei/apache/httpd-2.2.14/server/mpm/worker/worker.c:642
==5876==     #2  dummy_worker
/home/tianwei/apache/httpd-2.2.14/srclib/apr/threadproc/unix/thread.c:142
==5876==     #3  ThreadSanitizerStartThread
/home/tianwei/valgrind/drt/tsan/ts_valgrind_intercepts.c:525


3. Warning analysis:
    listener thread:

apr_status_t ap_queue_info_wait_for_idler(fd_queue_info_t *queue_info,
                                          apr_pool_t **recycled_pool)
{

    /* Block if the count of idle workers is zero */
    if (queue_info->idlers == 0) {
        rv = apr_thread_mutex_lock(queue_info->idlers_mutex);
        while (queue_info->idlers == 0) {
            rv = apr_thread_cond_wait(queue_info->wait_for_idler,
                                      queue_info->idlers_mutex);
        }
        rv = apr_thread_mutex_unlock(queue_info->idlers_mutex);
    }

    apr_atomic_dec32(&(queue_info->idlers));

    for (;;) {
        struct recycled_pool *first_pool = queue_info->recycled_pools;
        ....
    }
}

    worker thread:

apr_status_t ap_queue_info_set_idle(fd_queue_info_t *queue_info,
                                    apr_pool_t *pool_to_recycle)
{
    for (;;) {
        //new_recycle->next = queue_info->recycled_pools;
        struct recycled_pool *next = queue_info->recycled_pools;
        new_recycle->next = next;
        if (apr_atomic_casptr((volatile void**)&(queue_info->recycled_pools),
                              new_recycle, next) ==
            next) {
            break;
        }
    }
    /* If this thread just made the idle worker count nonzero,
     * wake up the listener. */
    if (prev_idlers == 0) {
        rv = apr_thread_mutex_lock(queue_info->idlers_mutex);
        if (rv != APR_SUCCESS) {
            return rv;
        }
        rv = apr_thread_cond_signal(queue_info->wait_for_idler);
        if (rv != APR_SUCCESS) {
            apr_thread_mutex_unlock(queue_info->idlers_mutex);
            return rv;
        }
        rv = apr_thread_mutex_unlock(queue_info->idlers_mutex);
        if (rv != APR_SUCCESS) {
            return rv;
        }
    }


Problem:
   Suppose that in the listener thread queue_info->idlers is 4. The
listener then skips the "queue_info->idlers == 0" branch in
ap_queue_info_wait_for_idler, decrements idlers atomically, and reads
"queue_info->recycled_pools" without holding any lock. Meanwhile, a
worker thread writes "queue_info->recycled_pools" inside
apr_atomic_casptr while holding the hash lock.
Because the listener thread does not execute the "lock, cond_wait,
unlock" sequence on this path, even if the worker thread signals the
condition variable, there is no happens-before relation between the
read of "queue_info->recycled_pools" and the write of
"queue_info->recycled_pools". By definition, there is a race here.
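
To make the pattern concrete, here is a minimal sketch in C with
pthreads. It is not the APR source: lock_table, lock_for,
emulated_cas_node, push and peek_unlocked are hypothetical names, the
hash is an arbitrary choice, and the CAS is typed for one struct instead
of taking a volatile void ** as apr_atomic_casptr does. It only shows
the combination the tool reports: the write happens under a hashed
mutex, while the reads of the same pointer hold no lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not the APR implementation. */
#define NUM_LOCKS 4

static pthread_mutex_t lock_table[NUM_LOCKS] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};

/* Pick a mutex by hashing the address being updated. */
static pthread_mutex_t *lock_for(void *addr)
{
    return &lock_table[((uintptr_t)addr >> 3) % NUM_LOCKS];
}

struct node {
    struct node *next;
    int value;
};

static struct node *head = NULL;    /* shared list head, plain pointer */

/* Compare-and-swap emulated with a mutex; returns the previous value. */
static struct node *emulated_cas_node(struct node **mem, struct node *with,
                                      struct node *cmp)
{
    pthread_mutex_t *m = lock_for(mem);
    struct node *prev;

    pthread_mutex_lock(m);
    prev = *mem;
    if (prev == cmp)
        *mem = with;                /* write happens with the lock held */
    pthread_mutex_unlock(m);
    return prev;
}

/* Worker side: link a node in, retrying until the CAS succeeds. */
static void push(struct node *n)
{
    assert(n != NULL);
    for (;;) {
        struct node *next = head;   /* read without any lock */
        n->next = next;
        if (emulated_cas_node(&head, n, next) == next)
            break;
    }
}

/* Listener side: reads the head without taking the hashed mutex. */
static struct node *peek_unlocked(void)
{
    return head;
}
```

If one thread calls push() while another calls peek_unlocked(), the
write holds a lock and the read holds none, so a lockset or
happens-before detector has nothing that orders the two accesses.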

Question:
   According to the analysis above, there seems to be a race. However,
I guess the developers do not treat it as a problem, or is there some
other mechanism, unknown to me, that protects this code? Can you give
me some suggestions?
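
For comparison, here is a minimal sketch of the same kind of list head
with an explicit happens-before edge, assuming a C11 compiler with
<stdatomic.h>. It is only an illustration, not a proposed patch to
fdqueue.c and not how APR implements its atomics; struct node, head,
push and peek are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical types and names, for illustration only. */
struct node {
    struct node *next;
    int value;
};

static _Atomic(struct node *) head = NULL;

/* Writer: publish a node with a release compare-and-swap. */
static void push(struct node *n)
{
    struct node *old;

    assert(n != NULL);
    old = atomic_load_explicit(&head, memory_order_relaxed);
    do {
        n->next = old;  /* on failure, old is refreshed with the current head */
    } while (!atomic_compare_exchange_weak_explicit(
                 &head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Reader: the acquire load pairs with the release CAS in push(). */
static struct node *peek(void)
{
    return atomic_load_explicit(&head, memory_order_acquire);
}
```

Here a reader that observes the new head through peek() is also
guaranteed to see the fields written to that node before the CAS, which
is the ordering that seems to be missing on the unlocked path described
above.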

Thanks.

Tianwei
--
Sheng, Tianwei
Inst. of High Performance Computing
Dept. of Computer Sci. & Tech.
Tsinghua Univ.
