new_thread

Andreas Fink Sat, 31 Mar 2007 12:45:48 -0800

Hi list,

I've spotted an issue in gwlib, when using it in our own application, which puzzles me.


The start is this error message:

2007-03-31 14:48:14 [510] [46] INFO: SMPP: Accepted connection from: xx.xx.xx.xx
2007-03-31 14:48:14 [510] [-1] PANIC: /Users/afink/development/gwlib/src/gwlib/thread.c:142: mutex_lock_real: Managed to lock the mutex twice! (Called from /Users/afink/development/gwlib/src/gwlib/list.c:334:gwlist_lock.)


So a gwlist_lock is screwed up. Sounds simple to fix, but it's not.

mutex_lock does this:

    ret = pthread_mutex_lock(&mutex->mutex);
    if (ret != 0)
        panic(0, "%s:%ld: %s: Mutex failure! (Called from %s:%ld:%s.)",
              __FILE__, (long) __LINE__, __func__, file, (long) line, func);
    if (mutex->owner == gwthread_self())
        panic(0, "%s:%ld: %s: Managed to lock the mutex twice! (Called from %s:%ld:%s.)",
              __FILE__, (long) __LINE__, __func__, file, (long) line, func);

If pthread_mutex_lock fails (returns non-zero), the first panic is called. That message has not appeared in the log, so we can assume the mutex was locked successfully and this is not the issue.

Then it checks if the owner is ourself (to disallow double locking if you are the owner).

If the owner is us, then we panic. This is what's happening here.

The panic shows a thread id of -1. This means we are in thread -1, which is invalid. If the mutex is not owned by anyone, mutex->owner is set to -1. That's the clash we see. So the error comes from the fact that gwthread_self returns -1:


/* Return the thread id of this thread. */
long gwthread_self(void)
{
    struct threadinfo *threadinfo;
        
    threadinfo = pthread_getspecific(tsd_key);
    if (threadinfo)
        return threadinfo->number;
    else
        return -1;
}
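
A minimal sketch of the clash described above, not the gwlib code itself. The struct and function names are hypothetical, and the only assumption taken from the post is that an unowned mutex carries owner == -1, the same value gwthread_self() returns for an unregistered thread:

```c
#include <assert.h>
#include <stdio.h>

/* Assumption from the post: an unowned mutex has owner == -1. */
#define SKETCH_NO_OWNER (-1L)

/* Hypothetical stand-in for the owner field of gwlib's Mutex. */
struct owner_sketch {
    long owner;
};

/* Same comparison as in mutex_lock: nonzero means "would panic". */
static int would_panic(const struct owner_sketch *m, long self_id)
{
    return m->owner == self_id;
}

/* Returns the number of false "locked twice" reports. */
static int count_false_positives(void)
{
    struct owner_sketch m = { SKETCH_NO_OWNER };
    int false_positives = 0;

    /* A registered thread (id 46, as in the log) passes the check. */
    assert(!would_panic(&m, 46L));

    /* An unregistered thread sees -1 as its own id, which equals the
     * "nobody owns this" value, so the check fires although the mutex
     * was never locked twice. */
    if (would_panic(&m, -1L)) {
        false_positives++;
        puts("false positive: self id -1 matches unowned mutex");
    }
    return false_positives;
}
```

Calling count_false_positives() from a main() reports exactly one false positive, the unregistered-thread case.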

It returns -1 when pthread_getspecific(tsd_key) returns NULL. This occurs when the thread-specific value has not been set yet, i.e. the thread has not completed its startup yet. The value is set in new_thread() only. So the panic must happen before the thread has completed its startup.
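
The NULL case can be reproduced in isolation. The following is a sketch under assumptions: sketch_key stands in for tsd_key, struct sketch_info for struct threadinfo, and the id 7 is arbitrary. It only shows that a freshly created thread sees NULL from pthread_getspecific until it calls pthread_setspecific itself:

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t sketch_key;                 /* stand-in for tsd_key */
static pthread_once_t sketch_once = PTHREAD_ONCE_INIT;

struct sketch_info { long number; };             /* stand-in for threadinfo */
struct probe { long before; long after; };

static void make_key(void)
{
    pthread_key_create(&sketch_key, NULL);
}

/* Same logic as gwthread_self(), against the sketch key. */
static long sketch_self(void)
{
    struct sketch_info *info = pthread_getspecific(sketch_key);
    return info ? info->number : -1;
}

static void *probe_thread(void *arg)
{
    struct probe *p = arg;
    struct sketch_info info = { 7 };             /* arbitrary id */

    p->before = sketch_self();                   /* nothing set yet */
    pthread_setspecific(sketch_key, &info);
    p->after = sketch_self();                    /* now the real id */
    return NULL;
}

/* Runs one probe thread to completion; returns 0 on success. */
static int run_probe(struct probe *out)
{
    pthread_t t;

    pthread_once(&sketch_once, make_key);
    if (pthread_create(&t, NULL, probe_thread, out) != 0)
        return -1;
    return pthread_join(t, NULL);
}
```

After run_probe(&p) returns 0, p.before holds -1 and p.after holds 7, so any gwlib call that reaches mutex_lock in the "before" window would run as thread -1.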

I suspect a race condition in static void *new_thread(void *arg) in gwthread_pthread.c. Note this is a multi-CPU situation.


This is what is in my code triggering this:

    info(0, "SMPP: Accepted connection from: %s", octstr_get_cstr(pc->remote_host));
    gwthread_create(SMPP_Handler_Thread, (void *) pc);


gwthread_create calls spawn_thread to do the job.

spawn_thread should say "Started thread %ld (%s)" or "Failed to start thread (%s)" at its end. But we don't see this! The error happens before those outputs, so we must have a race condition here in spawn_thread / new_thread. The error must originate in new_thread, as otherwise the gwthread_self call would return the id of the calling thread.

spawn_thread allocates a memory structure to pass the parameters to the new thread. Then it calls pthread_create.


Let's look at new_thread:

static void *new_thread(void *arg)
{
    int ret;
    struct new_thread_args *p = arg;

    /* Make sure we don't start until our parent has entered
     * our thread info in the thread table. */
    lock();

Well, here (before we actually call pthread_setspecific() for tsd_key) we lock the global thread list to sync with the main thread. (Note that the inline function lock() calls pthread_mutex_lock, not mutex_lock.) So new_thread waits until spawn_thread has filled in all fields, then calls unlock() and pthread_setspecific() right after that.
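
The handshake just described can be sketched as follows. This is a simplified model built only from the description in this post, not the code in gwthread_pthread.c: all names are hypothetical, a single slot replaces the thread table, and the parent joins the child (which spawn_thread does not do) so the result can be inspected:

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_key_t slot_key;                   /* stand-in for tsd_key */
static pthread_once_t slot_once = PTHREAD_ONCE_INIT;

struct slot { long number; int filled; };        /* one thread-table entry */
struct start_args { struct slot *slot; int saw_filled; long self_after; };

static void make_slot_key(void)
{
    pthread_key_create(&slot_key, NULL);
}

static void *child_sketch(void *arg)
{
    struct start_args *a = arg;
    struct slot *mine;

    /* Blocks until the parent has filled the slot and unlocked. */
    pthread_mutex_lock(&table_lock);
    pthread_mutex_unlock(&table_lock);

    a->saw_filled = a->slot->filled;
    pthread_setspecific(slot_key, a->slot);      /* id is valid from here on */
    mine = pthread_getspecific(slot_key);
    a->self_after = mine->number;
    return NULL;
}

/* Parent side: create the thread while holding the lock, fill the
 * slot, then release the child. Returns 0 on success. */
static int spawn_sketch(struct start_args *a, long number)
{
    pthread_t t;

    pthread_once(&slot_once, make_slot_key);
    pthread_mutex_lock(&table_lock);
    if (pthread_create(&t, NULL, child_sketch, a) != 0) {
        pthread_mutex_unlock(&table_lock);
        return -1;
    }
    a->slot->number = number;
    a->slot->filled = 1;
    pthread_mutex_unlock(&table_lock);
    return pthread_join(t, NULL);
}
```

In this model the child always sees a filled slot once it gets past the lock, which matches the reasoning that the handshake by itself should not let a half-started thread run.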


So basically there is no way of getting into this situation...


Can anyone see how this scenario could occur?
It's rare but severe. It happens on a dual-CPU machine under Mac OS X.

Andreas Fink

Fink Consulting GmbH
Global Networks Schweiz AG
BebbiCell AG

---------------------------------------------------------------
Tel: +41-61-6666330 Fax: +41-61-6666331  Mobile: +41-79-2457333
Address: Clarastrasse 3, 4058 Basel, Switzerland
E-Mail:  [EMAIL PROTECTED]
www.finkconsulting.com www.global-networks.ch www.bebbicell.ch
---------------------------------------------------------------
ICQ: 8239353 MSN: [EMAIL PROTECTED] AIM: smsrelay Skype: andreasfink
Yahoo: finkconsulting SMS: +41792457333

race issue in gwthread_create / spawn_thread / new_thread
