We found a severe bug in gwlib.

We have the following scenario:

A  calls debug("xxx",0,"xxxx") which does :

                gwlist_add_producer(writers);

        and continues but doesnt reach yet this line:

                gwlist_remove_producer(writers);


at this point the list "writers" is empty but has writers- >num_producers=1

B does:
            lock(list); /* atomic lock */

        list->single_operation_lock->owner = -1;
pthread_cond_wait(&list->nonempty, &list- >single_operation_lock->mutex);

so it waits until A is calling gwlist_remove_producer()

and wait until A completes.


Now A is calling this:

void gwlist_remove_producer(List *list)
{
    lock(list);
    gw_assert(list->num_producers > 0);
    --list->num_producers;
    pthread_cond_broadcast(&list->nonempty);
    unlock(list);
}

and gets locked up because the list's atomic lock is locked by B.


C now has a new debug message and gets stopped at  gwlist_produce().


In other words, every process who wants to write to debug log gets stuck.

Now there is different solutions to this.
Our approach would be to do in gwlist_consume() to do this:


                unlock(list);
pthread_cond_wait(&list->nonempty, &list- >single_operation_lock->mutex);
        lock(list);


Any other ideas?
maybe no atomic lock around gwlist_remove_producer() ?


Andreas Fink

Fink Consulting GmbH
Global Networks Schweiz AG
BebbiCell AG

---------------------------------------------------------------
Tel: +41-61-6666330 Fax: +41-61-6666331  Mobile: +41-79-2457333
Address: Clarastrasse 3, 4058 Basel, Switzerland
E-Mail:  [EMAIL PROTECTED]
www.finkconsulting.com www.global-networks.ch www.bebbicell.ch
---------------------------------------------------------------
ICQ: 8239353 MSN: [EMAIL PROTECTED] AIM: smsrelay Skype: andreasfink
Yahoo: finkconsulting SMS: +41792457333





Reply via email to