We found a severe bug in gwlib.
We have the following scenario:
A calls debug("xxx", 0, "xxxx"), which does:
gwlist_add_producer(writers);
and continues, but has not yet reached this line:
gwlist_remove_producer(writers);
At this point the list "writers" is empty but has writers->num_producers=1.
B does:
lock(list); /* atomic lock */
list->single_operation_lock->owner = -1;
pthread_cond_wait(&list->nonempty, &list->single_operation_lock->mutex);
so it waits until A calls gwlist_remove_producer() and completes it.
Now A calls this:
void gwlist_remove_producer(List *list)
{
    lock(list);
    gw_assert(list->num_producers > 0);
    --list->num_producers;
    pthread_cond_broadcast(&list->nonempty);
    unlock(list);
}
and blocks, because the list's atomic lock is held by B.
C now has a new debug message and gets stopped at gwlist_produce().
In other words, every process that wants to write to the debug log gets
stuck.
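
To reproduce the scenario outside Kannel, here is a minimal sketch of the same
interaction using plain pthreads. SketchList and all sketch_* names are
hypothetical stand-ins, not gwlib code, and the sketch assumes a plain pthread
mutex instead of gwlib's Mutex wrapper with its owner field. POSIX specifies
that pthread_cond_wait() releases the mutex while the caller is blocked and
re-acquires it before returning, so in this simplified form A should be able
to take the lock; whether gwlib's wrapper behaves the same way is the part
that needs checking.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for gwlib's List: only what the scenario needs. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t nonempty;
    long len;            /* number of queued items */
    long num_producers;  /* registered producers */
} SketchList;

typedef struct {
    SketchList *list;
    int got;             /* 1 if an item was consumed, 0 otherwise */
} ConsumerCtx;

static void sketch_add_producer(SketchList *l)
{
    pthread_mutex_lock(&l->mutex);
    ++l->num_producers;
    pthread_mutex_unlock(&l->mutex);
}

static void sketch_remove_producer(SketchList *l)
{
    pthread_mutex_lock(&l->mutex);
    assert(l->num_producers > 0);
    --l->num_producers;
    pthread_cond_broadcast(&l->nonempty);
    pthread_mutex_unlock(&l->mutex);
}

/* Returns 1 if an item was taken, 0 if the list is empty and no
   producers remain. */
static int sketch_consume(SketchList *l)
{
    int got = 0;

    pthread_mutex_lock(&l->mutex);
    while (l->len == 0 && l->num_producers > 0) {
        /* The mutex is released while blocked and held again on return. */
        pthread_cond_wait(&l->nonempty, &l->mutex);
    }
    if (l->len > 0) {
        --l->len;
        got = 1;
    }
    pthread_mutex_unlock(&l->mutex);
    return got;
}

static void *consumer_thread(void *arg)
{
    ConsumerCtx *ctx = arg;
    ctx->got = sketch_consume(ctx->list);
    return NULL;
}

/* Plays A (this thread) and B (consumer thread).
   Returns B's result, or -1 if the thread could not be created. */
int sketch_run_scenario(void)
{
    SketchList l;
    ConsumerCtx ctx;
    pthread_t b;

    pthread_mutex_init(&l.mutex, NULL);
    pthread_cond_init(&l.nonempty, NULL);
    l.len = 0;
    l.num_producers = 0;
    ctx.list = &l;
    ctx.got = -1;

    sketch_add_producer(&l);        /* A: registered, list still empty */
    if (pthread_create(&b, NULL, consumer_thread, &ctx) != 0)
        return -1;
    /* B may or may not already be waiting here; both orders must work. */
    sketch_remove_producer(&l);     /* A: needs the list lock */
    pthread_join(b, NULL);

    pthread_cond_destroy(&l.nonempty);
    pthread_mutex_destroy(&l.mutex);
    return ctx.got;
}
```

Calling sketch_run_scenario() from a small main() and linking with -pthread
should return 0 (B woke up and found nothing to consume). A hang at this
point would reproduce the lock-up described above.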
Now there are different solutions to this.
Our approach would be to do this in gwlist_consume():
unlock(list);
pthread_cond_wait(&list->nonempty, &list->single_operation_lock->mutex);
lock(list);
Any other ideas?
Maybe no atomic lock around gwlist_remove_producer()?
Andreas Fink
Fink Consulting GmbH
Global Networks Schweiz AG
BebbiCell AG