Re: [PATCH] Thread Locks and SMP boxes

William A. Rowe, Jr. 30 Jul 2003 18:34:43 -0000

Explaining the failure pattern:

Once a second thread was blocked on the mutex and the original
owner finally released it at the old pthread_mutex_unlock
statement, one of two failure conditions occurred...


1. The original owner implicitly yielded its timeslice to the new
   acquirer of the mutex.  That second thread, having obtained the
   pthread_mutex_lock, would set its ownership and initialize
   the refcount to one.

   When the original thread regained its timeslice, it would UNSET
   the new thread's ownership and refcount so the mutex appeared
   unowned.

   When the new thread attempted a nested thread lock, it wouldn't
   recognize the mutex owner, so it would deadlock.

2. On a massively parallel (SMP) box, the original thread releasing
   the mutex would not yield.  The original and new threads would
   both race to unset and set the ownership, respectively.  This created
   a somewhat different race pattern.

Note the use of memset(&mutex, 0, sizeof mutex) further skewed the
behavior by using a very expensive call to unset what is usually a simple
pointer or int.
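
To make the old ordering concrete, here is a minimal sketch.  It is
not the actual source; nested_mutex_t, nested_init, nested_lock and
nested_unlock_old are hypothetical names, and the layout (a pthread
mutex plus an owner and a refcount) is only assumed from the
description above.  The point is the window between the unlock and
the clearing of the ownership fields.

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical nested mutex; names and layout are illustrative only. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_t owner;   /* meaningful only while owner_ref > 0 */
    int owner_ref;     /* nesting depth held by the owner */
} nested_mutex_t;

int nested_init(nested_mutex_t *m)
{
    memset(&m->owner, 0, sizeof m->owner);
    m->owner_ref = 0;
    return pthread_mutex_init(&m->mutex, NULL);
}

int nested_lock(nested_mutex_t *m)
{
    /* The owner check is made without holding the mutex, which is
     * why a stale or wiped owner field matters so much. */
    if (m->owner_ref > 0 && pthread_equal(m->owner, pthread_self())) {
        m->owner_ref++;
        return 0;
    }
    int rv = pthread_mutex_lock(&m->mutex);
    if (rv != 0)
        return rv;
    m->owner = pthread_self();
    m->owner_ref = 1;
    return 0;
}

/* Old ordering: release first, clear the ownership afterwards. */
int nested_unlock_old(nested_mutex_t *m)
{
    if (m->owner_ref > 1) {
        m->owner_ref--;
        return 0;
    }
    int rv = pthread_mutex_unlock(&m->mutex);
    /* Window: a waiting thread may already hold the mutex here and
     * have set owner/owner_ref.  The two stores below then wipe the
     * NEW owner's state, so its next nested lock blocks forever. */
    memset(&m->owner, 0, sizeof m->owner);
    m->owner_ref = 0;
    return rv;
}
```

With a single thread this behaves correctly, which is part of why
the problem only shows up under contention.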

Because the new patch protects the uninitialization of the mutex while
the lock is still held, the only failure scenario that remains is:

1. The thread is interrupted (e.g. by a signal handler) between the unsetting
   of the ownership (and decrement of the refcount) and actually releasing
   the mutex.  The interrupt handler attempts to perform a nested lock
   and deadlocks because the ownership has already been reset, but
   the lock is not yet released.
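
The remaining window can be sketched as follows.  Again this is an
illustration under assumptions, not the patch itself: the struct
and every function name are hypothetical.  The on_window callback
merely stands in for an interrupt arriving at the critical point (a
real signal handler should not call pthread mutex functions at all,
since they are not async-signal-safe), and the handler uses a
trylock so the demo reports EBUSY where a blocking lock would hang.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical nested mutex; names and layout are illustrative only. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_t owner;   /* meaningful only while owner_ref > 0 */
    int owner_ref;     /* nesting depth held by the owner */
} nested_mutex_t;

int nested_init(nested_mutex_t *m)
{
    m->owner_ref = 0;
    return pthread_mutex_init(&m->mutex, NULL);
}

/* Non-blocking nested lock: returns EBUSY instead of deadlocking. */
int nested_trylock(nested_mutex_t *m)
{
    if (m->owner_ref > 0 && pthread_equal(m->owner, pthread_self())) {
        m->owner_ref++;
        return 0;
    }
    int rv = pthread_mutex_trylock(&m->mutex);
    if (rv != 0)
        return rv;
    m->owner = pthread_self();
    m->owner_ref = 1;
    return 0;
}

/* Patched ordering: clear the ownership while still holding the
 * lock, then release.  on_window simulates an interrupt landing
 * between the two steps. */
int nested_unlock_patched(nested_mutex_t *m,
                          void (*on_window)(nested_mutex_t *))
{
    if (m->owner_ref > 1) {
        m->owner_ref--;
        return 0;
    }
    m->owner_ref = 0;            /* ownership gone, lock still held */
    if (on_window != NULL)
        on_window(m);            /* stand-in for the interrupt */
    return pthread_mutex_unlock(&m->mutex);
}

int handler_result = -1;

/* Stand-in handler: attempts a nested lock inside the window. */
void fake_handler(nested_mutex_t *m)
{
    handler_result = nested_trylock(m);
}
```

Calling nested_unlock_patched(&m, fake_handler) on a mutex held once
leaves handler_result set to EBUSY on a default (non-recursive)
mutex: the ownership is already cleared, so the nested path is
skipped, yet the underlying lock is still held by the same thread.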

This one remaining failure case is far less likely than the host of
issues currently possible.  I don't see a simple workaround to avoid
this last failure case.

Bill
