Re: [OpenAFS-devel] volserver hangs possible fix

Jeffrey Hutzelman Tue, 12 Apr 2005 09:05:10 -0700

On Tuesday, April 12, 2005 11:11:29 AM -0400 Ted Anderson <[EMAIL PROTECTED]> wrote:

Just to make sure I understand this problem, let me try providing a
specific scenario:
     thread A        thread B
     --------        --------
  1:                 lock M;
  2:                 check queue;
  3:                 cond_wait(C,M);     // queue is empty
  4:                  -> kernel_scheduler {
  5:                         unlock M;
  6: lock M;                 ...
  7: add item;               ...
  8: unlock M;               ...
  9: broadcast on C;         ...         // noop, nobody blocked on C
10:                         put B on C;
11:                         desched B;
12:          DEADLOCK

Is this a problem that can occur with a corrently operating
implementation of POSIX threads, i.e. an example of correct but
non-deterministic scheduling?  I've expanded the operation of cond_wait
as I understand it.  I observe that M correctly protects the queue
itself.  Treating "broadcast C" and "put B on C" as black boxes, however,
it doesn't seem to help to move broadcast(C) inside M, i.e. by swapping 8
& 9.  On the other hand, tt would seem a simple matter to reverse the
sequence of 5 & 10, but perhaps that is inconvenient for some
multi-processor schedulers.

As I understand your comment, you are saying that the broadcast must be
protected by the mutex or race conditions are possible.  What am I
missing here?

What you're missing here is that cond_wait() is atomic. The lines you list as 5 and 10 happen at the same time, so it's not possible for thread A to acquire the lock before thread B is on the wait-list for the CV. This is why it is necessary to hold the lock through the broadcast in order to prevent a race -- otherwise, the broadcast could indeed occur between lines 2 and 10, and thread B would miss the wakeup.

-- Jeffrey T. Hutzelman (N3NHS) <[EMAIL PROTECTED]>
  Sr. Research Systems Programmer
  School of Computer Science - Research Computing Facility
  Carnegie Mellon University - Pittsburgh, PA

_______________________________________________
OpenAFS-devel mailing list
[email protected]
https://lists.openafs.org/mailman/listinfo/openafs-devel

Re: [OpenAFS-devel] volserver hangs possible fix

Reply via email to