On Apr 18, 2005, at 11:30 PM, Jeffrey Hutzelman wrote:
On Monday, April 18, 2005 10:04:45 PM +0200 Horst Birthelmer <[EMAIL PROTECTED]> wrote:
That's one passage I didn't post in my last postings, which actually started the fire... ;-) I still don't see the confusion. It's sort of what I said in the first place. You still can hold the mutex but miss the broadcast and wait forever there ...
Well, one bit of confusion is that people keep talking about how it doesn't work if pthread_cond_wait is not atomic. That's not a problem, because pthread_cond_wait is NEVER not atomic. It is ALWAYS atomic.
Well, I just adopted that idea to show that not even that would be a race condition, and somehow it happens every time: I get held responsible for stuff I didn't mean to say or do ;-)
OK, I reread my postings; maybe I wasn't clear enough in a few places, but I wouldn't call that being confused :-)
That's one point. The other is that you can be in the critical section with one thread while broadcasting to the others, which, as I have pointed out I have no idea how many times now, is _not_ a race condition.
Sure you can, but never in a situation where it matters.
Suppose again that thread A is the broadcasting thread, and thread B is the waiter thread that we are interested in.
Now, in the example under discussion, thread A looks like this:
    {
        ...
        acquire mutex
        update queue
        release mutex
        cond_broadcast
        ...
    }
And thread B looks like this:
    acquire mutex
    while (1) {
        while (queue is not empty) {
            pop work from queue
            release mutex
            do work
            acquire mutex
        }
        cond_wait
    }
Note that thread B must release the mutex to do work, but calls cond_wait only if it has observed the queue to be empty since the mutex was last acquired. So, I see about three possible cases:
Case I - Everything happens in the expected order:
    Thread A                Thread B
                            acquire mutex
                            queue is empty
                            cond_wait -> SLEEP (with release)
    acquire mutex
    add item N
    release mutex
    cond_broadcast
                            WAKEUP (with acquire)
                            queue is not empty
                            pop item N from queue
                            release mutex
                            process item N
                            acquire mutex
                            queue is empty
                            cond_wait -> SLEEP (with release)
Case II - Not really a deadlock
    Thread A                Thread B
    acquire mutex
    add item N
    release mutex
                            acquire mutex
                            queue is not empty
                            pop item N from queue
                            release mutex
                            process item N
                            acquire mutex
                            queue is empty
    cond_broadcast
      NO EFFECT
                            cond_wait -> SLEEP (with release)
Case III - Also OK
    Thread A                Thread B
    acquire mutex
    add item N
    release mutex
                            acquire mutex
                            queue is not empty
                            pop item N from queue
                            release mutex
                            process item N
                            acquire mutex
                            queue is empty
                            cond_wait -> SLEEP (with release)
    cond_broadcast
                            WAKEUP (with acquire)
                            queue is empty
                            cond_wait -> SLEEP (with release)
Note that it is possible (as in case II) for the cond_broadcast to have no effect on thread B, because it is not in cond_wait yet. But it is not possible for this to result in item N not being processed, because thread B will never call cond_wait unless it has observed an empty queue since last acquiring the mutex.
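For anyone who wants to try the pattern rather than read pseudocode, here is a minimal C sketch of the two threads described above. It is not the volserver code. The names (struct work_queue, enqueue, waiter, run_demo), the fixed QUEUE_CAP, and the "done" flag are all assumptions made up for this illustration. The "done" flag in particular is not in the loop under discussion, which runs forever; it is only there so the demo can terminate.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 1024   /* hypothetical fixed capacity for this sketch */

/* Hypothetical work queue; none of these names come from the real code. */
struct work_queue {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int items[QUEUE_CAP];
    int count;       /* items currently queued */
    int done;        /* assumption: lets the demo stop; not in the original loop */
    int processed;   /* items thread B has finished */
};

/* Thread A: update the queue under the mutex, broadcast after releasing it. */
static void enqueue(struct work_queue *q, int item)
{
    pthread_mutex_lock(&q->mutex);
    assert(q->count < QUEUE_CAP);
    q->items[q->count++] = item;
    pthread_mutex_unlock(&q->mutex);
    pthread_cond_broadcast(&q->cond);   /* may find nobody waiting (case II) */
}

/* Thread B: drain the queue, and call cond_wait only after observing an
 * empty queue while holding the mutex. */
static void *waiter(void *arg)
{
    struct work_queue *q = arg;

    pthread_mutex_lock(&q->mutex);
    for (;;) {
        while (q->count > 0) {
            int item = q->items[--q->count];
            pthread_mutex_unlock(&q->mutex);
            (void)item;                     /* "do work" would go here */
            pthread_mutex_lock(&q->mutex);
            q->processed++;
        }
        if (q->done)
            break;
        /* Releases the mutex and sleeps atomically; reacquires on wakeup. */
        pthread_cond_wait(&q->cond, &q->mutex);
    }
    pthread_mutex_unlock(&q->mutex);
    return NULL;
}

/* Queue n_items (at most QUEUE_CAP), then return how many were processed. */
static int run_demo(int n_items)
{
    struct work_queue q = { .count = 0, .done = 0, .processed = 0 };
    pthread_t tid;

    pthread_mutex_init(&q.mutex, NULL);
    pthread_cond_init(&q.cond, NULL);
    pthread_create(&tid, NULL, waiter, &q);

    for (int i = 0; i < n_items; i++)
        enqueue(&q, i);

    /* Tell thread B that no more work is coming, using the same protocol. */
    pthread_mutex_lock(&q.mutex);
    q.done = 1;
    pthread_mutex_unlock(&q.mutex);
    pthread_cond_broadcast(&q.cond);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&q.cond);
    pthread_mutex_destroy(&q.mutex);
    return q.processed;
}
```

Calling run_demo(1000) should return 1000 regardless of how the two threads interleave, since a broadcast that finds thread B outside cond_wait is harmless: B rechecks the queue under the mutex before it ever sleeps.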
There still is this theoretical possibility where the thread will be waiting forever on the cv, but let's put that aside.
Not while there's work in the queue; see above.
So you agree with my initial posting that we didn't really have a problem here and that this is not the cause for those volserver hangs.
Horst
_______________________________________________ OpenAFS-devel mailing list [email protected] https://lists.openafs.org/mailman/listinfo/openafs-devel
