Brian Pane wrote: > Thanks for catching that. In this particular design, I think > we'll be safe because only the current listener thread is ever > allowed to pop from the stack.
yeah, that occurred to me driving home yesterday. If we can guarantee there is only one popper at a time, that problem can't happen. But if we want a more general apr_atomic_pop which is safe to use with multiple simultaneous pops, we'll have to fix it. > But the workaround that you > describe below will help solve another problem in the design: > I like this technique of CAS'ing two variables at once. In the > leader/follower algorithm that I originally posted, there is a > race condition that can occur when all the workers are busy and > one of them finishes handling its connection at about the same > time that the current listener accepts a new connection: > > worker becoming idle: listener: > atomically checks current_listener; waiting for connection > finds that there's already a | > listener and decides to push | > self onto idle_workers stack | > [context switch happens here] v > | [connection arrives] > | | > | v > | atomically checks the stack; > | but the stack is empty, so > | it sets > current_listener=NULL > | | > | v > | [processing the request] > v > [thread wakes up here] > | > v > atomically pushes self onto stack > > In this scenario, we end up with no listener, but there's > an idle thread in the stack that ought to have become the > new listener. The problem is that the current_listener > update and the stack push need to happen atomically. > > We could solve this problem by combining the current_listener > pointer and the top-of-stack pointer into a single 32-bit > int so that threads can do a CAS on both at once. Both of > these pointers can be replaced by thread IDs, so we can > pack them both into a 32-bit value as long as there are > fewer than 2^16 worker threads. That should work. I've also seen this type of problem solved with the following: * Check the other guy's stack * If empty, add myself to my stack * Check the other guy's stack again * If it's no longer empty, back out and start over. Both the producer and consumer have to do this. In this case, think of current_listener as being sort of a one deep stack. I hope we will be able to do 64-bit CAS on most 32-bit platforms. That would simplify things greatly. Greg