Madhavan Venkataraman wrote:
> Team,
>
> I found a problem in the condition variable code that has created
> a deadlock situation with the new implementation of tickless
> callouts.
>
> As you all know already, condition variables are used in
> conjunction with a mutex. When a timed wait needs to
> be done, the sequence should be:
>
>     (mutex is held at this point)
>
>     create a timeout for the wait
>     insert the current thread in a sleep queue
>     release the mutex
>
>     switch()
>
>     The thread comes out of the switch when
>         - the handler for the timeout fires OR
>         - someone signals and wakes up the thread
>
>     untimeout(timeout id)
>     Reacquire mutex
>
> In two places, the code correctly does the untimeout()
> before acquiring the mutex. In one place, the
> order is reversed.
>
> The stock kernel does not have a problem as the timeout
> handler does not acquire any mutex. It only does setrun().
> But the stock kernel has another race condition where the
> timeout can fire even before the thread can be inserted into
> the sleepq. The thread would never be woken up.
>
> To fix the race, I made the timeout handler acquire the mutex
> and release it. But this creates a deadlock with untimeout()
> when the untimeout() races with the timeout handler and waits
> for the handler to finish.
>
> I have fixed the order in the condition variable code. I will
> let you know the results of the tests I am going to run
> to test this.
>
> I am letting you know so you don't spend time discovering this
> problem during code review.
>
> Madhavan
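For anyone following along, the ordering described above can be modelled in user space. What follows is only a rough pthreads sketch, not the kernel code: every name in it (cv_model, model_timeout_handler, model_untimeout, model_timedwait, model_run) is made up for illustration, a joined thread stands in for the timeout handler, and pthread_join() stands in for untimeout() waiting on a handler that is already running. The sleep queue insertion and switch() are left out.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct cv_model {
    pthread_mutex_t mtx;      /* mutex paired with the condition variable */
    pthread_t       handler;  /* stands in for the pending timeout */
    int             fired;    /* set by the handler while holding mtx */
};

/* Timeout handler as described: take the mutex, wake the waiter, drop it. */
static void *model_timeout_handler(void *arg)
{
    struct cv_model *m = arg;

    pthread_mutex_lock(&m->mtx);
    m->fired = 1;             /* stands in for setrun() */
    pthread_mutex_unlock(&m->mtx);
    return NULL;
}

/* Models untimeout() racing with a running handler: wait for it to finish. */
static void model_untimeout(struct cv_model *m)
{
    int rc = pthread_join(m->handler, NULL);
    assert(rc == 0);
}

/* Timed wait in the fixed order. Entered and left with m->mtx held. */
static void model_timedwait(struct cv_model *m)
{
    int rc;

    /* create a timeout for the wait; the handler blocks on mtx for now */
    rc = pthread_create(&m->handler, NULL, model_timeout_handler, m);
    assert(rc == 0);

    /* (sleep queue insertion omitted) release the mutex */
    pthread_mutex_unlock(&m->mtx);

    /* (switch() omitted) the thread resumes here */

    /* untimeout first: mtx is not held, so the handler can run to completion */
    model_untimeout(m);

    /* only then reacquire the mutex; swapping these two steps would hang,
     * because the handler would wait for mtx while we wait for the handler */
    pthread_mutex_lock(&m->mtx);
}

/* Runs one timed wait and returns 1 if the handler fired. */
static int model_run(void)
{
    struct cv_model m;
    int fired;

    pthread_mutex_init(&m.mtx, NULL);
    m.fired = 0;

    pthread_mutex_lock(&m.mtx);   /* mutex is held at this point */
    model_timedwait(&m);
    fired = m.fired;
    pthread_mutex_unlock(&m.mtx);

    pthread_mutex_destroy(&m.mtx);
    return fired;
}
```

Calling model_run() completes and reports that the handler ran. Note that in this model the handler always fires, whereas a real untimeout() can also cancel a timeout that has not started yet; the sketch only covers the racing case from the message above. Build with -pthread where the toolchain requires it.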

Please let us know when you update the BFUs (and which build you create the archives from).

Rafael
