Madhavan Venkataraman wrote:
> Team,
> 
> I found a problem in the condition variable code that has created
> a deadlock situation with the new implementation of tickless
> callouts.
> 
> As you all know already, condition variables are used in
> conjunction with a mutex. When a timed wait needs to
> be done, the sequence should be:
> 
>  (mutex is held at this point)
> 
> create a timeout for the wait
> insert the current thread in a sleep queue
> release the mutex
> 
> switch()
> 
> The thread comes out of the switch when
>     - the handler for the timeout fires OR
>     - someone signals and wakes up the thread
> 
> untimeout(timeout id)
> reacquire the mutex
> 
> In two places, the code correctly does the untimeout()
> before acquiring the mutex. In one place, the
> order is reversed.
> 
> The stock kernel does not hit this deadlock, because its timeout
> handler does not acquire any mutex. It only does setrun().
> But the stock kernel has another race condition where the
> timeout can fire even before the thread can be inserted into
> the sleepq. The thread would never be woken up.
> 
> To fix the race, I made the timeout handler acquire the mutex
> and release it. But this creates a deadlock with untimeout()
> when the untimeout() races with the timeout handler and waits
> for the handler to finish.
> 
> I have fixed the order in the condition variable code. I will
> let you know the results once I have tested the fix.
> 
> I am letting you know so you don't spend time discovering this
> problem during code review.
> 
> Madhavan
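
For reference, the ordering described above can be sketched in userland C
with pthreads. This is only a rough model under stated assumptions, not the
kernel code: the names struct waiter, timeout_handler(), model_untimeout()
and timed_wait_correct_order() are hypothetical, a helper thread stands in
for the callout, and the model assumes the timeout always fires.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userland model; none of these names come from the kernel. */
struct waiter {
    pthread_mutex_t mutex;     /* the mutex paired with the condition variable */
    pthread_t handler_thread;  /* stands in for the callout running the handler */
    bool handler_ran;          /* set by the handler while it holds the mutex */
};

/* Timeout handler: acquires and releases the mutex, as in the race fix. */
static void *timeout_handler(void *arg)
{
    struct waiter *w = arg;

    pthread_mutex_lock(&w->mutex);
    w->handler_ran = true;
    pthread_mutex_unlock(&w->mutex);
    return NULL;
}

/* Model of untimeout(): waits for a running handler to finish. */
static void model_untimeout(struct waiter *w)
{
    pthread_join(w->handler_thread, NULL);
}

/* Timed wait in the safe order. Returns 0 if the handler completed. */
int timed_wait_correct_order(void)
{
    struct waiter w = { .handler_ran = false };
    bool ran;

    pthread_mutex_init(&w.mutex, NULL);
    pthread_mutex_lock(&w.mutex);          /* mutex is held at this point */

    /* "Create a timeout": in this model the handler starts immediately. */
    if (pthread_create(&w.handler_thread, NULL, timeout_handler, &w) != 0) {
        pthread_mutex_unlock(&w.mutex);
        pthread_mutex_destroy(&w.mutex);
        return -1;
    }

    /* Sleep queue insertion and switch() are omitted from the model. */
    pthread_mutex_unlock(&w.mutex);        /* release the mutex */

    model_untimeout(&w);                   /* untimeout first ... */
    pthread_mutex_lock(&w.mutex);          /* ... then reacquire the mutex */
    ran = w.handler_ran;
    pthread_mutex_unlock(&w.mutex);

    /*
     * Reversed order (lock, then model_untimeout) would hang here: the
     * handler blocks on the mutex while this thread waits for the handler.
     */
    pthread_mutex_destroy(&w.mutex);
    return ran ? 0 : -1;
}
```

Calling timed_wait_correct_order() should return 0 in this model. Swapping
the last lock and the model_untimeout() call reproduces the hang.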

Please let us know when you update the BFUs (and which build you create
the archives from).

Rafael
