On Tue, Jun 14, 2016 at 11:21:09AM +0100, Juri Lelli wrote:
> > [XXX this next section is unparsable]
>
> Yes, a bit hard to understand. However, am I correct in assuming this
> patch and the previous one should fix this problem? Or are there still
> other races causing issues?

I think so; so there were two related problems,

 1) top_waiter was used outside its serialization

 2) a race against the top waiter task and sched_setscheduler()
    changing its state

Now, I could not understand a word of that marked paragraph, but from my
understanding of the code both are solved.

 1) by keeping the top_pi_task cache updated under pi_lock and rq->lock,
    thereby ensuring that holding either is sufficient to stabilize it.

 2) sched_setscheduler() can change the parameters of the top_pi_task,
    but since it too holds both pi_lock and rq->lock, it cannot happen
    at the same time that we're looking at the cached top pi waiter
    pointer thingy.

It can however happen that top_pi_waiter is not in fact the top waiter in
a narrow window between sched_setscheduler() changing its parameters and
rt_mutex_adjust_pi() re-ordering the PI chain - ending in updating the
cached top task pointer thingy.
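
To make the locking rule concrete, here is a rough userspace sketch of
"writers hold both locks, so readers holding either one see a stable
value". It is not the kernel code: struct task, the two global pthread
mutexes standing in for pi_lock and rq->lock, and all function names are
made up for illustration, and the real locks are per-task and per-rq
rather than global.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct task {
	int prio;                 /* lower value means higher priority */
	struct task *top_pi_task; /* cached top PI waiter, or NULL */
};

/* Assumed global stand-ins for p->pi_lock and rq->lock. */
static pthread_mutex_t pi_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: updates the cache with BOTH locks held, in a fixed order. */
static void set_top_pi_task(struct task *p, struct task *top)
{
	pthread_mutex_lock(&pi_lock);
	pthread_mutex_lock(&rq_lock);
	p->top_pi_task = top;
	pthread_mutex_unlock(&rq_lock);
	pthread_mutex_unlock(&pi_lock);
}

/* sched_setscheduler() analogue: parameter changes also hold both locks. */
static void set_prio(struct task *t, int prio)
{
	pthread_mutex_lock(&pi_lock);
	pthread_mutex_lock(&rq_lock);
	t->prio = prio;
	pthread_mutex_unlock(&rq_lock);
	pthread_mutex_unlock(&pi_lock);
}

/* Must be called with at least one of the two locks held. */
static int effective_prio(const struct task *p)
{
	const struct task *top = p->top_pi_task;

	if (top && top->prio < p->prio)
		return top->prio; /* boosted by the cached top waiter */
	return p->prio;
}

/* Reader holding only pi_lock: no writer can run concurrently. */
static int effective_prio_pi_locked(struct task *p)
{
	int prio;

	pthread_mutex_lock(&pi_lock);
	prio = effective_prio(p);
	pthread_mutex_unlock(&pi_lock);
	return prio;
}

/* Reader holding only rq_lock: equally sufficient. */
static int effective_prio_rq_locked(struct task *p)
{
	int prio;

	pthread_mutex_lock(&rq_lock);
	prio = effective_prio(p);
	pthread_mutex_unlock(&rq_lock);
	return prio;
}
```

Note that the sketch only covers the serialization. After set_prio()
changes the waiter, the cached pointer still names that same task until
something re-evaluates the chain and calls set_top_pi_task() again, which
is the narrow window described above.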

