On Sat, 22 Aug 2020 14:32:52 +0200
[email protected] wrote:

> On Fri, Aug 21, 2020 at 05:03:34PM -0400, Steven Rostedt wrote:
>
> > > Sigh.  Is it too hard to make mutex_trylock() usable from interrupt
> > > context?
> >
> > That's a question for Thomas and Peter Z.
>
> You should really know that too, the TL;DR answer is it's fundamentally
> buggered, can't work.

I knew there was an issue but I couldn't remember the reasoning, and
figured you could easily answer it without having to look back at the
code.

>
> The problem is that RT relies on being able to PI boost the mutex owner.
>
> ISTR we had a thread about all this last year or so, let me see if I can
> find that.
>
> Here goes:
>
>   https://lkml.kernel.org/r/[email protected]

From this email:

> The problem happens when that owner is the idle task, this can happen
> when the irq/softirq hits the idle task, in that case the contending
> mutex_lock() will try and PI boost the idle task, and that is a big
> no-no.

What's wrong with priority boosting the idle task? It's not obvious,
and I can't find comments in the code saying it would be bad.

I looked around the code to see if I could find "why this is bad" but
couldn't find it. There's lots of places that say "Do not use
mutex_trylock in interrupt context, the implementation is not safe to
do so" but I can't find where it says "why" it is not safe to do so.

The idle task is not mentioned at all in rtmutex.c and not mentioned in
kernel/locking except for some comments about RCU in lockdep.

I see that in the idle code the prio_change method does a BUG(), but
there's no comment to say why it does so.

The commit that added that BUG, doesn't explain why it can't happen:

  a8941d7ec8167 ("sched: Simplify the idle scheduling class")

I may have once known the rationale behind all this, but it's been a
long time since I worked on the PI code, and it's out of my cache.

-- Steve

