Hi, I've got only nitpicks for the changelog. Otherwise the patch looks good to me (and yes, without it bw inheritance would be a problem).
On 07/06/16 21:56, Peter Zijlstra wrote: > From: Xunlei Pang <xlp...@redhat.com> > > We should deboost before waking the high-prio task, such that > we don't run two tasks with the same "state"(priority, deadline, ^ space > sched_class, etc) during the period between the end of wake_up_q() > and the end of rt_mutex_adjust_prio(). > > As "Peter Zijlstra" said: > Its semantically icky to have the two tasks running off the same s/Its/It's/ > state and practically icky when you consider bandwidth inheritance -- > where the boosted task wants to explicitly modify the state of the > booster. In that latter case you really want to unboost before you > let the booster run again. > > But this however can lead to prio-inversion if current would get > preempted after the deboost but before waking our high-prio task, > hence we disable preemption before doing deboost, and enabling it s/enabling/re-enable/ > after the wake up is over. > > The patch fixed the logic, and introduced rt_mutex_postunlock() s/The/This/ s/fixed/fixes/ s/introduced/introduces/ > to do some code refactor. > > Most importantly however; this change ensures pointer stability for > the next patch, where we have rt_mutex_setprio() cache a pointer to > the top-most waiter task. If we, as before this change, do the wakeup > first and then deboost, this pointer might point into thin air. > > Cc: Steven Rostedt <rost...@goodmis.org> > Cc: Ingo Molnar <mi...@redhat.com> > Cc: Juri Lelli <juri.le...@arm.com> > Suggested-by: Peter Zijlstra <pet...@infradead.org> > [peterz: Changelog] > Signed-off-by: Xunlei Pang <xlp...@redhat.com> > Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org> > Link: > http://lkml.kernel.org/r/1461659449-19497-1-git-send-email-xlp...@redhat.com Do we have any specific tests for this set? I'm running mine. Best, - Juri > --- > > kernel/futex.c | 5 ++--- > kernel/locking/rtmutex.c | 28 ++++++++++++++++++++++++---- > kernel/locking/rtmutex_common.h | 1 + > 3 files changed, 27 insertions(+), 7 deletions(-) > > --- a/kernel/futex.c > +++ b/kernel/futex.c > @@ -1336,9 +1336,8 @@ static int wake_futex_pi(u32 __user *uad > * scheduled away before the wake up can take place. > */ > spin_unlock(&hb->lock); > - wake_up_q(&wake_q); > - if (deboost) > - rt_mutex_adjust_prio(current); > + > + rt_mutex_postunlock(&wake_q, deboost); > > return 0; > } > --- a/kernel/locking/rtmutex.c > +++ b/kernel/locking/rtmutex.c > @@ -1390,12 +1390,32 @@ rt_mutex_fastunlock(struct rt_mutex *loc > } else { > bool deboost = slowfn(lock, &wake_q); > > - wake_up_q(&wake_q); > + rt_mutex_postunlock(&wake_q, deboost); > + } > +} > + > > - /* Undo pi boosting if necessary: */ > - if (deboost) > - rt_mutex_adjust_prio(current); > +/* > + * Undo pi boosting (if necessary) and wake top waiter. > + */ > +void rt_mutex_postunlock(struct wake_q_head *wake_q, bool deboost) > +{ > + /* > + * We should deboost before waking the top waiter task such that > + * we don't run two tasks with the 'same' priority. This however > + * can lead to prio-inversion if we would get preempted after > + * the deboost but before waking our high-prio task, hence the > + * preempt_disable. > + */ > + if (deboost) { > + preempt_disable(); > + rt_mutex_adjust_prio(current); > } > + > + wake_up_q(wake_q); > + > + if (deboost) > + preempt_enable(); > } > > /** > --- a/kernel/locking/rtmutex_common.h > +++ b/kernel/locking/rtmutex_common.h > @@ -111,6 +111,7 @@ extern int rt_mutex_finish_proxy_lock(st > extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct > hrtimer_sleeper *to); > extern bool rt_mutex_futex_unlock(struct rt_mutex *lock, > struct wake_q_head *wqh); > +extern void rt_mutex_postunlock(struct wake_q_head *wake_q, bool deboost); > extern void rt_mutex_adjust_prio(struct task_struct *task); > > #ifdef CONFIG_DEBUG_RT_MUTEXES > >