Re: [PATCH] backport: sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2018-11-06 Thread Henrik Austad
On Tue, Nov 06, 2018 at 02:22:10PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 06, 2018 at 01:47:21PM +0100, Henrik Austad wrote:
> > From: Xunlei Pang 
> > 
> > On some of our systems, we notice this error popping up on occasion,
> > completely hanging the system.
> > 
> >[] enqueue_task_dl+0x1f0/0x420
> >[] activate_task+0x7c/0x90
> >[] push_dl_task+0x164/0x1c8
> >[] push_dl_tasks+0x20/0x30
> >[] __balance_callback+0x44/0x68
> >[] __schedule+0x6f0/0x728
> >[] schedule+0x78/0x98
> >[] __rt_mutex_slowlock+0x9c/0x108
> >[] rt_mutex_slowlock+0xd8/0x198
> >[] rt_mutex_timed_futex_lock+0x30/0x40
> >[] futex_lock_pi+0x200/0x3b0
> >[] do_futex+0x1c4/0x550
> > 
> > It runs a 4.4 kernel on an arm64 rig. The signature looks suspiciously
> > similar to what Xunlei Pang observed in his crash, and with this fix, my
> > issue goes away (my system has survived approx. 1500 reboots and a few
> > nasty tests so far).
> > 
> > Alongside this patch in the tree, there are a few other bits and pieces
> > pertaining to futex, rtmutex and kernel/sched/, but those patches create
> > weird crashes that I have not been able to dissect yet. Once (if) I have
> > been able to figure those out (and test), they will be sent later.
> > 
> > I am sure other users of LTS that also use sched_deadline will run into
> > this issue, so I think it is a good candidate for 4.4-stable. Possibly also
> > for 4.9 and 4.14, but I have not had time to test those versions.
> 
> But this patch relies on:
> 
>   2a1c60299406 ("rtmutex: Deboost before waking up the top waiter")

Yes, I have that one in my other queue (that crashes)

> for pointer stability, but that patch in turn relies on the whole
> FUTEX_UNLOCK_PI patch set:
> 
>  $ git log --oneline 499f5aca2cdd5e958b27e2655e7e7f82524f46b1..56222b212e8edb1cf51f5dd73ff645809b082b40
> 
>   56222b212e8e futex: Drop hb->lock before enqueueing on the rtmutex
>   bebe5b514345 futex: Futex_unlock_pi() determinism
>   cfafcd117da0 futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
>   38d589f2fd08 futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
>   50809358dd71 futex,rt_mutex: Introduce rt_mutex_init_waiter()
>   16ffa12d7425 futex: Pull rt_mutex_futex_unlock() out from under hb->lock
>   73d786bd043e futex: Rework inconsistent rt_mutex/futex_q state
>   bf92cf3a5100 futex: Cleanup refcounting
>   734009e96d19 futex: Change locking rules
>   5293c2efda37 futex,rt_mutex: Provide futex specific rt_mutex API
>   fffa954fb528 futex: Remove rt_mutex_deadlock_account_*()
>   1b367ece0d7e futex: Use smp_store_release() in mark_wake_futex()
> 
> and possibly some follow-up fixes on that (I have vague memories of
> that).

ok, so this looks a bit like the queue I have, thanks!

> As is, just the one patch you propose isn't correct :/
> 
> Yes, that was a ginormous amount of work to fix a seemingly simple splat
> :-(

Yep, well, on the positive side, I now know that I have to figure out the 
crashes, which is useful knowledge! Thanks!

I'll hammer away at the full series of backports for this then and resend 
once I've hammered out the issues.

Thanks for the feedback, much appreciated!

-- 
Henrik Austad
CVTG Eng - Endpoints
Cisco Systems Norway


Re: [PATCH] backport: sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2018-11-06 Thread Peter Zijlstra
On Tue, Nov 06, 2018 at 01:47:21PM +0100, Henrik Austad wrote:
> From: Xunlei Pang 
> 
> On some of our systems, we notice this error popping up on occasion,
> completely hanging the system.
> 
>[] enqueue_task_dl+0x1f0/0x420
>[] activate_task+0x7c/0x90
>[] push_dl_task+0x164/0x1c8
>[] push_dl_tasks+0x20/0x30
>[] __balance_callback+0x44/0x68
>[] __schedule+0x6f0/0x728
>[] schedule+0x78/0x98
>[] __rt_mutex_slowlock+0x9c/0x108
>[] rt_mutex_slowlock+0xd8/0x198
>[] rt_mutex_timed_futex_lock+0x30/0x40
>[] futex_lock_pi+0x200/0x3b0
>[] do_futex+0x1c4/0x550
> 
> It runs a 4.4 kernel on an arm64 rig. The signature looks suspiciously
> similar to what Xunlei Pang observed in his crash, and with this fix, my
> issue goes away (my system has survived approx. 1500 reboots and a few
> nasty tests so far).
> 
> Alongside this patch in the tree, there are a few other bits and pieces
> pertaining to futex, rtmutex and kernel/sched/, but those patches create
> weird crashes that I have not been able to dissect yet. Once (if) I have
> been able to figure those out (and test), they will be sent later.
> 
> I am sure other users of LTS that also use sched_deadline will run into
> this issue, so I think it is a good candidate for 4.4-stable. Possibly also
> for 4.9 and 4.14, but I have not had time to test those versions.

But this patch relies on:

  2a1c60299406 ("rtmutex: Deboost before waking up the top waiter")

for pointer stability, but that patch in turn relies on the whole
FUTEX_UNLOCK_PI patch set:

 $ git log --oneline 499f5aca2cdd5e958b27e2655e7e7f82524f46b1..56222b212e8edb1cf51f5dd73ff645809b082b40

  56222b212e8e futex: Drop hb->lock before enqueueing on the rtmutex
  bebe5b514345 futex: Futex_unlock_pi() determinism
  cfafcd117da0 futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()
  38d589f2fd08 futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
  50809358dd71 futex,rt_mutex: Introduce rt_mutex_init_waiter()
  16ffa12d7425 futex: Pull rt_mutex_futex_unlock() out from under hb->lock
  73d786bd043e futex: Rework inconsistent rt_mutex/futex_q state
  bf92cf3a5100 futex: Cleanup refcounting
  734009e96d19 futex: Change locking rules
  5293c2efda37 futex,rt_mutex: Provide futex specific rt_mutex API
  fffa954fb528 futex: Remove rt_mutex_deadlock_account_*()
  1b367ece0d7e futex: Use smp_store_release() in mark_wake_futex()

and possibly some follow-up fixes on that (I have vague memories of
that).

As is, just the one patch you propose isn't correct :/

Yes, that was a ginormous amount of work to fix a seemingly simple splat
:-(


[PATCH] backport: sched/rtmutex/deadline: Fix a PI crash for deadline tasks

2018-11-06 Thread Henrik Austad
From: Xunlei Pang 

On some of our systems, we notice this error popping up on occasion,
completely hanging the system.

   [] enqueue_task_dl+0x1f0/0x420
   [] activate_task+0x7c/0x90
   [] push_dl_task+0x164/0x1c8
   [] push_dl_tasks+0x20/0x30
   [] __balance_callback+0x44/0x68
   [] __schedule+0x6f0/0x728
   [] schedule+0x78/0x98
   [] __rt_mutex_slowlock+0x9c/0x108
   [] rt_mutex_slowlock+0xd8/0x198
   [] rt_mutex_timed_futex_lock+0x30/0x40
   [] futex_lock_pi+0x200/0x3b0
   [] do_futex+0x1c4/0x550
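
For readers who want to exercise the same kernel path: the trace enters through futex_lock_pi(), which user space normally reaches by contending on a priority-inheritance mutex. The following is only a minimal sketch of creating such a mutex, assuming glibc's pthread implementation; demo_pi_mutex_init() is a made-up helper name and is not part of the patch or of any test used here.

```c
#define _GNU_SOURCE
#include <pthread.h>

/*
 * Hypothetical helper: initialise a mutex that uses the
 * priority-inheritance protocol. Returns 0 on success or an
 * errno-style value, like the pthread calls it wraps.
 */
static int demo_pi_mutex_init(pthread_mutex_t *m)
{
	pthread_mutexattr_t attr;
	int ret;

	ret = pthread_mutexattr_init(&attr);
	if (ret)
		return ret;

	/* PTHREAD_PRIO_INHERIT selects the PI protocol for this mutex. */
	ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	if (!ret)
		ret = pthread_mutex_init(m, &attr);

	pthread_mutexattr_destroy(&attr);
	return ret;
}
```

An uncontended lock/unlock of such a mutex stays in user space; only when a second thread blocks on it does the lock call go through the futex slow path shown in the trace.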

It runs a 4.4 kernel on an arm64 rig. The signature looks suspiciously
similar to what Xunlei Pang observed in his crash, and with this fix, my
issue goes away (my system has survived approx. 1500 reboots and a few
nasty tests so far).

Alongside this patch in the tree, there are a few other bits and pieces
pertaining to futex, rtmutex and kernel/sched/, but those patches create
weird crashes that I have not been able to dissect yet. Once (if) I have
been able to figure those out (and test), they will be sent later.

I am sure other users of LTS that also use sched_deadline will run into
this issue, so I think it is a good candidate for 4.4-stable. Possibly also
for 4.9 and 4.14, but I have not had time to test those versions.
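
To make "use sched_deadline" concrete: a task becomes a deadline task through the sched_setattr() system call. The sketch below is an illustration only, not the test setup used for this backport. The demo_* names are invented here, the struct mirrors the 48-byte layout of the kernel's struct sched_attr as I understand it, and the raw syscall is used so that the sketch does not depend on a libc wrapper.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define DEMO_SCHED_DEADLINE 6	/* value of SCHED_DEADLINE in the uapi headers */

/* Hypothetical mirror of the kernel's struct sched_attr (48 bytes). */
struct demo_sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;		/* ns */
	uint64_t sched_deadline;	/* ns */
	uint64_t sched_period;		/* ns */
};

/* Fill in a deadline reservation; all times are in nanoseconds. */
static void demo_fill_deadline_attr(struct demo_sched_attr *a,
				    uint64_t runtime_ns,
				    uint64_t deadline_ns,
				    uint64_t period_ns)
{
	memset(a, 0, sizeof(*a));
	a->size = sizeof(*a);
	a->sched_policy = DEMO_SCHED_DEADLINE;
	a->sched_runtime = runtime_ns;
	a->sched_deadline = deadline_ns;
	a->sched_period = period_ns;
}

/*
 * Apply the attributes to the calling thread (pid 0).
 * Returns 0 on success, -1 with errno set otherwise; this usually
 * needs elevated privileges.
 */
static int demo_set_deadline(const struct demo_sched_attr *a)
{
	return (int)syscall(SYS_sched_setattr, 0, a, 0);
}
```

A thread configured this way that then blocks on a PI futex is the combination the crash report below is about.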

Apart from a minor conflict in sched.h, the patch applied cleanly.

(Tested on arm64 running 4.4.)

-Henrik

A crash happened while I was playing with deadline PI rtmutex.

BUG: unable to handle kernel NULL pointer dereference at 0018
IP: [] rt_mutex_get_top_task+0x1f/0x30
PGD 232a75067 PUD 230947067 PMD 0
Oops:  [#1] SMP
CPU: 1 PID: 10994 Comm: a.out Not tainted

Call Trace:
[] enqueue_task+0x2c/0x80
[] activate_task+0x23/0x30
[] pull_dl_task+0x1d5/0x260
[] pre_schedule_dl+0x16/0x20
[] __schedule+0xd3/0x900
[] schedule+0x29/0x70
[] __rt_mutex_slowlock+0x4b/0xc0
[] rt_mutex_slowlock+0xd1/0x190
[] rt_mutex_timed_lock+0x53/0x60
[] futex_lock_pi.isra.18+0x28c/0x390
[] do_futex+0x190/0x5b0
[] SyS_futex+0x80/0x180

This is because rt_mutex_enqueue_pi() and rt_mutex_dequeue_pi()
are only protected by pi_lock when operating on pi waiters, while
rt_mutex_get_top_task() accesses them with the rq lock held but
without holding pi_lock.

To tackle this, introduce a new "pi_top_task" pointer cached in
task_struct, and add a new rt_mutex_update_top_task() to update
its value. It can be called by rt_mutex_setprio(), which holds
both the owner's pi_lock and the rq lock. Thus "pi_top_task"
can be safely accessed by enqueue_task_dl() under the rq lock.
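
The locking rule described above can be sketched in user space as follows. This is a simplified illustration under stated assumptions, not the kernel code: struct demo_task and the demo_* functions are invented names, pthread mutexes stand in for pi_lock and the rq lock, and a single pointer stands in for the leftmost pi waiter.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant task_struct fields. */
struct demo_task {
	pthread_mutex_t pi_lock;	/* protects the waiter bookkeeping */
	pthread_mutex_t rq_lock;	/* stand-in for the runqueue lock */
	struct demo_task *top_waiter;	/* stand-in for the leftmost pi waiter */
	struct demo_task *pi_top_task;	/* cached copy, written under both locks */
	int prio;
};

static void demo_task_init(struct demo_task *t, int prio)
{
	pthread_mutex_init(&t->pi_lock, NULL);
	pthread_mutex_init(&t->rq_lock, NULL);
	t->top_waiter = NULL;
	t->pi_top_task = NULL;
	t->prio = prio;
}

/* Waiter bookkeeping changes under pi_lock only, like enqueue/dequeue_pi. */
static void demo_set_top_waiter(struct demo_task *owner, struct demo_task *w)
{
	pthread_mutex_lock(&owner->pi_lock);
	owner->top_waiter = w;
	pthread_mutex_unlock(&owner->pi_lock);
}

/* Refresh the cache with both locks held (the role of the setprio path). */
static void demo_update_top_task(struct demo_task *owner)
{
	pthread_mutex_lock(&owner->pi_lock);
	pthread_mutex_lock(&owner->rq_lock);
	owner->pi_top_task = owner->top_waiter;
	pthread_mutex_unlock(&owner->rq_lock);
	pthread_mutex_unlock(&owner->pi_lock);
}

/*
 * Reader side: rq_lock alone is enough, because every write to
 * pi_top_task also holds rq_lock. It never touches top_waiter.
 */
static int demo_effective_prio(struct demo_task *t)
{
	int prio;

	pthread_mutex_lock(&t->rq_lock);
	prio = t->pi_top_task ? t->pi_top_task->prio : t->prio;
	pthread_mutex_unlock(&t->rq_lock);
	return prio;
}
```

The point of the sketch is that the reader only ever sees the cached pointer, so a change to the waiter bookkeeping becomes visible to it only after the update step has run with both locks held.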

Originally-From: Peter Zijlstra 
Signed-off-by: Xunlei Pang 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Steven Rostedt 
Reviewed-by: Thomas Gleixner 
Cc: juri.le...@arm.com
Cc: bige...@linutronix.de
Cc: mathieu.desnoy...@efficios.com
Cc: jdesfos...@efficios.com
Cc: bris...@redhat.com
Link: http://lkml.kernel.org/r/20170323150216.157682...@infradead.org
Signed-off-by: Thomas Gleixner 

(cherry picked from commit e96a7705e7d3fef96aec9b590c63b2f6f7d2ba22)

Conflicts:
include/linux/sched.h

Backported-and-tested-by: Henrik Austad 
Cc: Greg Kroah-Hartman 
---
 include/linux/init_task.h |  1 +
 include/linux/sched.h     |  2 ++
 include/linux/sched/rt.h  |  1 +
 kernel/fork.c             |  1 +
 kernel/locking/rtmutex.c  | 29 +
 kernel/sched/core.c       |  2 ++
 6 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 1c1ff7e4faa4..a561ce0c5d7f 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -162,6 +162,7 @@ extern struct task_group root_task_group;
 #ifdef CONFIG_RT_MUTEXES
 # define INIT_RT_MUTEXES(tsk)  \
 	.pi_waiters = RB_ROOT,						\
+	.pi_top_task = NULL,						\
 	.pi_waiters_leftmost = NULL,
 #else
 # define INIT_RT_MUTEXES(tsk)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index a464ba71a993..19a3f946caf0 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1628,6 +1628,8 @@ struct task_struct {
 	/* PI waiters blocked on a rt_mutex held by this task */
 	struct rb_root pi_waiters;
 	struct rb_node *pi_waiters_leftmost;
+	/* Updated under owner's pi_lock and rq lock */
+	struct task_struct	*pi_top_task;
 	/* Deadlock detection and priority inheritance handling */
 	struct rt_mutex_waiter *pi_blocked_on;
 #endif
diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
index a30b172df6e1..60d0c4740b9f 100644
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -19,6 +19,7 @@ static inline int rt_task(struct task_struct *p)
 extern int rt_mutex_getprio(struct task_struct *p);
 extern void rt_mutex_setprio(struct task_struct *p, int prio);
 extern int rt_mutex_get_effective_prio(struct task_struct *task, int 
