On Mon, Nov 23, 2020 at 09:41:23AM +1100, Balbir Singh wrote:
> On Tue, Nov 17, 2020 at 06:19:40PM -0500, Joel Fernandes (Google) wrote:
> > From: Peter Zijlstra <[email protected]>
> > 
> > The rationale is as follows. In the core-wide pick logic, even if
> > need_sync == false, we need to go look at other CPUs (non-local CPUs) to
> > see if they could be running RT.
> > 
> > Say the RQs in a particular core look like this:
> > Let CFS1 and CFS2 be two tagged CFS tasks. Let RT1 be an untagged RT task.
> > 
> > rq0            rq1
> > CFS1 (tagged)  RT1 (untagged)
> > CFS2 (tagged)
> > 
> > Say schedule() runs on rq0. Now, it will enter the above loop and
> > pick_task(RT) will return NULL for 'p'. It will enter the above if() block
> > and see that need_sync == false and will skip RT entirely.
> > 
> > The end result of the selection will be (say prio(CFS1) > prio(CFS2)):
> > rq0             rq1
> > CFS1            IDLE
> > 
> > When it should have selected:
> > rq0             rq1
> > IDLE            RT
> > 
> > Joel saw this issue on real-world usecases in ChromeOS where an RT task
> > gets constantly force-idled and breaks RT. Let's cure it.
> > 
> > NOTE: This problem will be fixed differently in a later patch. It is
> >       just kept here for reference about this issue, and to make
> >       applying later patches easier.
> >
> 
> The changelog is hard to read; it refers to the above if(), whereas there
> is no code snippet in the changelog.

Yeah sorry, it comes from this email where I described the issue:
http://lore.kernel.org/r/[email protected]

I corrected the changelog and appended the patch below. Also pushed it to:
https://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux.git/log/?h=coresched

> Also, from what I can see following
> the series, p->core_cookie is not yet set anywhere (unless I missed it),
> so fixing it here did not make sense while just reading the series.

The interface patches for core_cookie are added later; the infrastructure
comes first in this series. It would not make sense to add the interface
first either, so I think the current ordering is fine.
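
Just to make the failure mode concrete outside of the scheduler code, here
is a rough userspace-only sketch of the core-wide pick for the scenario in
the changelog. All names and data structures below are purely illustrative
and are not the kernel implementation; the real logic lives in
kernel/sched/core.c:pick_next_task().

/*
 * Toy model: rq0 queues two cookied CFS tasks, rq1 queues an uncookied
 * RT task.  With the old shortcut, rq0 trusts its local pick and never
 * looks at rq1, so RT1 is force-idled.  With the fix, a cookie on the
 * local pick forces need_sync and restarts the search, giving (idle, RT1).
 */
#include <stdbool.h>
#include <stdio.h>

struct task {
	const char *name;
	int class_prio;		/* higher is more important: RT > CFS */
	unsigned long cookie;	/* 0 == untagged */
};

static struct task cfs1 = { "CFS1", 1, 0xdead };
static struct task cfs2 = { "CFS2", 1, 0xdead };
static struct task rt1  = { "RT1",  2, 0 };

/* Highest- and second-highest-priority queued task per sibling (NULL == idle). */
static struct task *best[2]   = { &cfs1, &rt1 };
static struct task *second[2] = { &cfs2, NULL };

static bool cookies_match(struct task *a, struct task *b)
{
	return !a || !b || a->cookie == b->cookie;
}

/* Core-wide selection for a two-sibling core, driven from @cpu. */
static void core_pick(int cpu, bool fixed)
{
	struct task *max, *pick[2];
	struct task *local = best[cpu];
	bool need_sync = false;

again:
	/*
	 * Old shortcut: with nothing to sync, trust the local pick and
	 * never even look at the sibling runqueue.
	 */
	if (!need_sync && local) {
		if (fixed && local->cookie) {
			/* The fix: a cookied local pick invalidates the shortcut. */
			need_sync = true;
			goto again;
		}
		printf("%s: rq%d=%s rq%d=idle (sibling never examined)\n",
		       fixed ? "fixed" : "old  ", cpu, local->name, 1 - cpu);
		return;
	}

	/* Synced path: take the highest-priority task core-wide ... */
	max = best[0];
	if (best[1] && (!max || best[1]->class_prio > max->class_prio))
		max = best[1];

	/* ... and force-idle any sibling whose pick has a mismatched cookie. */
	for (int i = 0; i < 2; i++) {
		pick[i] = best[i];
		if (!cookies_match(max, pick[i]))
			pick[i] = cookies_match(max, second[i]) ? second[i] : NULL;
	}

	printf("%s: rq0=%s rq1=%s\n", fixed ? "fixed" : "old  ",
	       pick[0] ? pick[0]->name : "idle",
	       pick[1] ? pick[1]->name : "idle");
}

int main(void)
{
	core_pick(0, false);	/* old  : rq0=CFS1 rq1=idle  -> RT1 starves */
	core_pick(0, true);	/* fixed: rq0=idle rq1=RT1 */
	return 0;
}

Running it prints the "old" pick (CFS1, idle) versus the "fixed" pick
(idle, RT1), i.e. exactly the two tables in the changelog below.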

---8<-----------------------

From: Peter Zijlstra <[email protected]>
Subject: [PATCH] sched: Fix priority inversion of cookied task with sibling

The rationale is as follows. In the core-wide pick logic, even if
need_sync == false, we need to go look at other CPUs (non-local CPUs) to
see if they could be running RT.

Say the RQs in a particular core look like this:
Let CFS1 and CFS2 be two tagged CFS tasks. Let RT1 be an untagged RT task.

rq0            rq1
CFS1 (tagged)  RT1 (untagged)
CFS2 (tagged)

The end result of the selection will be (say prio(CFS1) > prio(CFS2)):
rq0             rq1
CFS1            IDLE

When it should have selected:
rq0             rq1
IDLE            RT

Fix this issue by forcing need_sync and restarting the search if a
cookied task is discovered. This prevents the optimization from making
incorrect picks.

Joel saw this issue on real-world usecases in ChromeOS where an RT task
gets constantly force-idled and breaks RT. Let's cure it.

NOTE: This problem will be fixed differently in a later patch. It is
      just kept here for reference about this issue, and to make
      applying later patches easier.

Reported-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
 kernel/sched/core.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4ee4902c2cf5..53af817740c0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5195,6 +5195,7 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
        need_sync = !!rq->core->core_cookie;
 
        /* reset state */
+reset:
        rq->core->core_cookie = 0UL;
        if (rq->core->core_forceidle) {
                need_sync = true;
@@ -5242,14 +5243,8 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
                                /*
                                 * If there weren't no cookies; we don't need to
                                 * bother with the other siblings.
-                                * If the rest of the core is not running a tagged
-                                * task, i.e.  need_sync == 0, and the current CPU
-                                * which called into the schedule() loop does not
-                                * have any tasks for this class, skip selecting for
-                                * other siblings since there's no point. We don't skip
-                                * for RT/DL because that could make CFS force-idle RT.
                                 */
-                               if (i == cpu && !need_sync && class == &fair_sched_class)
+                               if (i == cpu && !need_sync)
                                        goto next_class;
 
                                continue;
@@ -5259,7 +5254,20 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
                         * Optimize the 'normal' case where there aren't any
                         * cookies and we don't need to sync up.
                         */
-                       if (i == cpu && !need_sync && !p->core_cookie) {
+                       if (i == cpu && !need_sync) {
+                               if (p->core_cookie) {
+                                       /*
+                                        * This optimization is only valid as
+                                        * long as there are no cookies
+                                        * involved. We may have skipped
+                                        * non-empty higher priority classes on
+                                        * siblings, which are empty on this
+                                        * CPU, so start over.
+                                        */
+                                       need_sync = true;
+                                       goto reset;
+                               }
+
                                next = p;
                                goto done;
                        }
@@ -5299,7 +5307,6 @@ pick_next_task(struct rq *rq, struct task_struct *prev, struct rq_flags *rf)
                                         */
                                        need_sync = true;
                                }
-
                        }
                }
 next_class:;
-- 
2.29.2.454.gaff20da3a2-goog
