In ctx_resched(), EVENT_FLEXIBLE should be sched_out when EVENT_PINNED is
added. However, ctx_resched() calculates ctx_event_type before checking
this condition. As a result, pinned events will NOT get higher priority
than flexible events.

The following shows this issue on an Intel CPU (where ref-cycles can
only use one hardware counter).

  1. First start:
       perf stat -C 0 -e ref-cycles  -I 1000
  2. Then, in the second console, run:
       perf stat -C 0 -e ref-cycles:D -I 1000

The second perf uses pinned events, which is expected to have higher
priority. However, because it failed in ctx_resched(). It is never
run.

This patch fixes this by calculating ctx_event_type after re-evaluating
event_type.

Fixes: 487f05e18aa4 ("perf/core: Optimize event rescheduling on active 
contexts")
Signed-off-by: Song Liu <songliubrav...@fb.com>
Reported-by: Ephraim Park <ephiep...@fb.com>
---
 kernel/events/core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5789810..cf52fc0 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2246,7 +2246,7 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
                        struct perf_event_context *task_ctx,
                        enum event_type_t event_type)
 {
-       enum event_type_t ctx_event_type = event_type & EVENT_ALL;
+       enum event_type_t ctx_event_type;
        bool cpu_event = !!(event_type & EVENT_CPU);
 
        /*
@@ -2256,6 +2256,8 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
        if (event_type & EVENT_PINNED)
                event_type |= EVENT_FLEXIBLE;
 
+       ctx_event_type = event_type & EVENT_ALL;
+
        perf_pmu_disable(cpuctx->ctx.pmu);
        if (task_ctx)
                task_ctx_sched_out(cpuctx, task_ctx, event_type);
-- 
2.9.5

Reply via email to