On Fri, Aug 01, 2025 at 09:12:08AM +0530, K Prateek Nayak wrote: > Just thinking out loud, putting this tracepoint here can lead to a > "dequeued -> dequeued" transition for fair task when they are in delayed > dequeue state. > > dequeue_task(p) > trace_dequeue_task_tp(p) # First time > dequeue_task_fair(p) > p->se.delayed = 1 > ... > <sched_switch> # p is still delayed > ... > sched_setscheduler(p) > if (prev_class != next_class && p->se.sched_delayed) > dequeue_task(p, DEQUEUE_DELAYED); > trace_dequeue_task_tp(p) # Second time > > It is not an issue as such but it might come as a surprise if users are > expecting a behavior like below which would be the case for !fair task > currently (and for all tasks before v6.12): > > digraph state_automaton { > center = true; > size = "7,11"; > {node [shape = plaintext, style=invis, label=""] > "__init_enqueue_dequeue_cycle"}; > {node [shape = ellipse] "enqueued"}; > {node [shape = ellipse] "dequeued"}; > "__init_enqueue_dequeue_cycle" -> "enqueued"; > "__init_enqueue_dequeue_cycle" -> "dequeued"; > "enqueued" [label = "enqueued", color = green3]; > "enqueued" -> "dequeued" [ label = "dequeue_task" ]; > "dequeued" [label = "dequeued", color = red]; > "dequeued" -> "enqueued" [ label = "enqueue_task" ]; > { rank = min ; > "__init_enqueue_dequeue_cycle"; > "dequeued"; > "enqueued"; > } > } > > > Another: > > "dequeued" -> "dequeued" [ label = "dequeue_task" ]; > > edge would be needed in that case for >= v6.12. It is probably nothing > and can be easily handled by the users if they run into it but just > putting it out there for the record in case you only want to consider a > complete dequeue as "dequeued". Feel free to ignore since I'm completely > out of my depth when it comes to the usage of RV in the field :)
Ah, thanks for pointing this out. I do want to only consider complete dequeue as "dequeued". These tracepoints are not visible from userspace, and RV does not care about enqueue/dequeue of fair tasks at the moment, so it is not a problem for now. But as a precaution, I trust the below patch will do. Nam diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h index c38f12f7f903..b50668052f99 100644 --- a/include/trace/events/sched.h +++ b/include/trace/events/sched.h @@ -906,6 +906,14 @@ DECLARE_TRACE(dequeue_task_rt, TP_PROTO(int cpu, struct task_struct *task), TP_ARGS(cpu, task)); +DECLARE_TRACE(enqueue_task, + TP_PROTO(int cpu, struct task_struct *task), + TP_ARGS(cpu, task)); + +DECLARE_TRACE(dequeue_task, + TP_PROTO(int cpu, struct task_struct *task), + TP_ARGS(cpu, task)); + #endif /* _TRACE_SCHED_H */ /* This part must be outside protection */ diff --git a/kernel/sched/core.c b/kernel/sched/core.c index b485e0639616..553c08a63395 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2077,6 +2077,8 @@ unsigned long get_wchan(struct task_struct *p) void enqueue_task(struct rq *rq, struct task_struct *p, int flags) { + trace_enqueue_task_tp(rq->cpu, p); + if (!(flags & ENQUEUE_NOCLOCK)) update_rq_clock(rq); @@ -2119,7 +2121,11 @@ inline bool dequeue_task(struct rq *rq, struct task_struct *p, int flags) * and mark the task ->sched_delayed. */ uclamp_rq_dec(rq, p); - return p->sched_class->dequeue_task(rq, p, flags); + if (p->sched_class->dequeue_task(rq, p, flags)) { + trace_dequeue_task_tp(rq->cpu, p); + return true; + } + return false; } void activate_task(struct rq *rq, struct task_struct *p, int flags)