On Sun, Sep 28, 2025 at 6:43 PM Xuewen Yan <[email protected]> wrote: > > Add trace point to psi triggers. This is useful to > observe the psi events in the kernel space. > > One use of this is to monitor memory pressure. > When the pressure is too high, we can kill the process > in the kernel space to prevent OOM.
Just FYI, Roman is working on a BPF-based oom-killer solution [1] which might be also interesting for you and this tracepoint might be useful for Roman as well. CC'ing him here. [1] https://lore.kernel.org/all/[email protected]/ > > Signed-off-by: Xuewen Yan <[email protected]> Acked-by: Suren Baghdasaryan <[email protected]> > --- > V4: > -generate the event only after cmpxchg() passes the check > --- > V3: > -export it in the tracefs; > --- > v2: > -fix compilation error; > -export the tp; > -add more commit message; > --- > include/trace/events/sched.h | 27 +++++++++++++++++++++++++++ > kernel/sched/psi.c | 5 +++++ > 2 files changed, 32 insertions(+) > > diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h > index 7b2645b50e78..db8b8f25466e 100644 > --- a/include/trace/events/sched.h > +++ b/include/trace/events/sched.h > @@ -826,6 +826,33 @@ TRACE_EVENT(sched_wake_idle_without_ipi, > TP_printk("cpu=%d", __entry->cpu) > ); > > +#ifdef CONFIG_PSI > +TRACE_EVENT(psi_event, > + > + TP_PROTO(int aggregator, int state, u64 threshold, u64 win_size), > + > + TP_ARGS(aggregator, state, threshold, win_size), > + > + TP_STRUCT__entry( > + __field(int, aggregator) > + __field(int, state) > + __field(u64, threshold) > + __field(u64, win_size) > + ), > + > + TP_fast_assign( > + __entry->aggregator = aggregator; > + __entry->state = state; > + __entry->threshold = threshold; > + __entry->win_size = win_size; > + ), > + > + TP_printk("aggregator=%d state=%d threshold=%llu window_size=%llu", > + __entry->aggregator, __entry->state, __entry->threshold, > + __entry->win_size) > +); > +#endif /* CONFIG_PSI */ > + > /* > * Following tracepoints are not exported in tracefs and provide hooking > * mechanisms only for testing and debugging purposes. > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > index 59fdb7ebbf22..e8a7fd04ba9f 100644 > --- a/kernel/sched/psi.c > +++ b/kernel/sched/psi.c > @@ -141,6 +141,8 @@ > #include <linux/psi.h> > #include "sched.h" > > +EXPORT_TRACEPOINT_SYMBOL_GPL(psi_event); > + > static int psi_bug __read_mostly; > > DEFINE_STATIC_KEY_FALSE(psi_disabled); > @@ -515,6 +517,9 @@ static void update_triggers(struct psi_group *group, u64 > now, > kernfs_notify(t->of->kn); > else > wake_up_interruptible(&t->event_wait); > + > + trace_psi_event(aggregator, t->state, t->threshold, > + t->win.size); > } > t->last_event_time = now; > /* Reset threshold breach flag once event got generated */ > -- > 2.25.1 > >
