From: "Steven Rostedt (Red Hat)" <[email protected]>

Because tracepoint callbacks are invoked with preemption disabled, by the
rcu_read_lock_sched_notrace() in __DO_TRACE(), trace events are always
recorded with the preempt count already raised. This makes the preempt
count shown in the recorded trace event inaccurate: it is always one more
than what the preempt_count was when the tracepoint was called.

If CONFIG_PREEMPT is enabled, subtract 1 from the preempt_count before
recording it in the trace buffer.

Link: http://lkml.kernel.org/r/[email protected]

Reported-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
---
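Not part of the commit: the off-by-one described above can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions. Every model_* name is hypothetical and does not exist in the
kernel, and the model assumes (as the code comment in the patch does) that
the count is raised by exactly one, and only when CONFIG_PREEMPT is set.

```c
/* Userspace model only: none of the model_* names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's preempt count. */
static int model_preempt_count;

/*
 * Models the preemption disable/enable that wraps the callbacks
 * (rcu_read_lock_sched_notrace() in __DO_TRACE()). Assumption of this
 * model: the count only moves when CONFIG_PREEMPT is set.
 */
static void model_tracepoint_enter(bool config_preempt)
{
	if (config_preempt)
		model_preempt_count++;
}

static void model_tracepoint_exit(bool config_preempt)
{
	if (config_preempt)
		model_preempt_count--;
}

/*
 * Models the value stored in fbuffer->pc. config_preempt stands in for
 * IS_ENABLED(CONFIG_PREEMPT); apply_fix selects the behaviour with or
 * without the decrement added by this patch.
 */
static int model_record_pc(bool config_preempt, bool apply_fix)
{
	int pc = model_preempt_count;

	if (apply_fix && config_preempt)
		pc--;
	return pc;
}

/*
 * Hits a modelled tracepoint while the caller's count is caller_count
 * and returns the count that would be recorded in the event.
 */
int model_trace_at(int caller_count, bool config_preempt, bool apply_fix)
{
	int pc;

	assert(caller_count >= 0);
	model_preempt_count = caller_count;
	model_tracepoint_enter(config_preempt);
	pc = model_record_pc(config_preempt, apply_fix);
	model_tracepoint_exit(config_preempt);
	return pc;
}
```

In this model, model_trace_at(0, true, false) returns 1 although the
caller ran with a count of 0, while model_trace_at(0, true, true) returns
0, which is the value the caller actually had.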
 kernel/trace/trace_events.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index fd449eb138cf..03c0a48c3ac4 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -261,6 +261,14 @@ void *trace_event_buffer_reserve(struct trace_event_buffer *fbuffer,
 
        local_save_flags(fbuffer->flags);
        fbuffer->pc = preempt_count();
+       /*
+        * If CONFIG_PREEMPT is enabled, then the tracepoint itself disables
+        * preemption (adding one to the preempt_count). Since we are
+        * interested in the preempt_count at the time the tracepoint was
+        * hit, we need to subtract one to offset the increment.
+        */
+       if (IS_ENABLED(CONFIG_PREEMPT))
+               fbuffer->pc--;
        fbuffer->trace_file = trace_file;
 
        fbuffer->event =
-- 
2.8.1

