In commit 27727df240c7 ("Avoid taking lock in NMI path with
CONFIG_DEBUG_TIMEKEEPING"), I changed the logic to open-code
the timekeeping_get_ns() function, but I forgot to include
the unit conversion from cycles to nanoseconds, breaking the
function's output, which impacts users like perf.

This would result in bogus perf timestamps like:
 swapper     0 [000]   253.427536:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426573:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426687:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426800:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.426905:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427022:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427127:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427239:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427346:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   254.427463:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]   255.426572:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])

Instead of more reasonable expected timestamps like:
 swapper     0 [000]    39.953768:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.064839:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.175956:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.287103:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.398217:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.509324:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.620437:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.731546:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.842654:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    40.953772:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])
 swapper     0 [000]    41.064881:  111111111 cpu-clock:  ffffffff810a0de6 
native_safe_halt+0x6 ([kernel.kallsyms])

This patch adds the proper use of timekeeping_delta_to_ns()
to convert the cycle delta to nanoseconds as needed.

Thanks to Brendan and Alexei for finding this quickly after
the v4.8 release. Unfortunately the problematic commit has
landed in some -stable trees so they'll need this fix as
well.

Many apologies for this mistake. I'll be looking to add a
perf-clock sanity test to the kselftest timers tests soon.

Cc: Steven Rostedt <rost...@goodmis.org>
Cc: Peter Zijlstra <pet...@infradead.org>
Cc: Ingo Molnar <mi...@kernel.org>
Cc: Thomas Gleixner <t...@linutronix.de>
Cc: stable <sta...@vger.kernel.org>
Cc: Brendan Gregg <bgr...@netflix.com>
Cc: Alexei Starovoitov <alexei.starovoi...@gmail.com>
Fixes: 27727df240c7 "timekeeping: Avoid taking lock in NMI path with 
CONFIG_DEBUG_TIMEKEEPING"
Reported-by: Brendan Gregg <bgr...@netflix.com>
Reported-by: Alexei Starovoitov <alexei.starovoi...@gmail.com>
Signed-off-by: John Stultz <john.stu...@linaro.org>
---
Thomas/Ingo:
I've reproduced the issue and validated this fixes it, but given my limited perf
usage so far, I'd appreciate waiting for a Tested-by: from one of the reporters
before considering for tip/timers/urgent.

 kernel/time/timekeeping.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index e07fb09..37dec7e 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -403,8 +403,11 @@ static __always_inline u64 __ktime_get_fast_ns(struct 
tk_fast *tkf)
                tkr = tkf->base + (seq & 0x01);
                now = ktime_to_ns(tkr->base);
 
-               now += clocksource_delta(tkr->read(tkr->clock),
-                                        tkr->cycle_last, tkr->mask);
+               now += timekeeping_delta_to_ns(tkr,
+                               clocksource_delta(
+                                       tkr->read(tkr->clock),
+                                       tkr->cycle_last,
+                                       tkr->mask));
        } while (read_seqcount_retry(&tkf->seq, seq));
 
        return now;
-- 
1.9.1

Reply via email to