Re: [PATCH v8 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-18 Thread Paul E. McKenney
On Sat, Apr 17, 2021 at 04:51:36PM -0700, Paul E. McKenney wrote:
> On Sat, Apr 17, 2021 at 02:47:18PM +0200, Thomas Gleixner wrote:

[ . . . ]

> > > + delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
> > > + if (delta < 0)
> > > + cpumask_set_cpu(cpu, &cpus_behind);
> > > + delta = (csnow_end - csnow_mid) & cs->mask;
> > > + if (delta < 0)
> > > + cpumask_set_cpu(cpu, &cpus_ahead);
> > > + delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
> > > + cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
> > 
> > > + if (firsttime || cs_nsec > cs_nsec_max)
> > > + cs_nsec_max = cs_nsec;
> > > + if (firsttime || cs_nsec < cs_nsec_min)
> > > + cs_nsec_min = cs_nsec;
> > > + firsttime = 0;
> > 
> >   int64_t cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;
> > 
> > and then the firsttime muck is not needed at all.
> 
> Good point, will fix!
> 
> And again, thank you for looking all of this over.

And overnight testing with a 50-microsecond WATCHDOG_MAX_SKEW was
uneventful.  However, I managed to miss printing when a retry was
necessary, so all that says is that no more than three retries were
ever required.  So I added test code to print a message whenever two
or more retries are required and restarted the tests.  Shorter run,
but more systems, so hopefully similar coverage.
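
For reference, a minimal sketch of what that retry-reporting test code might
look like (hypothetical; the function name, the max_cswd_read_retries limit,
and the message wording are assumptions here, not necessarily what is queued):

static bool wd_read_bounded(struct clocksource *cs, u64 *csnow, u64 *wdnow)
{
	unsigned int nretries;
	u64 wd_end, wd_delta;
	int64_t wd_delay;

	for (nretries = 0; nretries <= max_cswd_read_retries; nretries++) {
		local_irq_disable();
		*wdnow = watchdog->read(watchdog);
		*csnow = cs->read(cs);
		wd_end = watchdog->read(watchdog);
		local_irq_enable();

		wd_delta = clocksource_delta(wd_end, *wdnow, watchdog->mask);
		wd_delay = clocksource_cyc2ns(wd_delta, watchdog->mult,
					      watchdog->shift);
		if (wd_delay <= WATCHDOG_MAX_SKEW) {
			/* Report runs that needed two or more retries. */
			if (nretries > 1)
				pr_warn("timekeeping watchdog on CPU%d: %s required %u retries\n",
					smp_processor_id(), watchdog->name,
					nretries);
			return true;
		}
	}
	pr_warn("timekeeping watchdog on CPU%d: %s read-back delay too long, marking unstable\n",
		smp_processor_id(), watchdog->name);
	return false;
}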

If that works OK, I will resend the series this evening, Pacific Time.

Thanx, Paul


Re: [PATCH v8 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-17 Thread Paul E. McKenney
On Sat, Apr 17, 2021 at 02:47:18PM +0200, Thomas Gleixner wrote:
> On Tue, Apr 13 2021 at 21:36, Paul E. McKenney wrote:
> 
> Bah, hit send too quick.
> 
> > +   cpumask_clear(&cpus_ahead);
> > +   cpumask_clear(&cpus_behind);
> > +   preempt_disable();
> 
> Daft. 

Would migrate_disable() be better?

Yes, I know, in virtual environments, the hypervisor can migrate anyway,
but this does limit the potential damage to one out of the two schedulers.
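
Concretely, that substitution would just change the bracketing, something
like this (sketch only; it assumes the preempt_enable() that pairs with the
quoted preempt_disable() later in the function):

	cpumask_clear(&cpus_ahead);
	cpumask_clear(&cpus_behind);
	migrate_disable();
	testcpu = smp_processor_id();
	for_each_online_cpu(cpu) {
		/* ... per-CPU measurements as before ... */
	}
	migrate_enable();

The task can still be preempted but not migrated, so testcpu stays accurate
without keeping preemption off across the whole measurement loop.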

> > +   testcpu = smp_processor_id();
> > +   pr_warn("Checking clocksource %s synchronization from CPU %d.\n", 
> > cs->name, testcpu);
> > +   for_each_online_cpu(cpu) {
> > +   if (cpu == testcpu)
> > +   continue;
> > +   csnow_begin = cs->read(cs);
> > +   smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
> > +   csnow_end = cs->read(cs);
> 
> As this must run with interrupts enabled, that's a pretty rough
> approximation like measuring wind speed with a wet thumb.
> 
> Wouldn't it be smarter to let the remote CPU do the watchdog dance and
> take that result? i.e. split out more of the watchdog code so that you
> can get the nanoseconds delta on that remote CPU to the watchdog.

First, an interrupt, NMI, SMI, vCPU preemption, or whatever could
not cause a false positive.  A false negative, perhaps, but no
false positives.  Second, in normal operation, these are rare, so that
hitting the (eventual) default of eight CPUs is very likely to result in
tight bounds on the delay-based error for most of those CPUs.  Third,
we really need to compare the TSC on one CPU to the TSC on the other
in order to have a very clear indication of a problem, should a real
TSC-synchronization issue arise.  In contrast, comparisons against the
watchdog timer will be more complicated and any errors detected will be
quite hard to prove to be due to TSC issues.

Or am I once again missing something?
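
To put illustrative numbers on that: if the two TSCs really are synchronized,
the remote read csnow_mid is taken strictly between the two local reads, so
even a 1-millisecond SMI on the measuring CPU only widens the
[csnow_begin, csnow_end] window; csnow_begin <= csnow_mid <= csnow_end still
holds and neither delta can go negative.  Conversely, if the remote TSC is
(say) 10 microseconds behind but the IPI round trip takes 50 microseconds,
csnow_mid can still land inside the window and the skew goes unreported,
which is the false-negative case mentioned above.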

> > +   delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
> > +   if (delta < 0)
> > +   cpumask_set_cpu(cpu, &cpus_behind);
> > +   delta = (csnow_end - csnow_mid) & cs->mask;
> > +   if (delta < 0)
> > +   cpumask_set_cpu(cpu, &cpus_ahead);
> > +   delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
> > +   cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
> 
> > +   if (firsttime || cs_nsec > cs_nsec_max)
> > +   cs_nsec_max = cs_nsec;
> > +   if (firsttime || cs_nsec < cs_nsec_min)
> > +   cs_nsec_min = cs_nsec;
> > +   firsttime = 0;
> 
>   int64_t cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;
> 
> and then the firsttime muck is not needed at all.

Good point, will fix!

And again, thank you for looking all of this over.

Thanx, Paul


Re: [PATCH v8 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-17 Thread Paul E. McKenney
On Sat, Apr 17, 2021 at 02:28:22PM +0200, Thomas Gleixner wrote:
> On Tue, Apr 13 2021 at 21:36, Paul E. McKenney wrote:
> > diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> > index 1fc0962c89c0..97eeaf164296 100644
> > --- a/arch/x86/kernel/kvmclock.c
> > +++ b/arch/x86/kernel/kvmclock.c
> > @@ -169,7 +169,7 @@ struct clocksource kvm_clock = {
> > .read   = kvm_clock_get_cycles,
> > .rating = 400,
> > .mask   = CLOCKSOURCE_MASK(64),
> > -   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
> > +   .flags  = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_VERIFY_PERCPU,
> 
> kvm_clock is not marked with CLOCK_SOURCE_MUST_VERIFY, so what's the
> point of adding this here? It's not subject to be monitored by the
> watchdog muck.

Good point, removed.

> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index f70dffc2771f..56289170753c 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
> > .mask   = CLOCKSOURCE_MASK(64),
> > .flags  = CLOCK_SOURCE_IS_CONTINUOUS |
> >   CLOCK_SOURCE_VALID_FOR_HRES |
> > - CLOCK_SOURCE_MUST_VERIFY,
> > + CLOCK_SOURCE_MUST_VERIFY |
> > + CLOCK_SOURCE_VERIFY_PERCPU,
> 
> While this one is part of the horror show.

It probably would be good to decorate other arch's per-CPU clocks
with CLOCK_SOURCE_VERIFY_PERCPU, but no takers thus far.

Thanx, Paul

> > +static u64 csnow_mid;
> > +static cpumask_t cpus_ahead;
> > +static cpumask_t cpus_behind;
> > +
> > +static void clocksource_verify_one_cpu(void *csin)
> > +{
> > +   struct clocksource *cs = (struct clocksource *)csin;
> > +
> > +   csnow_mid = cs->read(cs);
> > +}
> > +
> > +static void clocksource_verify_percpu(struct clocksource *cs)
> > +{
> > +   int64_t cs_nsec, cs_nsec_max, cs_nsec_min;
> > +   u64 csnow_begin, csnow_end;
> > +   bool firsttime = 1;
> > +   int testcpu;
> > +   s64 delta;
> > +   int cpu;
> 
> int testcpu, cpu; :)
> 


Re: [PATCH v8 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-17 Thread Thomas Gleixner
On Tue, Apr 13 2021 at 21:36, Paul E. McKenney wrote:

Bah, hit send too quick.

> + cpumask_clear(&cpus_ahead);
> + cpumask_clear(&cpus_behind);
> + preempt_disable();

Daft. 

> + testcpu = smp_processor_id();
> + pr_warn("Checking clocksource %s synchronization from CPU %d.\n", 
> cs->name, testcpu);
> + for_each_online_cpu(cpu) {
> + if (cpu == testcpu)
> + continue;
> + csnow_begin = cs->read(cs);
> + smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
> + csnow_end = cs->read(cs);

As this must run with interrupts enabled, that's a pretty rough
approximation like measuring wind speed with a wet thumb.

Wouldn't it be smarter to let the remote CPU do the watchdog dance and
take that result? i.e. split out more of the watchdog code so that you
can get the nanoseconds delta on that remote CPU to the watchdog.
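
For illustration, one way that split might look (a rough sketch of the
suggestion only, not code from this series; it assumes the existing watchdog
pointer and the clocksource_delta()/clocksource_cyc2ns() helpers, and the
structure and function names are invented):

struct cs_verify_remote {
	struct clocksource *cs;
	u64 cs_base;		/* cs reading taken on the measuring CPU */
	u64 wd_base;		/* watchdog reading taken at the same time */
	int64_t cs_nsec;	/* filled in by the remote CPU */
	int64_t wd_nsec;	/* filled in by the remote CPU */
};

static void clocksource_verify_remote(void *arg)
{
	struct cs_verify_remote *cvr = arg;
	u64 csnow, wdnow, delta;

	/* The remote CPU reads both clocks itself, so the IPI latency
	 * does not end up inside the measurement. */
	csnow = cvr->cs->read(cvr->cs);
	wdnow = watchdog->read(watchdog);

	delta = clocksource_delta(csnow, cvr->cs_base, cvr->cs->mask);
	cvr->cs_nsec = clocksource_cyc2ns(delta, cvr->cs->mult, cvr->cs->shift);

	delta = clocksource_delta(wdnow, cvr->wd_base, watchdog->mask);
	cvr->wd_nsec = clocksource_cyc2ns(delta, watchdog->mult, watchdog->shift);
}

The measuring CPU would fill in cs_base/wd_base, run this via
smp_call_function_single(cpu, clocksource_verify_remote, &cvr, 1), and then
compare cs_nsec against wd_nsec; the error is bounded by the remote CPU's
back-to-back read latency rather than by the full IPI round trip.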

> + delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
> + if (delta < 0)
> + cpumask_set_cpu(cpu, &cpus_behind);
> + delta = (csnow_end - csnow_mid) & cs->mask;
> + if (delta < 0)
> + cpumask_set_cpu(cpu, &cpus_ahead);
> + delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
> + cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);

> + if (firsttime || cs_nsec > cs_nsec_max)
> + cs_nsec_max = cs_nsec;
> + if (firsttime || cs_nsec < cs_nsec_min)
> + cs_nsec_min = cs_nsec;
> + firsttime = 0;

  int64_t cs_nsec_max = 0, cs_nsec_min = LLONG_MAX;

and then the firsttime muck is not needed at all.
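
i.e., with those initial values the per-iteration updates reduce to just:

	if (cs_nsec > cs_nsec_max)
		cs_nsec_max = cs_nsec;
	if (cs_nsec < cs_nsec_min)
		cs_nsec_min = cs_nsec;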

Thanks,

tglx



Re: [PATCH v8 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-17 Thread Thomas Gleixner
On Tue, Apr 13 2021 at 21:36, Paul E. McKenney wrote:
> diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
> index 1fc0962c89c0..97eeaf164296 100644
> --- a/arch/x86/kernel/kvmclock.c
> +++ b/arch/x86/kernel/kvmclock.c
> @@ -169,7 +169,7 @@ struct clocksource kvm_clock = {
>   .read   = kvm_clock_get_cycles,
>   .rating = 400,
>   .mask   = CLOCKSOURCE_MASK(64),
> - .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
> + .flags  = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_VERIFY_PERCPU,

kvm_clock is not marked with CLOCK_SOURCE_MUST_VERIFY, so what's the
point of adding this here? It's not subject to be monitored by the
watchdog muck.

> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index f70dffc2771f..56289170753c 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
>   .mask   = CLOCKSOURCE_MASK(64),
>   .flags  = CLOCK_SOURCE_IS_CONTINUOUS |
> CLOCK_SOURCE_VALID_FOR_HRES |
> -   CLOCK_SOURCE_MUST_VERIFY,
> +   CLOCK_SOURCE_MUST_VERIFY |
> +   CLOCK_SOURCE_VERIFY_PERCPU,

While this one is part of the horror show.

> +static u64 csnow_mid;
> +static cpumask_t cpus_ahead;
> +static cpumask_t cpus_behind;
> +
> +static void clocksource_verify_one_cpu(void *csin)
> +{
> + struct clocksource *cs = (struct clocksource *)csin;
> +
> + csnow_mid = cs->read(cs);
> +}
> +
> +static void clocksource_verify_percpu(struct clocksource *cs)
> +{
> + int64_t cs_nsec, cs_nsec_max, cs_nsec_min;
> + u64 csnow_begin, csnow_end;
> + bool firsttime = 1;
> + int testcpu;
> + s64 delta;
> + int cpu;

int testcpu, cpu; :)



[PATCH v8 clocksource 3/5] clocksource: Check per-CPU clock synchronization when marked unstable

2021-04-13 Thread Paul E. McKenney
Some sorts of per-CPU clock sources have a history of going out of
synchronization with each other.  However, this problem has purportedly
been solved in the past ten years.  Except that it is all too possible
that the problem has instead simply been made less likely, which might
mean that some of the occasional "Marking clocksource 'tsc' as unstable"
messages might be due to desynchronization.  How would anyone know?

Therefore apply CPU-to-CPU synchronization checking to newly unstable
clocksources that are marked with the new CLOCK_SOURCE_VERIFY_PERCPU flag.
Lists of desynchronized CPUs are printed, with the caveat that if it
is the reporting CPU that is itself desynchronized, it will appear that
all the other clocks are wrong.  Just like in real life.

Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 
Cc: Jonathan Corbet 
Cc: Mark Rutland 
Cc: Marc Zyngier 
Cc: Andi Kleen 
Reported-by: Chris Mason 
[ paulmck: Add "static" to clocksource_verify_one_cpu() per kernel test robot 
feedback. ]
[ paulmck: Apply Thomas Gleixner feedback. ]
Signed-off-by: Paul E. McKenney 
---
 arch/x86/kernel/kvmclock.c  |  2 +-
 arch/x86/kernel/tsc.c   |  3 +-
 include/linux/clocksource.h |  2 +-
 kernel/time/clocksource.c   | 63 +
 4 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c
index 1fc0962c89c0..97eeaf164296 100644
--- a/arch/x86/kernel/kvmclock.c
+++ b/arch/x86/kernel/kvmclock.c
@@ -169,7 +169,7 @@ struct clocksource kvm_clock = {
.read   = kvm_clock_get_cycles,
.rating = 400,
.mask   = CLOCKSOURCE_MASK(64),
-   .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
+   .flags  = CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_VERIFY_PERCPU,
.enable = kvm_cs_enable,
 };
 EXPORT_SYMBOL_GPL(kvm_clock);
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index f70dffc2771f..56289170753c 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1151,7 +1151,8 @@ static struct clocksource clocksource_tsc = {
.mask   = CLOCKSOURCE_MASK(64),
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_VALID_FOR_HRES |
- CLOCK_SOURCE_MUST_VERIFY,
+ CLOCK_SOURCE_MUST_VERIFY |
+ CLOCK_SOURCE_VERIFY_PERCPU,
.vdso_clock_mode= VDSO_CLOCKMODE_TSC,
.enable = tsc_cs_enable,
.resume = tsc_resume,
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 86d143db6523..83a3ebff7456 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -131,7 +131,7 @@ struct clocksource {
 #define CLOCK_SOURCE_UNSTABLE  0x40
 #define CLOCK_SOURCE_SUSPEND_NONSTOP   0x80
 #define CLOCK_SOURCE_RESELECT  0x100
-
+#define CLOCK_SOURCE_VERIFY_PERCPU 0x200
 /* simplify initialization of mask field */
 #define CLOCKSOURCE_MASK(bits) GENMASK_ULL((bits) - 1, 0)
 
diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 3ac19a7859f5..69311deab428 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -211,6 +211,63 @@ static void clocksource_watchdog_inject_delay(void)
WARN_ON_ONCE(injectfail < 0);
 }
 
+static u64 csnow_mid;
+static cpumask_t cpus_ahead;
+static cpumask_t cpus_behind;
+
+static void clocksource_verify_one_cpu(void *csin)
+{
+   struct clocksource *cs = (struct clocksource *)csin;
+
+   csnow_mid = cs->read(cs);
+}
+
+static void clocksource_verify_percpu(struct clocksource *cs)
+{
+   int64_t cs_nsec, cs_nsec_max, cs_nsec_min;
+   u64 csnow_begin, csnow_end;
+   bool firsttime = 1;
+   int testcpu;
+   s64 delta;
+   int cpu;
+
+   cpumask_clear(&cpus_ahead);
+   cpumask_clear(&cpus_behind);
+   preempt_disable();
+   testcpu = smp_processor_id();
+   pr_warn("Checking clocksource %s synchronization from CPU %d.\n", 
cs->name, testcpu);
+   for_each_online_cpu(cpu) {
+   if (cpu == testcpu)
+   continue;
+   csnow_begin = cs->read(cs);
+   smp_call_function_single(cpu, clocksource_verify_one_cpu, cs, 1);
+   csnow_end = cs->read(cs);
+   delta = (s64)((csnow_mid - csnow_begin) & cs->mask);
+   if (delta < 0)
+   cpumask_set_cpu(cpu, &cpus_behind);
+   delta = (csnow_end - csnow_mid) & cs->mask;
+   if (delta < 0)
+   cpumask_set_cpu(cpu, &cpus_ahead);
+   delta = clocksource_delta(csnow_end, csnow_begin, cs->mask);
+   cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
+   if (firsttime || cs_nsec > cs_nsec_max)
+   cs_nsec_max = cs_nsec;
+   if (firsttime || cs_nsec < cs_nsec_min)
+   cs_nsec_min = cs_nsec;
+   firsttime = 0;