Re: [PATCH v3] x86/tsc: add X86_FEATURE_TSC_KNOWN_FREQ flag

2016-10-21 Thread Thomas Gleixner
On Fri, 21 Oct 2016, Peter Zijlstra wrote:
> On Thu, Oct 20, 2016 at 09:37:50PM +0200, Thomas Gleixner wrote:
> 
> > Well, we have the same issue on other platforms/models which set the
> > reliable flag.
> 
> I was not aware we had other platforms doing this, git grep tells me
> intel-mid does this as well..
> 
> > So one sanity check we can do is to read the IA32_TSC_ADJUST MSR on all
> > cores. They should all have the same value (usually 0) or at least have a
> > very minimal delta. If that's off by more than 1us then something is fishy
> > especially on single socket systems. We could at least WARN about it.
> > 
> > We could do this in idle occasionally as well, so we can detect the dreaded
> > "SMI wants to hide the cycles" crapola.
> 
> Indeed, that sounds like the best we can; and probably should; do.

I'll have a look at that in the next days.

Thanks,

tglx


Re: [PATCH v3] x86/tsc: add X86_FEATURE_TSC_KNOWN_FREQ flag

2016-10-20 Thread Peter Zijlstra
On Thu, Oct 20, 2016 at 09:37:50PM +0200, Thomas Gleixner wrote:

> Well, we have the same issue on other platforms/models which set the
> reliable flag.

I was not aware we had other platforms doing this, git grep tells me
intel-mid does this as well..

> So one sanity check we can do is to read the IA32_TSC_ADJUST MSR on all
> cores. They should all have the same value (usually 0) or at least have a
> very minimal delta. If that's off by more than 1us then something is fishy
> especially on single socket systems. We could at least WARN about it.
> 
> We could do this in idle occasionally as well, so we can detect the dreaded
> "SMI wants to hide the cycles" crapola.

Indeed, that sounds like the best we can; and probably should; do.


Re: [PATCH v3] x86/tsc: add X86_FEATURE_TSC_KNOWN_FREQ flag

2016-10-20 Thread Thomas Gleixner
On Thu, 20 Oct 2016, Peter Zijlstra wrote:
> On Thu, Oct 20, 2016 at 11:57:03AM +0200, Thomas Gleixner wrote:
> > On Thu, 13 Oct 2016, Bin Gao wrote:
> > > @@ -702,6 +702,15 @@ unsigned long native_calibrate_tsc(void)
> > >  		}
> > >  	}
> > >  
> > > +	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
> > > +
> > > +	/*
> > > +	 * For Atom SoCs TSC is the only reliable clocksource.
> > > +	 * Mark TSC reliable so no watchdog on it.
> > > +	 */
> > > +	if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
> > > +		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > > +
> 
> AFAICT setting TSC_RELIABLE also skips the check_tsc_warp() tests in
> tsc_sync.c.
> 
> This means that if someone does a Goldmont BIOS with 'features', we'll
> never detect the wreckage :-/

Well, we have the same issue on other platforms/models which set the
reliable flag.

So one sanity check we can do is to read the IA32_TSC_ADJUST MSR on all
cores. They should all have the same value (usually 0) or at least have a
very minimal delta. If that's off by more than 1us then something is fishy
especially on single socket systems. We could at least WARN about it.

We could do this in idle occasionally as well, so we can detect the dreaded
"SMI wants to hide the cycles" crapola.

Thanks,

tglx
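
[Illustration, not part of the original message.] The sanity check described above could be sketched roughly as follows. This is plain C rather than kernel code, and it assumes the per-CPU IA32_TSC_ADJUST values (MSR 0x3B) have already been read into an array. The helper names tsc_adjust_spread() and tsc_adjust_suspicious() and the tsc_khz parameter are hypothetical and do not come from the patch. The 1us threshold is the one suggested in the mail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest difference, in TSC cycles, between any two
 * per-CPU IA32_TSC_ADJUST values. The values are assumed to have been
 * collected elsewhere (for example with one rdmsr per CPU).
 */
uint64_t tsc_adjust_spread(const int64_t *adj, size_t ncpus)
{
	if (ncpus == 0)
		return 0;

	int64_t min = adj[0], max = adj[0];

	for (size_t i = 1; i < ncpus; i++) {
		if (adj[i] < min)
			min = adj[i];
		if (adj[i] > max)
			max = adj[i];
	}

	/* Unsigned subtraction avoids signed overflow for extreme values. */
	return (uint64_t)max - (uint64_t)min;
}

/*
 * Hypothetical helper: true if the spread exceeds one microsecond worth of
 * TSC cycles, the threshold at which the mail suggests a WARN.
 */
bool tsc_adjust_suspicious(const int64_t *adj, size_t ncpus, uint64_t tsc_khz)
{
	uint64_t cycles_per_us = tsc_khz / 1000;

	return tsc_adjust_spread(adj, ncpus) > cycles_per_us;
}
```

With a 1 GHz TSC (tsc_khz = 1000000) one microsecond is 1000 cycles, so a spread of 900 cycles would pass and a spread of 2000 cycles would be flagged. Running the same comparison occasionally from idle, as proposed, would only need the array to be refreshed before each call.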


Re: [PATCH v3] x86/tsc: add X86_FEATURE_TSC_KNOWN_FREQ flag

2016-10-20 Thread Peter Zijlstra
On Thu, Oct 20, 2016 at 11:57:03AM +0200, Thomas Gleixner wrote:
> On Thu, 13 Oct 2016, Bin Gao wrote:
> > @@ -702,6 +702,15 @@ unsigned long native_calibrate_tsc(void)
> >  		}
> >  	}
> >  
> > +	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
> > +
> > +	/*
> > +	 * For Atom SoCs TSC is the only reliable clocksource.
> > +	 * Mark TSC reliable so no watchdog on it.
> > +	 */
> > +	if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
> > +		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> > +

AFAICT setting TSC_RELIABLE also skips the check_tsc_warp() tests in
tsc_sync.c.

This means that if someone does a Goldmont BIOS with 'features', we'll
never detect the wreckage :-/


Re: [PATCH v3] x86/tsc: add X86_FEATURE_TSC_KNOWN_FREQ flag

2016-10-20 Thread Thomas Gleixner
On Thu, 13 Oct 2016, Bin Gao wrote:
> @@ -702,6 +702,15 @@ unsigned long native_calibrate_tsc(void)
>  		}
>  	}
>  
> +	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
> +
> +	/*
> +	 * For Atom SoCs TSC is the only reliable clocksource.
> +	 * Mark TSC reliable so no watchdog on it.
> +	 */
> +	if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
> +		setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> +

Right. That's what I wanted to see, but please split this into two patches:

  #1 Split the TSC flags
  #2 Set the flag for Goldmont

We do not mix design changes with hw support changes.

Thanks,

tglx
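
[Illustration, not part of the original message.] The reason for keeping the two flags separate can be sketched as a small decision table. This is a standalone C model, not kernel code: struct tsc_caps, struct tsc_policy and tsc_policy_for() are hypothetical names. The effects of the reliable flag (no clocksource watchdog, no check_tsc_warp() test) are taken from the thread. That the known-frequency flag lets the kernel skip runtime frequency calibration is an assumption based on the flag's name.

```c
#include <stdbool.h>

/* Hypothetical model of the two capability flags discussed in the thread. */
struct tsc_caps {
	bool known_freq;	/* models X86_FEATURE_TSC_KNOWN_FREQ */
	bool reliable;		/* models X86_FEATURE_TSC_RELIABLE */
};

/* Hypothetical summary of what each flag allows the kernel to skip. */
struct tsc_policy {
	bool skip_calibration;	/* assumption: frequency already known */
	bool skip_watchdog;	/* from the thread: "no watchdog on it" */
	bool skip_warp_check;	/* from the thread: check_tsc_warp() skipped */
};

struct tsc_policy tsc_policy_for(struct tsc_caps caps)
{
	struct tsc_policy p = {
		.skip_calibration = caps.known_freq,
		.skip_watchdog = caps.reliable,
		.skip_warp_check = caps.reliable,
	};

	return p;
}
```

In this model a platform that only knows its TSC frequency keeps both the watchdog and the warp check, while a model-specific change such as the Goldmont one additionally sets the reliable flag. Splitting the work into two patches mirrors that separation.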