On Wed, May 20, 2026, David Woodhouse wrote:
> On Wed, 2026-05-20 at 13:44 -0700, Sean Christopherson wrote:
> > 
> > +       /*
> > +        * If the TSC counts at a constant frequency across P/T states, 
> > counts
> > +        * in deep C-states, and the TSC hasn't been marked unstable, treat 
> > the
> > +        * TSC reliable, as guaranteed by KVM.  Note, the TSC unstable check
> > +        * exists purely to honor the TSC being marked unstable via command
> > +        * line, any runtime detection of an unstable will happen after 
> > this.
> > +        */
> > +       if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> > +           boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> > +           !check_tsc_unstable())
>     { 
> > +               tsc_properties = TSC_FREQ_KNOWN_AND_RELIABLE;
> 
>     kvmclock = 0; /* Why use it if the TSC works? The kvmclock exists
>                      *purely* to work around a TSC which *doesn't*
>                      have those properties checked above. */

kvmclock still provides SYSTEM_TIME and WALL_CLOCK :-/

>     }
> 
> I was going to say PVCLOCK_TSC_STABLE_BIT, and maybe we should check
> that *too* for paranoia?

No? PVCLOCK_TSC_STABLE is property of kvmclock more than it's a property of the
TSC itself.  And for the no-kvmclock case, we most definitely don't want to 
setup
kvmclock just to query that flag.

> But hopefully the checks you have above are equivalent?

They aren't as paranoid, but if the host enumerates CONSTANT+NONSTOP TSC despite
KVM-the-host not being able to advertise PVCLOCK_TSC_STABLE_BIT, then the VMM 
done
messed up.

> > +
> > +       kvm_tsc_khz_cpuid = kvm_para_tsc_khz();
> > +
> > +       /*
> > +        * If provided, use the TSC (and APIC bus) frequency provided in 
> > KVM's
> > +        * PV CPUID leaf even if kvmclock itself is disabled via command 
> > line.
> > +        * The PV CPUID information isn't dependent on kvmclock in any way, 
> > and
> > +        * in fact using the precise information is *more* important when 
> > the
> > +        * user has explicitly disabled kvmclock to force the kernel to use 
> > the
> > +        * TSC as its clocksource.
> > +        */
> > +       if (!kvmclock) {
> > +               if (kvm_tsc_khz_cpuid)
> > +                       tsc_register_calibration_routines(kvm_get_tsc_khz,
> > +                                                         kvm_get_cpu_khz,
> > +                                                         tsc_properties);
> > +               return;
> > +       }
> > +
> 
> 
> Regardless of the above, why not just register these here
> unconditionally, and remove the later call that does the same?

Because if kvmclock=n, it's only safe to call kvm_get_tsc_khz() if 
kvm_tsc_khz_cpuid
is non-zero, other wise the "else" path will hit a NULL pointer deref when 
trying
to get the frequency from the PV clock struct:

        return kvm_tsc_khz_cpuid ? : pvclock_tsc_khz(this_cpu_pvti());

Reply via email to