On Tue, May 19, 2026, David Woodhouse wrote:
> On Fri, 2026-05-15 at 12:19 -0700, Sean Christopherson wrote:
> > Dave/Thomas/Peter/Boris, what's the going rate for bribes to take something
> > like this through the tip tree?
> > 
> > The bulk of the changes are in kvmclock and TSC, but pretty much every
> > hypervisor's guest-side code gets touched at some point.  I am reaonsably
> > confident in the correctness of the KVM changes.  Michael tested Hyper-V in
> > v2, and while there were conflicts when rebasing, they were largely
> > superficial (and I've just jinxed myself).  For all other hypervisors, 
> > assume
> > the code is compile-tested only, but those changes are all quite small and
> > straightforward.
> > 
> > The only changes that are questionable/contentious are the last two patches,
> > which have KVM-as-a-guest use CPUID 0x16 to get the CPU frequency, even on
> > AMD (that's the dubious part).  I very deliberately put them last, so that
> > they can be dropped at will (I don't care terribly if those patches land).
> > To merge them, I would want explicit Acks from Paolo and David W.
> > 
> > So, except for the last two patches, to get the stuff I really care about
> > landed, I think/hope it's just the TSC and guest-side CoCo changes that need
> > reviews/acks?
> > 
> > The primary goal of this series is (or at least was, when I started) to
> > fix flaws with SNP and TDX guests where a PV clock provided by the untrusted
> > hypervisor is used instead of the secure/trusted TSC that is controlled by
> > trusted firmware.
> > 
> > The secondary goal is to draft off of the SNP and TDX changes to slightly
> > modernize running under KVM.  Currently, KVM guests will use TSC for
> > clocksource, but not sched_clock.  And they ignore Intel's CPUID-based TSC
> > and CPU frequency enumeration, even when using the TSC instead of kvmclock.
> > And if the host provides the core crystal frequency in CPUID.0x15, then KVM
> > guests can use that for the APIC timer period instead of manually 
> > calibrating
> > the frequency.
> > 
> > The tertiary goal is to clean up all of the PV clock code to deduplicate 
> > logic
> > across hypervisors, and to hopefully make it all easier to maintain going
> > forward.
> 
> I booted this in qemu with -cpu host,+invtsc,+vmware-cpuid-freq
> 
> I was expecting to see it eschew the kvmclock and use *only* the TSC.
> Is there even any need for 'tsc-early' given that it's *told* the TSC
> frequency in CPUID? Shouldn't it have detected that the TSC is known
> before init_tsc_clocksource() runs?
>
> And then it even spent some time at boot actually using the kvmclock as
> clocksource... when ideally I don't think it would even have *enabled*
> it at all?

Yeah, that's definitely the ideal state.  And I had all the same expectations 
and
observations as you when digging in and testing this.  But unless this series
makes things worse, I want punt on achieving the ideal state for the moment, as
it's proving to be a big lift just to get to a not-awful state.

> [    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 
> 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [    0.000000] tsc: Detected 2400.000 MHz processor
> [    0.008205] TSC deadline timer available
> [    0.008270] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 
> 0xffffffff, max_idle_ns: 1910969940391419 ns
> [    0.159085] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, 
> max_idle_ns: 19112604467 ns
> [    0.164074] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 
> 0x22983777dd9, max_idle_ns: 440795300422 ns
> [    0.229087] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, 
> max_idle_ns: 1911260446275000 ns
> [    0.337095] clocksource: Switched to clocksource kvm-clock
> [    0.345246] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, 
> max_idle_ns: 2085701024 ns
> [    0.356201] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 
> 0x22983777dd9, max_idle_ns: 440795300422 ns
> [    0.360560] clocksource: Switched to clocksource tsc
> 



Reply via email to