On Thu, Aug 07, 2025 at 08:27:32PM +0000, Michael Kelley wrote:
> From: wei....@kernel.org <wei....@kernel.org> Sent: Thursday, August 7, 2025 9:59 AM
> > 
> > There is no HV_ACCESS_TSC_INVARIANT bit when Linux runs as the root
> > partition. 
> 
> Some clarifying questions here: When you say "there is no
> HV_ACCESS_TSC_INVARIANT bit", does that mean that bit 15 of the
> HV_PARTITION_PRIVILEGE_MASK is just unused and undefined?

The HV_ACCESS_TSC_INVARIANT bit is still defined, but it is always zero
for the root partition. I can modify the commit message and code comment
to clarify that.

> 
> And what is the behavior if the root partition writes to
> HV_X64_MSR_TSC_INVARIANT_CONTROL? In a normal x86 guest,
> HV_X64_MSR_TSC_INVARIANT_CONTROL determines whether
> CPUID 0x80000007/EDX bit 8 is set. What will the root partition see
> for CPUID 0x80000007/EDX bit 8? Whatever the underlying hardware
> provides? See also the comment in ms_hyperv_init_platform().
> 

The root partition sees whatever the underlying hardware provides. It
doesn't need to write to that MSR.

I think it should be fine to skip the code in ms_hyperv_init_platform().
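
To make the intended behavior concrete, here is a rough user-space model
of the two checks the patch changes. It is only a sketch: the helper
names should_mark_tsc_unstable() and should_lower_hv_rating() are
hypothetical and do not exist in the kernel, "is_root" stands in for
hv_root_partition(), and the bit position is taken from the "bit 15"
mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 15 of the partition privilege mask, as discussed in this thread. */
#define HV_ACCESS_TSC_INVARIANT (UINT32_C(1) << 15)

/*
 * Hypothetical helper modelling the check in ms_hyperv_init_platform():
 * the root partition never has the bit set, so the bit is only
 * consulted for guest partitions.
 */
static bool should_mark_tsc_unstable(uint32_t features, bool is_root)
{
	return !is_root && !(features & HV_ACCESS_TSC_INVARIANT);
}

/*
 * Hypothetical helper modelling the check in hv_init_tsc_clocksource():
 * lower the Hyper-V clocksource ratings when the guest is offered
 * TSC_INVARIANT, or unconditionally when running as root.
 */
static bool should_lower_hv_rating(uint32_t features, bool is_root)
{
	return (features & HV_ACCESS_TSC_INVARIANT) || is_root;
}
```

With features == 0, the root case leaves the TSC alone and lowers the
Hyper-V ratings, while a guest without the bit keeps the old behavior.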

Thanks,
Wei

> Michael
> 
> > The old logic caused the native TSC clock source to be
> > incorrectly marked as unstable on x86.
> > 
> > The clock source driver runs on both x86 and ARM64. Change it to prefer
> > architectural counter when it runs on Linux root.
> > 
> > Signed-off-by: Wei Liu <wei....@kernel.org>
> > ---
> > Cc: Michael Kelley <mhkli...@outlook.com>
> > 
> > Pending further testing.
> > 
> > The preference of architectural counter over Hyper-V Reference TSC for
> > Linux root is confirmed by the hypervisor team.
> > ---
> >  arch/x86/kernel/cpu/mshyperv.c     |  6 +++++-
> >  drivers/clocksource/hyperv_timer.c | 10 +++++++++-
> >  2 files changed, 14 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index fd708180d2d9..1713545dcf4a 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -966,8 +966,12 @@ static void __init ms_hyperv_init_platform(void)
> >      * TSC should be marked as unstable only after Hyper-V
> >      * clocksource has been initialized. This ensures that the
> >      * stability of the sched_clock is not altered.
> > +    *
> > +    * The root partition doesn't see HV_ACCESS_TSC_INVARIANT.
> > +    * No need to check for it.
> >      */
> > -   if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
> > +   if (!hv_root_partition() &&
> > +       !(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
> >             mark_tsc_unstable("running on Hyper-V");
> > 
> >     hardlockup_detector_disable();
> > diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> > index f6415e726e96..59c3e09f1961 100644
> > --- a/drivers/clocksource/hyperv_timer.c
> > +++ b/drivers/clocksource/hyperv_timer.c
> > @@ -534,14 +534,22 @@ static void __init hv_init_tsc_clocksource(void)
> >     union hv_reference_tsc_msr tsc_msr;
> > 
> >     /*
> > +    * When running as a guest partition:
> > +    *
> >      * If Hyper-V offers TSC_INVARIANT, then the virtualized TSC correctly
> >      * handles frequency and offset changes due to live migration,
> >      * pause/resume, and other VM management operations.  So lower the
> >      * Hyper-V Reference TSC rating, causing the generic TSC to be used.
> >      * TSC_INVARIANT is not offered on ARM64, so the Hyper-V Reference
> >      * TSC will be preferred over the virtualized ARM64 arch counter.
> > +    *
> > +    * When running as the root partition:
> > +    *
> > +    * There is no HV_ACCESS_TSC_INVARIANT feature. Always prefer the
> > +    * architectural defined counter over the Hyper-V Reference TSC.
> >      */
> > -   if (ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) {
> > +   if ((ms_hyperv.features & HV_ACCESS_TSC_INVARIANT) ||
> > +       hv_root_partition()) {
> >             hyperv_cs_tsc.rating = 250;
> >             hyperv_cs_msr.rating = 245;
> >     }
> > --
> > 2.43.0
> 
