Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data

2025-05-16 Thread Jan Beulich
On 16.05.2025 12:35, Penny, Zheng wrote:
>> -Original Message-
>> From: Jan Beulich 
>> Sent: Monday, April 28, 2025 11:57 PM
>>
>> On 14.04.2025 09:40, Penny Zheng wrote:
>>> @@ -459,6 +464,26 @@ struct xen_processor_performance {  typedef
>>> struct xen_processor_performance xen_processor_performance_t;
>>> DEFINE_XEN_GUEST_HANDLE(xen_processor_performance_t);
>>>
>>> +struct xen_processor_cppc {
>>> +uint8_t flags; /* flag for CPPC sub info type */
>>> +/*
>>> + * Subset _CPC fields useful for CPPC-compatible cpufreq
>>> + * driver's initialization
>>> + */
>>> +struct {
>>> +uint32_t highest_perf;
>>> +uint32_t nominal_perf;
>>> +uint32_t lowest_nonlinear_perf;
>>> +uint32_t lowest_perf;
>>> +uint32_t lowest_mhz;
>>> +uint32_t nominal_mhz;
>>> +} cpc;
>>> +struct xen_psd_package domain_info; /* _PSD */
>>
>> This being a member of the new type, ...
>>
>>> --- a/xen/include/xlat.lst
>>> +++ b/xen/include/xlat.lst
>>> @@ -168,6 +168,7 @@
>>>  !  processor_performance   platform.h
>>>  !  processor_power platform.h
>>>  ?  processor_pxplatform.h
>>> +?  processor_cppc  platform.h
>>
>> ... how can it be ? here when it's ...
>>
>>>  !  psd_package platform.h
>>
>> ... ! here? And with it being ?, you're lacking a place where you invoke the 
>> resulting
>> checking macro (which I assume would cause a build failure).

I guess this wasn't clear enough then: Aiui you cannot "check" these, because 
the
native and compat ones are going to be different. You'll need to use ! here, and
then use the respective XLAT_* macro(s). IOW ...

> Understood, I see it automatically generates CHECK_psd_package. I shall 
> change psd_package with ? too
> In order to avoid causing build failure, I add "typedef struct 
> xen_psd_package xen_psd_package_t;"
> I'm not familiar with the compat framework, if it isn't the right way to fix, 
> plz let me know

... I expect this isn't the correct way of dealing with it.

Jan



RE: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data

2025-05-16 Thread Penny, Zheng
[Public]

Hi

> -Original Message-
> From: Jan Beulich 
> Sent: Monday, April 28, 2025 11:57 PM
> To: Penny, Zheng 
> Cc: Huang, Ray ; Andrew Cooper
> ; Roger Pau Monné ;
> Anthony PERARD ; Orzel, Michal
> ; Julien Grall ; Stefano Stabellini
> ; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to 
> propagate
> CPPC data
>
> On 14.04.2025 09:40, Penny Zheng wrote:
> > @@ -459,6 +464,26 @@ struct xen_processor_performance {  typedef
> > struct xen_processor_performance xen_processor_performance_t;
> > DEFINE_XEN_GUEST_HANDLE(xen_processor_performance_t);
> >
> > +struct xen_processor_cppc {
> > +uint8_t flags; /* flag for CPPC sub info type */
> > +/*
> > + * Subset _CPC fields useful for CPPC-compatible cpufreq
> > + * driver's initialization
> > + */
> > +struct {
> > +uint32_t highest_perf;
> > +uint32_t nominal_perf;
> > +uint32_t lowest_nonlinear_perf;
> > +uint32_t lowest_perf;
> > +uint32_t lowest_mhz;
> > +uint32_t nominal_mhz;
> > +} cpc;
> > +struct xen_psd_package domain_info; /* _PSD */
>
> This being a member of the new type, ...
>
> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -168,6 +168,7 @@
> >  !  processor_performance   platform.h
> >  !  processor_power platform.h
> >  ?  processor_pxplatform.h
> > +?  processor_cppc  platform.h
>
> ... how can it be ? here when it's ...
>
> >  !  psd_package platform.h
>
> ... ! here? And with it being ?, you're lacking a place where you invoke the 
> resulting
> checking macro (which I assume would cause a build failure).
>

Understood, I see it automatically generates CHECK_psd_package. I shall change 
psd_package with ? too
In order to avoid causing build failure, I add "typedef struct xen_psd_package 
xen_psd_package_t;"
I'm not familiar with the compat framework, if it isn't the right way to fix, 
plz let me know

> Also when laying out struct xen_processor_cppc, please avoid unnecessary gaps
> or tail padding - it looks like "shared_type" would better move up. I think 
> it would
> also be a good idea to make padding fields explicit, and check them to be 
> zero. This
> way they can be assigned meaning later (if need
> be) without breaking backwards compatibility.
>
> Jan


Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data

2025-05-12 Thread Jan Beulich
On 06.05.2025 11:11, Penny, Zheng wrote:
>> -Original Message-
>> From: Jan Beulich 
>> Sent: Monday, April 28, 2025 11:57 PM
>>
>> On 14.04.2025 09:40, Penny Zheng wrote:
>>> +if ( cppc_data->flags & XEN_CPPC_CPC )
>>> +{
>>> +if ( cppc_data->cpc.highest_perf == 0 ||
>>> + cppc_data->cpc.highest_perf > UINT8_MAX ||
>>> + cppc_data->cpc.nominal_perf == 0 ||
>>> + cppc_data->cpc.nominal_perf > UINT8_MAX ||
>>> + cppc_data->cpc.lowest_nonlinear_perf == 0 ||
>>> + cppc_data->cpc.lowest_nonlinear_perf > UINT8_MAX ||
>>> + cppc_data->cpc.lowest_perf == 0 ||
>>> + cppc_data->cpc.lowest_perf > UINT8_MAX ||
>>> + cppc_data->cpc.lowest_perf >
>>> +cppc_data->cpc.lowest_nonlinear_perf ||
>>
>> Where's this ordering spelled out in the spec?
>>
> 
> Clip a snippet from description on lowest performance[1], we may deduce that
> ```
> Selecting a performance level lower than the lowest nonlinear performance 
> level may actually cause an efficiency penalty,
> but should reduce the instantaneous power consumption of the processor
> ```
> lowest is smaller than lowest nonlinear

I can't imply that from the quoted sentence. It describes what happens in that
situation, but it doesn't exclude the opposite relationship (in which case the
described situation simply can't occur).

>>> + cppc_data->cpc.lowest_nonlinear_perf >
>>> +cppc_data->cpc.nominal_perf ||
>>> + cppc_data->cpc.nominal_perf > cppc_data->cpc.highest_perf )
>>> +/*
>>> + * Right now, Xen doesn't actually use perf values
>>> + * in ACPI _CPC table, warning is enough.
>>> + */
>>> +printk(XENLOG_WARNING
>>> +   "Broken CPPC perf values: lowest(%u), 
>>> nonlinear_lowest(%u),
>> nominal(%u), highest(%u)\n",
>>> +   cppc_data->cpc.lowest_perf,
>>> +   cppc_data->cpc.lowest_nonlinear_perf,
>>> +   cppc_data->cpc.nominal_perf,
>>> +   cppc_data->cpc.highest_perf);
>>
>> If this warning was to ever surface, it would likely surface for every CPU.
>> That's unnecessarily verbose, I guess. Please consider using printk_once() 
>> here.
>>
> 
> Understood
> 
>> Also, is "right now" (as the comment says) still going to be true by the end 
>> of the
>> series? Didn't I see you use the values in earlier versions?
>>
> 
> The reason why I added this comment is that in current implementation, we 
> actually
> don't use values read from ACPI _CPC table for lowest_perf, 
> lowest_nonlinear_perf,
> nominal_perf, and highest_perf.
> We read CPPC capability MSR to get these four values.

Oh, okay. Could you slightly extend that comment to include this detail?

>>> +if ( cppc_data->flags == (XEN_CPPC_PSD | XEN_CPPC_CPC) )
>>
>> If either flag may be clear, ...
>>
>>> +{
>>> +pm_info->cppc_data = *cppc_data;
>>> +if ( cpufreq_verbose )
>>> +{
>>> +print_PSD(&pm_info->cppc_data.domain_info);
>>> +print_CPPC(&pm_info->cppc_data);
>>
>> ... why unconditionally loog both?
>>
>>> +}
>>> +
>>> +pm_info->init = XEN_CPPC_INIT;
>>
>> Plus is it correct to set this flag if either of the incoming flags was 
>> clear?
> 
> Hmm, I may not understand what you mean here...
> I log and set this flag only when both flags are set (cppc_data->flags == 
> (XEN_CPPC_PSD | XEN_CPPC_CPC))
> _PSD entry and _CPC entry are both mandatory
> Did you suggest that we shall give warning message when either flag is clear?

Oh, sorry - I read & where you have == actually. Hence why I thought only
one of the flags may be set. Please disregard those comments of mine.

Jan



RE: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data

2025-05-06 Thread Penny, Zheng
[Public]

Hi,

> -Original Message-
> From: Jan Beulich 
> Sent: Monday, April 28, 2025 11:57 PM
> To: Penny, Zheng 
> Cc: Huang, Ray ; Andrew Cooper
> ; Roger Pau Monné ;
> Anthony PERARD ; Orzel, Michal
> ; Julien Grall ; Stefano Stabellini
> ; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to 
> propagate
> CPPC data
>
> On 14.04.2025 09:40, Penny Zheng wrote:
> > @@ -459,6 +464,26 @@ struct xen_processor_performance {  typedef
> > struct xen_processor_performance xen_processor_performance_t;
> > DEFINE_XEN_GUEST_HANDLE(xen_processor_performance_t);
> >
> > +struct xen_processor_cppc {
> > +uint8_t flags; /* flag for CPPC sub info type */
> > +/*
> > + * Subset _CPC fields useful for CPPC-compatible cpufreq
> > + * driver's initialization
> > + */
> > +struct {
> > +uint32_t highest_perf;
> > +uint32_t nominal_perf;
> > +uint32_t lowest_nonlinear_perf;
> > +uint32_t lowest_perf;
> > +uint32_t lowest_mhz;
> > +uint32_t nominal_mhz;
> > +} cpc;
> > +struct xen_psd_package domain_info; /* _PSD */
>
> This being a member of the new type, ...
>
> > --- a/xen/include/xlat.lst
> > +++ b/xen/include/xlat.lst
> > @@ -168,6 +168,7 @@
> >  !  processor_performance   platform.h
> >  !  processor_power platform.h
> >  ?  processor_pxplatform.h
> > +?  processor_cppc  platform.h
>
> ... how can it be ? here when it's ...
>
> >  !  psd_package platform.h
>
> ... ! here? And with it being ?, you're lacking a place where you invoke the 
> resulting
> checking macro (which I assume would cause a build failure).
>
> Also when laying out struct xen_processor_cppc, please avoid unnecessary gaps
> or tail padding - it looks like "shared_type" would better move up. I think 
> it would
> also be a good idea to make padding fields explicit, and check them to be 
> zero.
> This way they can be assigned meaning later (if need
> be) without breaking backwards compatibility.
>

Understood, I'll re-construct into increasing order and add explicit padding:
```
struct xen_processor_cppc {
uint8_t flags; /* flag for CPPC sub info type */
uint8_t pad[3]; /* padding and must be zero */
/*
 * Subset _CPC fields useful for CPPC-compatible cpufreq
 * driver's initialization
 */
struct {
uint32_t highest_perf;
uint32_t nominal_perf;
uint32_t lowest_nonlinear_perf;
uint32_t lowest_perf;
uint32_t lowest_mhz;
uint32_t nominal_mhz;
} cpc;
/* Coordination type of this processor */
uint32_t shared_type;
struct xen_psd_package domain_info; /* _PSD */
};
```
> Jan


RE: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data

2025-05-06 Thread Penny, Zheng
[Public]

HI

> -Original Message-
> From: Jan Beulich 
> Sent: Monday, April 28, 2025 11:57 PM
> To: Penny, Zheng 
> Cc: Huang, Ray ; Andrew Cooper
> ; Roger Pau Monné ;
> Anthony PERARD ; Orzel, Michal
> ; Julien Grall ; Stefano Stabellini
> ; xen-devel@lists.xenproject.org
> Subject: Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to 
> propagate
> CPPC data
>
> On 14.04.2025 09:40, Penny Zheng wrote:
> > +if ( cppc_data->flags & XEN_CPPC_CPC )
> > +{
> > +if ( cppc_data->cpc.highest_perf == 0 ||
> > + cppc_data->cpc.highest_perf > UINT8_MAX ||
> > + cppc_data->cpc.nominal_perf == 0 ||
> > + cppc_data->cpc.nominal_perf > UINT8_MAX ||
> > + cppc_data->cpc.lowest_nonlinear_perf == 0 ||
> > + cppc_data->cpc.lowest_nonlinear_perf > UINT8_MAX ||
> > + cppc_data->cpc.lowest_perf == 0 ||
> > + cppc_data->cpc.lowest_perf > UINT8_MAX ||
> > + cppc_data->cpc.lowest_perf >
> > +cppc_data->cpc.lowest_nonlinear_perf ||
>
> Where's this ordering spelled out in the spec?
>

Clip a snippet from description on lowest performance[1], we may deduce that
```
Selecting a performance level lower than the lowest nonlinear performance level 
may actually cause an efficiency penalty,
but should reduce the instantaneous power consumption of the processor
```
lowest is smaller than lowest nonlinear

> > + cppc_data->cpc.lowest_nonlinear_perf >
> > +cppc_data->cpc.nominal_perf ||
> > + cppc_data->cpc.nominal_perf > cppc_data->cpc.highest_perf )
> > +/*
> > + * Right now, Xen doesn't actually use perf values
> > + * in ACPI _CPC table, warning is enough.
> > + */
> > +printk(XENLOG_WARNING
> > +   "Broken CPPC perf values: lowest(%u), 
> > nonlinear_lowest(%u),
> nominal(%u), highest(%u)\n",
> > +   cppc_data->cpc.lowest_perf,
> > +   cppc_data->cpc.lowest_nonlinear_perf,
> > +   cppc_data->cpc.nominal_perf,
> > +   cppc_data->cpc.highest_perf);
>
> If this warning was to ever surface, it would likely surface for every CPU.
> That's unnecessarily verbose, I guess. Please consider using printk_once() 
> here.
>

Understood

> Also, is "right now" (as the comment says) still going to be true by the end 
> of the
> series? Didn't I see you use the values in earlier versions?
>

The reason why I added this comment is that in current implementation, we 
actually
don't use values read from ACPI _CPC table for lowest_perf, 
lowest_nonlinear_perf,
nominal_perf, and highest_perf.
We read CPPC capability MSR to get these four values.
What we actually use is the following optional lowest_mhz and nominal_mhz. We 
read
these two values for perf_to_khz transition.

There are two ways to read CPPC capability info, one is from MSR register, and 
the other is from
_CPC table. Only in very few hardware, MSR is not supported, and we must switch 
to use ACPI _CPC
Such scenario is not covered in this patch serie. I've mentioned it in the 
cover letter.
The difficulty actually is that when we try to use ACPI _CPC to do CPPC 
performance monitoring,
some control registers are probably implemented in the Platform Communications 
Channel (PCC) interface, which
is not supported in Xen.

> > +
> > +if ( cppc_data->flags == (XEN_CPPC_PSD | XEN_CPPC_CPC) )
>
> If either flag may be clear, ...
>
> > +{
> > +pm_info->cppc_data = *cppc_data;
> > +if ( cpufreq_verbose )
> > +{
> > +print_PSD(&pm_info->cppc_data.domain_info);
> > +print_CPPC(&pm_info->cppc_data);
>
> ... why unconditionally loog both?
>
> > +}
> > +
> > +pm_info->init = XEN_CPPC_INIT;
>
> Plus is it correct to set this flag if either of the incoming flags was clear?
>

Hmm, I may not understand what you mean here...
I log and set this flag only when both flags are set (cppc_data->flags == 
(XEN_CPPC_PSD | XEN_CPPC_CPC))
_PSD entry and _CPC entry are both mandatory
Did you suggest that we shall give warning message when either flag is clear?

> Jan

[1] 
https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#lowest-performance


Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data

2025-04-28 Thread Jan Beulich
On 14.04.2025 09:40, Penny Zheng wrote:
> In order to provide backward compatibility with existing governors
> that represent performance as frequency, like ondemand, the _CPC
> table can optionally provide processor frequency range values, Lowest
> frequency and Norminal frequency, to let OS use Lowest Frequency/

Nit: Nominal

> @@ -497,12 +504,19 @@ static void print_PPC(unsigned int platform_limit)
>  printk("\t_PPC: %d\n", platform_limit);
>  }
>  
> -static int check_psd_pminfo(const struct xen_processor_performance *perf)
> +static int check_psd_pminfo(const struct xen_processor_performance *perf,
> +const struct xen_processor_cppc *cppc_data)
>  {
> +uint32_t shared_type;
> +
> +if ( !perf && !cppc_data )
> +return -EINVAL;
> +
> +shared_type = perf ? perf->shared_type : cppc_data->shared_type;

Why don't you have the caller pass in shared_type? The two pointers aren't
used ...

>  /* check domain coordination */
> -if ( perf->shared_type != CPUFREQ_SHARED_TYPE_ALL &&
> - perf->shared_type != CPUFREQ_SHARED_TYPE_ANY &&
> - perf->shared_type != CPUFREQ_SHARED_TYPE_HW )
> +if ( shared_type != CPUFREQ_SHARED_TYPE_ALL &&
> + shared_type != CPUFREQ_SHARED_TYPE_ANY &&
> + shared_type != CPUFREQ_SHARED_TYPE_HW )
>  return -EINVAL;
>  
>  return 0;

... for anything else.

> @@ -627,6 +641,109 @@ out:
>  return ret;
>  }
>  
> +static void print_CPPC(const struct xen_processor_cppc *cppc_data)
> +{
> +printk("\t_CPC: highest_perf=%u, lowest_perf=%u, "
> +   "nominal_perf=%u, lowest_nonlinear_perf=%u, "
> +   "nominal_mhz=%uMHz, lowest_mhz=%uMHz\n",
> +   cppc_data->cpc.highest_perf, cppc_data->cpc.lowest_perf,
> +   cppc_data->cpc.nominal_perf, cppc_data->cpc.lowest_nonlinear_perf,
> +   cppc_data->cpc.nominal_mhz, cppc_data->cpc.lowest_mhz);
> +}
> +
> +int set_cppc_pminfo(unsigned int acpi_id,
> +const struct xen_processor_cppc *cppc_data)
> +{
> +int ret = 0, cpuid;
> +struct processor_pminfo *pm_info;
> +
> +cpuid = get_cpu_id(acpi_id);
> +if ( cpuid < 0 || !cppc_data )
> +{
> +ret = -EINVAL;
> +goto out;
> +}
> +if ( cpufreq_verbose )
> +printk("Set CPU acpi_id(%u) cpuid(%d) CPPC State info:\n",
> +   acpi_id, cpuid);
> +
> +pm_info = processor_pminfo[cpuid];
> +if ( !pm_info )
> +{
> +pm_info = xvzalloc(struct processor_pminfo);
> +if ( !pm_info )
> +{
> +ret = -ENOMEM;
> +goto out;
> +}
> +processor_pminfo[cpuid] = pm_info;
> +}
> +pm_info->acpi_id = acpi_id;
> +pm_info->id = cpuid;
> +
> +if ( cppc_data->flags & XEN_CPPC_PSD )
> +{
> +ret = check_psd_pminfo(NULL, cppc_data);
> +if ( ret )
> +goto out;
> +}
> +
> +if ( cppc_data->flags & XEN_CPPC_CPC )
> +{
> +if ( cppc_data->cpc.highest_perf == 0 ||
> + cppc_data->cpc.highest_perf > UINT8_MAX ||
> + cppc_data->cpc.nominal_perf == 0 ||
> + cppc_data->cpc.nominal_perf > UINT8_MAX ||
> + cppc_data->cpc.lowest_nonlinear_perf == 0 ||
> + cppc_data->cpc.lowest_nonlinear_perf > UINT8_MAX ||
> + cppc_data->cpc.lowest_perf == 0 ||
> + cppc_data->cpc.lowest_perf > UINT8_MAX ||
> + cppc_data->cpc.lowest_perf >
> +cppc_data->cpc.lowest_nonlinear_perf ||

Where's this ordering spelled out in the spec?

> + cppc_data->cpc.lowest_nonlinear_perf >
> +cppc_data->cpc.nominal_perf ||
> + cppc_data->cpc.nominal_perf > cppc_data->cpc.highest_perf )
> +/*
> + * Right now, Xen doesn't actually use perf values
> + * in ACPI _CPC table, warning is enough.
> + */
> +printk(XENLOG_WARNING
> +   "Broken CPPC perf values: lowest(%u), 
> nonlinear_lowest(%u), nominal(%u), highest(%u)\n",
> +   cppc_data->cpc.lowest_perf,
> +   cppc_data->cpc.lowest_nonlinear_perf,
> +   cppc_data->cpc.nominal_perf,
> +   cppc_data->cpc.highest_perf);

If this warning was to ever surface, it would likely surface for every CPU.
That's unnecessarily verbose, I guess. Please consider using printk_once()
here.

Also, is "right now" (as the comment says) still going to be true by the
end of the series? Didn't I see you use the values in earlier versions?

> +/* lowest_mhz and nominal_mhz are optional value */
> +if ( (cppc_data->cpc.lowest_mhz && cppc_data->cpc.nominal_mhz) &&
> + cppc_data->cpc.lowest_mhz > cppc_data->cpc.nominal_mhz )

The 1st of the three checks is redundant with the 3rd one. There's also no
point parenthesizing one && against another.

> +printk(XENLOG_WARNING
> +