Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data
On 16.05.2025 12:35, Penny, Zheng wrote: >> -Original Message- >> From: Jan Beulich >> Sent: Monday, April 28, 2025 11:57 PM >> >> On 14.04.2025 09:40, Penny Zheng wrote: >>> @@ -459,6 +464,26 @@ struct xen_processor_performance { typedef >>> struct xen_processor_performance xen_processor_performance_t; >>> DEFINE_XEN_GUEST_HANDLE(xen_processor_performance_t); >>> >>> +struct xen_processor_cppc { >>> +uint8_t flags; /* flag for CPPC sub info type */ >>> +/* >>> + * Subset _CPC fields useful for CPPC-compatible cpufreq >>> + * driver's initialization >>> + */ >>> +struct { >>> +uint32_t highest_perf; >>> +uint32_t nominal_perf; >>> +uint32_t lowest_nonlinear_perf; >>> +uint32_t lowest_perf; >>> +uint32_t lowest_mhz; >>> +uint32_t nominal_mhz; >>> +} cpc; >>> +struct xen_psd_package domain_info; /* _PSD */ >> >> This being a member of the new type, ... >> >>> --- a/xen/include/xlat.lst >>> +++ b/xen/include/xlat.lst >>> @@ -168,6 +168,7 @@ >>> ! processor_performance platform.h >>> ! processor_power platform.h >>> ? processor_pxplatform.h >>> +? processor_cppc platform.h >> >> ... how can it be ? here when it's ... >> >>> ! psd_package platform.h >> >> ... ! here? And with it being ?, you're lacking a place where you invoke the >> resulting >> checking macro (which I assume would cause a build failure). I guess this wasn't clear enough then: Aiui you cannot "check" these, because the native and compat ones are going to be different. You'll need to use ! here, and then use the respective XLAT_* macro(s). IOW ... > Understood, I see it automatically generates CHECK_psd_package. I shall > change psd_package with ? too > In order to avoid causing build failure, I add "typedef struct > xen_psd_package xen_psd_package_t;" > I'm not familiar with the compat framework, if it isn't the right way to fix, > plz let me know ... I expect this isn't the correct way of dealing with it. Jan
RE: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data
[Public] Hi > -Original Message- > From: Jan Beulich > Sent: Monday, April 28, 2025 11:57 PM > To: Penny, Zheng > Cc: Huang, Ray ; Andrew Cooper > ; Roger Pau Monné ; > Anthony PERARD ; Orzel, Michal > ; Julien Grall ; Stefano Stabellini > ; xen-devel@lists.xenproject.org > Subject: Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to > propagate > CPPC data > > On 14.04.2025 09:40, Penny Zheng wrote: > > @@ -459,6 +464,26 @@ struct xen_processor_performance { typedef > > struct xen_processor_performance xen_processor_performance_t; > > DEFINE_XEN_GUEST_HANDLE(xen_processor_performance_t); > > > > +struct xen_processor_cppc { > > +uint8_t flags; /* flag for CPPC sub info type */ > > +/* > > + * Subset _CPC fields useful for CPPC-compatible cpufreq > > + * driver's initialization > > + */ > > +struct { > > +uint32_t highest_perf; > > +uint32_t nominal_perf; > > +uint32_t lowest_nonlinear_perf; > > +uint32_t lowest_perf; > > +uint32_t lowest_mhz; > > +uint32_t nominal_mhz; > > +} cpc; > > +struct xen_psd_package domain_info; /* _PSD */ > > This being a member of the new type, ... > > > --- a/xen/include/xlat.lst > > +++ b/xen/include/xlat.lst > > @@ -168,6 +168,7 @@ > > ! processor_performance platform.h > > ! processor_power platform.h > > ? processor_pxplatform.h > > +? processor_cppc platform.h > > ... how can it be ? here when it's ... > > > ! psd_package platform.h > > ... ! here? And with it being ?, you're lacking a place where you invoke the > resulting > checking macro (which I assume would cause a build failure). > Understood, I see it automatically generates CHECK_psd_package. I shall change psd_package with ? too In order to avoid causing build failure, I add "typedef struct xen_psd_package xen_psd_package_t;" I'm not familiar with the compat framework, if it isn't the right way to fix, plz let me know > Also when laying out struct xen_processor_cppc, please avoid unnecessary gaps > or tail padding - it looks like "shared_type" would better move up. I think > it would > also be a good idea to make padding fields explicit, and check them to be > zero. This > way they can be assigned meaning later (if need > be) without breaking backwards compatibility. > > Jan
Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data
On 06.05.2025 11:11, Penny, Zheng wrote: >> -Original Message- >> From: Jan Beulich >> Sent: Monday, April 28, 2025 11:57 PM >> >> On 14.04.2025 09:40, Penny Zheng wrote: >>> +if ( cppc_data->flags & XEN_CPPC_CPC ) >>> +{ >>> +if ( cppc_data->cpc.highest_perf == 0 || >>> + cppc_data->cpc.highest_perf > UINT8_MAX || >>> + cppc_data->cpc.nominal_perf == 0 || >>> + cppc_data->cpc.nominal_perf > UINT8_MAX || >>> + cppc_data->cpc.lowest_nonlinear_perf == 0 || >>> + cppc_data->cpc.lowest_nonlinear_perf > UINT8_MAX || >>> + cppc_data->cpc.lowest_perf == 0 || >>> + cppc_data->cpc.lowest_perf > UINT8_MAX || >>> + cppc_data->cpc.lowest_perf > >>> +cppc_data->cpc.lowest_nonlinear_perf || >> >> Where's this ordering spelled out in the spec? >> > > Clip a snippet from description on lowest performance[1], we may deduce that > ``` > Selecting a performance level lower than the lowest nonlinear performance > level may actually cause an efficiency penalty, > but should reduce the instantaneous power consumption of the processor > ``` > lowest is smaller than lowest nonlinear I can't imply that from the quoted sentence. It describes what happens in that situation, but it doesn't exclude the opposite relationship (in which case the described situation simply can't occur). >>> + cppc_data->cpc.lowest_nonlinear_perf > >>> +cppc_data->cpc.nominal_perf || >>> + cppc_data->cpc.nominal_perf > cppc_data->cpc.highest_perf ) >>> +/* >>> + * Right now, Xen doesn't actually use perf values >>> + * in ACPI _CPC table, warning is enough. >>> + */ >>> +printk(XENLOG_WARNING >>> + "Broken CPPC perf values: lowest(%u), >>> nonlinear_lowest(%u), >> nominal(%u), highest(%u)\n", >>> + cppc_data->cpc.lowest_perf, >>> + cppc_data->cpc.lowest_nonlinear_perf, >>> + cppc_data->cpc.nominal_perf, >>> + cppc_data->cpc.highest_perf); >> >> If this warning was to ever surface, it would likely surface for every CPU. >> That's unnecessarily verbose, I guess. Please consider using printk_once() >> here. >> > > Understood > >> Also, is "right now" (as the comment says) still going to be true by the end >> of the >> series? Didn't I see you use the values in earlier versions? >> > > The reason why I added this comment is that in current implementation, we > actually > don't use values read from ACPI _CPC table for lowest_perf, > lowest_nonlinear_perf, > nominal_perf, and highest_perf. > We read CPPC capability MSR to get these four values. Oh, okay. Could you slightly extend that comment to include this detail? >>> +if ( cppc_data->flags == (XEN_CPPC_PSD | XEN_CPPC_CPC) ) >> >> If either flag may be clear, ... >> >>> +{ >>> +pm_info->cppc_data = *cppc_data; >>> +if ( cpufreq_verbose ) >>> +{ >>> +print_PSD(&pm_info->cppc_data.domain_info); >>> +print_CPPC(&pm_info->cppc_data); >> >> ... why unconditionally loog both? >> >>> +} >>> + >>> +pm_info->init = XEN_CPPC_INIT; >> >> Plus is it correct to set this flag if either of the incoming flags was >> clear? > > Hmm, I may not understand what you mean here... > I log and set this flag only when both flags are set (cppc_data->flags == > (XEN_CPPC_PSD | XEN_CPPC_CPC)) > _PSD entry and _CPC entry are both mandatory > Did you suggest that we shall give warning message when either flag is clear? Oh, sorry - I read & where you have == actually. Hence why I thought only one of the flags may be set. Please disregard those comments of mine. Jan
RE: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data
[Public] Hi, > -Original Message- > From: Jan Beulich > Sent: Monday, April 28, 2025 11:57 PM > To: Penny, Zheng > Cc: Huang, Ray ; Andrew Cooper > ; Roger Pau Monné ; > Anthony PERARD ; Orzel, Michal > ; Julien Grall ; Stefano Stabellini > ; xen-devel@lists.xenproject.org > Subject: Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to > propagate > CPPC data > > On 14.04.2025 09:40, Penny Zheng wrote: > > @@ -459,6 +464,26 @@ struct xen_processor_performance { typedef > > struct xen_processor_performance xen_processor_performance_t; > > DEFINE_XEN_GUEST_HANDLE(xen_processor_performance_t); > > > > +struct xen_processor_cppc { > > +uint8_t flags; /* flag for CPPC sub info type */ > > +/* > > + * Subset _CPC fields useful for CPPC-compatible cpufreq > > + * driver's initialization > > + */ > > +struct { > > +uint32_t highest_perf; > > +uint32_t nominal_perf; > > +uint32_t lowest_nonlinear_perf; > > +uint32_t lowest_perf; > > +uint32_t lowest_mhz; > > +uint32_t nominal_mhz; > > +} cpc; > > +struct xen_psd_package domain_info; /* _PSD */ > > This being a member of the new type, ... > > > --- a/xen/include/xlat.lst > > +++ b/xen/include/xlat.lst > > @@ -168,6 +168,7 @@ > > ! processor_performance platform.h > > ! processor_power platform.h > > ? processor_pxplatform.h > > +? processor_cppc platform.h > > ... how can it be ? here when it's ... > > > ! psd_package platform.h > > ... ! here? And with it being ?, you're lacking a place where you invoke the > resulting > checking macro (which I assume would cause a build failure). > > Also when laying out struct xen_processor_cppc, please avoid unnecessary gaps > or tail padding - it looks like "shared_type" would better move up. I think > it would > also be a good idea to make padding fields explicit, and check them to be > zero. > This way they can be assigned meaning later (if need > be) without breaking backwards compatibility. > Understood, I'll re-construct into increasing order and add explicit padding: ``` struct xen_processor_cppc { uint8_t flags; /* flag for CPPC sub info type */ uint8_t pad[3]; /* padding and must be zero */ /* * Subset _CPC fields useful for CPPC-compatible cpufreq * driver's initialization */ struct { uint32_t highest_perf; uint32_t nominal_perf; uint32_t lowest_nonlinear_perf; uint32_t lowest_perf; uint32_t lowest_mhz; uint32_t nominal_mhz; } cpc; /* Coordination type of this processor */ uint32_t shared_type; struct xen_psd_package domain_info; /* _PSD */ }; ``` > Jan
RE: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data
[Public] HI > -Original Message- > From: Jan Beulich > Sent: Monday, April 28, 2025 11:57 PM > To: Penny, Zheng > Cc: Huang, Ray ; Andrew Cooper > ; Roger Pau Monné ; > Anthony PERARD ; Orzel, Michal > ; Julien Grall ; Stefano Stabellini > ; xen-devel@lists.xenproject.org > Subject: Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to > propagate > CPPC data > > On 14.04.2025 09:40, Penny Zheng wrote: > > +if ( cppc_data->flags & XEN_CPPC_CPC ) > > +{ > > +if ( cppc_data->cpc.highest_perf == 0 || > > + cppc_data->cpc.highest_perf > UINT8_MAX || > > + cppc_data->cpc.nominal_perf == 0 || > > + cppc_data->cpc.nominal_perf > UINT8_MAX || > > + cppc_data->cpc.lowest_nonlinear_perf == 0 || > > + cppc_data->cpc.lowest_nonlinear_perf > UINT8_MAX || > > + cppc_data->cpc.lowest_perf == 0 || > > + cppc_data->cpc.lowest_perf > UINT8_MAX || > > + cppc_data->cpc.lowest_perf > > > +cppc_data->cpc.lowest_nonlinear_perf || > > Where's this ordering spelled out in the spec? > Clip a snippet from description on lowest performance[1], we may deduce that ``` Selecting a performance level lower than the lowest nonlinear performance level may actually cause an efficiency penalty, but should reduce the instantaneous power consumption of the processor ``` lowest is smaller than lowest nonlinear > > + cppc_data->cpc.lowest_nonlinear_perf > > > +cppc_data->cpc.nominal_perf || > > + cppc_data->cpc.nominal_perf > cppc_data->cpc.highest_perf ) > > +/* > > + * Right now, Xen doesn't actually use perf values > > + * in ACPI _CPC table, warning is enough. > > + */ > > +printk(XENLOG_WARNING > > + "Broken CPPC perf values: lowest(%u), > > nonlinear_lowest(%u), > nominal(%u), highest(%u)\n", > > + cppc_data->cpc.lowest_perf, > > + cppc_data->cpc.lowest_nonlinear_perf, > > + cppc_data->cpc.nominal_perf, > > + cppc_data->cpc.highest_perf); > > If this warning was to ever surface, it would likely surface for every CPU. > That's unnecessarily verbose, I guess. Please consider using printk_once() > here. > Understood > Also, is "right now" (as the comment says) still going to be true by the end > of the > series? Didn't I see you use the values in earlier versions? > The reason why I added this comment is that in current implementation, we actually don't use values read from ACPI _CPC table for lowest_perf, lowest_nonlinear_perf, nominal_perf, and highest_perf. We read CPPC capability MSR to get these four values. What we actually use is the following optional lowest_mhz and nominal_mhz. We read these two values for perf_to_khz transition. There are two ways to read CPPC capability info, one is from MSR register, and the other is from _CPC table. Only in very few hardware, MSR is not supported, and we must switch to use ACPI _CPC Such scenario is not covered in this patch serie. I've mentioned it in the cover letter. The difficulty actually is that when we try to use ACPI _CPC to do CPPC performance monitoring, some control registers are probably implemented in the Platform Communications Channel (PCC) interface, which is not supported in Xen. > > + > > +if ( cppc_data->flags == (XEN_CPPC_PSD | XEN_CPPC_CPC) ) > > If either flag may be clear, ... > > > +{ > > +pm_info->cppc_data = *cppc_data; > > +if ( cpufreq_verbose ) > > +{ > > +print_PSD(&pm_info->cppc_data.domain_info); > > +print_CPPC(&pm_info->cppc_data); > > ... why unconditionally loog both? > > > +} > > + > > +pm_info->init = XEN_CPPC_INIT; > > Plus is it correct to set this flag if either of the incoming flags was clear? > Hmm, I may not understand what you mean here... I log and set this flag only when both flags are set (cppc_data->flags == (XEN_CPPC_PSD | XEN_CPPC_CPC)) _PSD entry and _CPC entry are both mandatory Did you suggest that we shall give warning message when either flag is clear? > Jan [1] https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html?highlight=cppc#lowest-performance
Re: [PATCH v4 03/15] xen/x86: introduce new sub-hypercall to propagate CPPC data
On 14.04.2025 09:40, Penny Zheng wrote: > In order to provide backward compatibility with existing governors > that represent performance as frequency, like ondemand, the _CPC > table can optionally provide processor frequency range values, Lowest > frequency and Norminal frequency, to let OS use Lowest Frequency/ Nit: Nominal > @@ -497,12 +504,19 @@ static void print_PPC(unsigned int platform_limit) > printk("\t_PPC: %d\n", platform_limit); > } > > -static int check_psd_pminfo(const struct xen_processor_performance *perf) > +static int check_psd_pminfo(const struct xen_processor_performance *perf, > +const struct xen_processor_cppc *cppc_data) > { > +uint32_t shared_type; > + > +if ( !perf && !cppc_data ) > +return -EINVAL; > + > +shared_type = perf ? perf->shared_type : cppc_data->shared_type; Why don't you have the caller pass in shared_type? The two pointers aren't used ... > /* check domain coordination */ > -if ( perf->shared_type != CPUFREQ_SHARED_TYPE_ALL && > - perf->shared_type != CPUFREQ_SHARED_TYPE_ANY && > - perf->shared_type != CPUFREQ_SHARED_TYPE_HW ) > +if ( shared_type != CPUFREQ_SHARED_TYPE_ALL && > + shared_type != CPUFREQ_SHARED_TYPE_ANY && > + shared_type != CPUFREQ_SHARED_TYPE_HW ) > return -EINVAL; > > return 0; ... for anything else. > @@ -627,6 +641,109 @@ out: > return ret; > } > > +static void print_CPPC(const struct xen_processor_cppc *cppc_data) > +{ > +printk("\t_CPC: highest_perf=%u, lowest_perf=%u, " > + "nominal_perf=%u, lowest_nonlinear_perf=%u, " > + "nominal_mhz=%uMHz, lowest_mhz=%uMHz\n", > + cppc_data->cpc.highest_perf, cppc_data->cpc.lowest_perf, > + cppc_data->cpc.nominal_perf, cppc_data->cpc.lowest_nonlinear_perf, > + cppc_data->cpc.nominal_mhz, cppc_data->cpc.lowest_mhz); > +} > + > +int set_cppc_pminfo(unsigned int acpi_id, > +const struct xen_processor_cppc *cppc_data) > +{ > +int ret = 0, cpuid; > +struct processor_pminfo *pm_info; > + > +cpuid = get_cpu_id(acpi_id); > +if ( cpuid < 0 || !cppc_data ) > +{ > +ret = -EINVAL; > +goto out; > +} > +if ( cpufreq_verbose ) > +printk("Set CPU acpi_id(%u) cpuid(%d) CPPC State info:\n", > + acpi_id, cpuid); > + > +pm_info = processor_pminfo[cpuid]; > +if ( !pm_info ) > +{ > +pm_info = xvzalloc(struct processor_pminfo); > +if ( !pm_info ) > +{ > +ret = -ENOMEM; > +goto out; > +} > +processor_pminfo[cpuid] = pm_info; > +} > +pm_info->acpi_id = acpi_id; > +pm_info->id = cpuid; > + > +if ( cppc_data->flags & XEN_CPPC_PSD ) > +{ > +ret = check_psd_pminfo(NULL, cppc_data); > +if ( ret ) > +goto out; > +} > + > +if ( cppc_data->flags & XEN_CPPC_CPC ) > +{ > +if ( cppc_data->cpc.highest_perf == 0 || > + cppc_data->cpc.highest_perf > UINT8_MAX || > + cppc_data->cpc.nominal_perf == 0 || > + cppc_data->cpc.nominal_perf > UINT8_MAX || > + cppc_data->cpc.lowest_nonlinear_perf == 0 || > + cppc_data->cpc.lowest_nonlinear_perf > UINT8_MAX || > + cppc_data->cpc.lowest_perf == 0 || > + cppc_data->cpc.lowest_perf > UINT8_MAX || > + cppc_data->cpc.lowest_perf > > +cppc_data->cpc.lowest_nonlinear_perf || Where's this ordering spelled out in the spec? > + cppc_data->cpc.lowest_nonlinear_perf > > +cppc_data->cpc.nominal_perf || > + cppc_data->cpc.nominal_perf > cppc_data->cpc.highest_perf ) > +/* > + * Right now, Xen doesn't actually use perf values > + * in ACPI _CPC table, warning is enough. > + */ > +printk(XENLOG_WARNING > + "Broken CPPC perf values: lowest(%u), > nonlinear_lowest(%u), nominal(%u), highest(%u)\n", > + cppc_data->cpc.lowest_perf, > + cppc_data->cpc.lowest_nonlinear_perf, > + cppc_data->cpc.nominal_perf, > + cppc_data->cpc.highest_perf); If this warning was to ever surface, it would likely surface for every CPU. That's unnecessarily verbose, I guess. Please consider using printk_once() here. Also, is "right now" (as the comment says) still going to be true by the end of the series? Didn't I see you use the values in earlier versions? > +/* lowest_mhz and nominal_mhz are optional value */ > +if ( (cppc_data->cpc.lowest_mhz && cppc_data->cpc.nominal_mhz) && > + cppc_data->cpc.lowest_mhz > cppc_data->cpc.nominal_mhz ) The 1st of the three checks is redundant with the 3rd one. There's also no point parenthesizing one && against another. > +printk(XENLOG_WARNING > +