On Thu, Aug 28, 2025, Zihuan Zhang wrote:
> > Hmm, this is technically buggy.  __free() won't invoke put_cpufreq_policy() 
> > until
> > policy goes out of scope, and so using __free() means the code is 
> > effectively:
> > 
> >             if (IS_ENABLED(CONFIG_CPU_FREQ)) {
> >                     struct cpufreq_policy *policy;
> >                     int cpu;
> > 
> >                     cpu = get_cpu();
> >                     policy = cpufreq_cpu_get(cpu);
> >                     if (policy && policy->cpuinfo.max_freq)
> >                             max_tsc_khz = policy->cpuinfo.max_freq;
> >                     put_cpu();
> > 
> >                     if (policy)
> >                             cpufreq_cpu_put(policy);
> >             }

...

> Yes, this will indeed change the execution order.
> Can you accept that? 

No, because it's buggy.

> Personally, I don’t think it’s ideal either.
> 
>               if (IS_ENABLED(CONFIG_CPU_FREQ)) {
>                       int cpu;
>                       cpu = get_cpu();
>                       {
>                               struct cpufreq_policy *policy 
> __free(put_cpufreq_policy) = cpufreq_cpu_get(cpu);
>                               if (policy && policy->cpuinfo.max_freq)
>                                       max_tsc_khz = policy->cpuinfo.max_freq;
>                       }
>                       put_cpu();
> 
>                                             }
> 
> Other places may also have the same issue,
> 
> maybe we should consider introducing a macro to handle this properly,
> so that initialization and cleanup are well defined without changing
> the existing order unexpected.
> 
> like this:
> 
> #define WITH_CPUFREQ_POLICY(cpu) {\
> 
> for(struct cpufreq_policy *policy __free(put_cpufreq_policy) =  \
>                       cpufreq_cpu_get(cpu);                   \
>                       policy;)
> 
> Then Use it:
> 
>               if (IS_ENABLED(CONFIG_CPU_FREQ)) {
>                       int cpu;
>                       cpu = get_cpu();
>                       WITH_CPUFREQ_POLICY(cpu){
>                               if (policy->cpuinfo.max_freq)
>                                       max_tsc_khz = policy->cpuinfo.max_freq;
>                       }
>                       put_cpu();

This all feels very forced, in the sense that we have a shiny new tool and are
trying to use it everywhere without thinking critically about whether or not
doing so is actually an improvement.

At a glance, this is literally the only instance in the entire kernel where the
CPU to use is grabbed immediately before the policy.
 
  $ git grep -B 20 cpufreq_cpu_get | grep -e get_cpu -e smp_processor_id
  arch/x86/kvm/x86.c-                   cpu = get_cpu();
  drivers/cpufreq/cppc_cpufreq.c-static int cppc_get_cpu_power(struct device 
*cpu_dev,
  drivers/cpufreq/cppc_cpufreq.c-static int cppc_get_cpu_cost(struct device 
*cpu_dev, unsigned long KHz,
  drivers/cpufreq/mediatek-cpufreq-hw.c-mtk_cpufreq_get_cpu_power(struct device 
*cpu_dev, unsigned long *uW,

Probably because KVM's usage is rather bizarre and honestly kind of dumb.  But
KVM has had this behavior for 15+ years, so as weird as it is, I'm not inclined
to change it without a really, really strong reason to do so, e.g. to iterate
over all CPUs or something.

So given that this is the only intance of the problem patter, I think it makes
sense to leave KVM as-is, and not spend a bunch of time trying to figure out how
to make KVM's usage play nice with __free().

Reply via email to