On Thu, Feb 26, 2026 at 11:47:54PM +0900, Akihiko Odaki wrote:
> On 2026/02/26 23:43, Akihiko Odaki wrote:
> > On 2026/02/26 20:54, Oliver Upton wrote:
> > > Hi Akihiko,
> > > 
> > > On Wed, Feb 25, 2026 at 01:31:15PM +0900, Akihiko Odaki wrote:
> > > > @@ -629,6 +629,8 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > >           kvm_vcpu_load_vhe(vcpu);
> > > >       kvm_arch_vcpu_load_fp(vcpu);
> > > >       kvm_vcpu_pmu_restore_guest(vcpu);
> > > > +    if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
> > > > +        kvm_make_request(KVM_REQ_CREATE_PMU, vcpu);
> > > 
> > > We only need to set the request if the vCPU has migrated to a different
> > > PMU implementation, no?
> > 
> > Indeed. I was too lazy to implement such a check since it won't affect
> > performance unless the new feature is requested, but having one may
> > still be nice.

I'd definitely like to see this.

> > > 
> > > >       if (kvm_arm_is_pvtime_enabled(&vcpu->arch))
> > > >           kvm_make_request(KVM_REQ_RECORD_STEAL, vcpu);
> > > > @@ -1056,6 +1058,9 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
> > > >           if (kvm_check_request(KVM_REQ_RELOAD_PMU, vcpu))
> > > >               kvm_vcpu_reload_pmu(vcpu);
> > > > +        if (kvm_check_request(KVM_REQ_CREATE_PMU, vcpu))
> > > > +            kvm_vcpu_create_pmu(vcpu);
> > > > +
> > > 
> > > My strong preference would be to squash the migration handling into
> > > kvm_vcpu_reload_pmu(). It is already reprogramming PMU events in
> > > response to other things.
> > 
> > Can you share a reason for that?
> > 
> > In terms of complexity, I don't think it will reduce complexity,
> > since the only thing kvm_vcpu_reload_pmu() and kvm_vcpu_create_pmu()
> > have in common is the enumeration of enabled counters, which is
> > simple enough.

I prefer it in terms of code organization. We should have a single
helper that refreshes the backing perf events when something has
globally changed for the vPMU.

Besides this, "create" is confusing since the vPMU has already been
instantiated.

> > In terms of performance, I guess it is better to keep
> > kvm_vcpu_create_pmu() small since it is triggered for each migration.

I think the surrounding KVM code for iterating over the counters is
inconsequential compared to the overheads of calling into perf to
recreate the PMU events. Since we expect this to be slow, we should only
set the request when absolutely necessary.

> > > > +static bool kvm_pmu_counter_is_enabled(struct kvm_pmc *pmc)
> > > > +{
> > > > +    struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
> > > > +
> > > > +    return kvm_pmu_enabled_counter_mask(vcpu) & BIT(pmc->idx);
> > > >   }
> > > 
> > > You're churning a good bit of code, this needs to happen in a separate
> > > patch (if at all).
> > 
> > It makes sense. The next version will have a separate patch for this.

If I have the full picture right, you may not need it with a common
request handler.

> > > 
> > > > @@ -689,6 +710,14 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
> > > >       int eventsel;
> > > >       u64 evtreg;
> > > > +    if (!arm_pmu) {
> > > > +        arm_pmu = kvm_pmu_probe_armpmu(vcpu->cpu);
> > > 
> > > kvm_pmu_probe_armpmu() takes a global mutex, I'm not sure that's what we
> > > want.
> > > 
> > > What prevents us from opening a PERF_TYPE_RAW event and allowing perf to
> > > work out the right PMU for this CPU?
> > 
> > Unfortunately perf does not seem to have the capability to switch to
> > the right PMU. tools/perf/Documentation/intel-hybrid.txt says the perf
> > tool creates events for each PMU in a hybrid configuration, for example.
> 
> I think I misunderstood what you meant. Letting
> perf_event_create_kernel_counter() figure out which PMU to use may be a
> good idea. I'll give it a try with the next version.

Yep, this is what I was alluding to.

> > 
> > > 
> > > > +        if (!arm_pmu) {
> > > > +            vcpu_set_on_unsupported_cpu(vcpu);
> > > 
> > > At this point it seems pretty late to flag the CPU as unsupported. Maybe
> > > instead we can compute the union cpumask for all the PMU implementations
> > > the VM may schedule on.
> > 
> > This is just a safeguard, and it is the responsibility of userspace to
> > schedule the vCPU properly. It is conceptually the same as what
> > kvm_arch_vcpu_load() does when migrating to an unsupported CPU.

I agree with you that we need to have some handling for this situation.

What I don't like about this is userspace doesn't discover its mistake
until the guest actually programs a PMC. I'd much rather preserve the
existing ABI where KVM proactively rejects running a vCPU on an
unsupported CPU.

> > > > +    case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
> > > > +        lockdep_assert_held(&vcpu->kvm->arch.config_lock);
> > > > +        if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY, &vcpu->kvm->arch.flags))
> > > > +            return 0;
> > > 
> > > We don't need a getter for this, userspace should remember how it
> > > provisioned the VM.
> > 
> > The getter is useful for debugging and testing. The selftest will use it
> > to query the current state.

That's fine for debugging this on your own kernel, but we don't need it
upstream. There are several other vPMU attributes that are write-only,
like KVM_ARM_VCPU_PMU_V3_SET_PMU.

Thanks,
Oliver
