On Thu, 03 Mar 2022 05:49:00 +0000,
Eugene Huang <euge...@nvidia.com> wrote:
> 
> <EH> We have the following further 1-to-1 mappings:
> pcpu-20 - vcpu-0 is running your timer test, everything is fine
> pcpu-21 - vcpu-1 starts some other workload, and this affects the timer test
> on the other vcpu
> 
> - Each vCPU thread is pinned to its individual pCPU on the host (vcpupin in 
> libvirt).
> - Each pCPU on which a vCPU thread runs is isolated on the host (isolcpus).
> - Each vCPU that runs the workload is isolated in the guest VM (isolcpus).
> 
> So we are pretty sure the workloads are separated.

Hmmm. Isolcpus certainly is something I never use. You may want to
check whether it has an influence on your test's behaviour. You
may also want to post your full libvirt config, just in case someone
spots an issue there (I won't, as I know next to nothing about
libvirt).
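
If you want to double-check that isolcpus actually took effect on the
host, something like the following rough sketch would do. It only
assumes a Linux host with the usual sysfs layout; nothing in it is
specific to your setup:

    # Rough sketch: print the CPUs the host kernel actually isolated and
    # the command line it booted with, so the isolcpus= setting can be
    # compared against the pinning requested in the libvirt XML.
    def read(path):
        with open(path) as f:
            return f.read().strip()

    print("isolated CPUs :", read("/sys/devices/system/cpu/isolated") or "(none)")
    print("kernel cmdline:", read("/proc/cmdline"))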

> 
> > 
> > Also, please work out whether you exit because of a blocking WFI or WFE, as
> > they are indicative of different guest behaviour.
> 
> <EH> Will do. Somehow our current trace does not show this information.
> 
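
In case it helps: on the host side, recent kernels expose a
kvm_wfx_arm64 tracepoint whose is_wfe field tells the two apart. A
rough sketch of enabling it and streaming the exits, assuming tracefs
is mounted at the usual place (adjust paths and event names for your
kernel):

    # Rough sketch: enable the arm64 WFx-exit tracepoint and stream it.
    # Assumes tracefs at /sys/kernel/tracing and a kernel that provides
    # the kvm:kvm_wfx_arm64 event.
    TRACEFS = "/sys/kernel/tracing"

    with open(f"{TRACEFS}/events/kvm/kvm_wfx_arm64/enable", "w") as f:
        f.write("1")

    with open(f"{TRACEFS}/trace_pipe") as pipe:
        for line in pipe:   # each exit is reported as either wfi or wfe
            print(line, end="")
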
> > 
> > > Since we pin that workload to its own vCPU, in theory, it should not
> > > affect the timing of another vCPU.
> > 
> > Why not? A vcpu is just a host thread, and if they share a physical CPU at
> > some point, there is a knock-on effect.
> 
> <EH> Again, because of vcpupin in libvirt, there is no sharing of a
> pCPU among vCPUs. At least that is our configuration intention.

Which may or may not be what happens in reality. libvirt is largely
opaque, and just because you ask it to do something doesn't mean it
happens the way you hope it does.
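
One way to see what actually happened, independent of libvirt, is to
look at the affinity of the QEMU vCPU threads themselves. A rough
sketch, assuming QEMU's usual "CPU n/KVM" vCPU thread names (adjust
if your threads are named differently):

    import re
    import sys
    from pathlib import Path

    # Rough sketch: given a QEMU PID, list each vCPU thread and the CPUs
    # it is actually allowed to run on, straight from /proc.
    pid = sys.argv[1]
    for task in sorted(Path(f"/proc/{pid}/task").iterdir()):
        status = (task / "status").read_text()
        name = re.search(r"^Name:\s+(.+)$", status, re.M).group(1)
        cpus = re.search(r"^Cpus_allowed_list:\s+(.+)$", status, re.M).group(1)
        if "KVM" in name:
            print(f"{name}: allowed on CPUs {cpus}")

Run it with the QEMU PID as the argument and compare the output with
your vcpupin entries. If the allowed set is wider than a single pCPU,
the pinning didn't stick.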

> 
> > 
> > > > You also don't mention what host kernel version you are running.
> > > > In general, please try and reproduce the issue using the latest
> > > > kernel version
> > > > (5.16 at the moment). Please also indicate what HW you are using.
> > >
> > > <EH> Tried 5.15 and 5.4 kernels. Both have the issue. Do you think
> > > 5.16 can make a difference? The HW is an Ampere Altra system.
> > 
> > Unlikely. The Altra is a mostly sane system, as long as you make sure that
> > VMs don't migrate across sockets (at which point it becomes laughably bad).
> > Nothing to do with KVM though.
> 
> <EH> Right, there is no migration of VMs.
> I see the KVM arm timer-related code is very different between 5.4 and
> 5.15/5.16. Can we still use 5.4 for both the host and the guest?

That's your call. I stopped looking at 5.4 a couple of minutes
after it was released. If I'm going to look for something, that will
be on top of upstream.

        M.

-- 
Without deviation from the norm, progress is not possible.
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
