On Wed, Jun 04, 2014 at 01:07:21PM -0400, Gabriel L. Somlo wrote:
> On Wed, Jun 04, 2014 at 05:09:49PM +0200, Alexander Graf wrote:
> > >>>
> > >>>I grep-ed through the kvm sources for KVM_CAP for some inspiration,
> > >>>and it looks more like KVM_CAP_* is a way to tell userspace what the
> > >>>kernel supports, but nothing I saw showed me an example of a "tunable"
> > >>>feature that userspace may ask to be turned on or off (e.g per-vcpu).
> > >>>
> > >>>Is there something like that I could use as an example ?
> > >>Sure, we use it all over the place on PPC :).
> > >Alright, I'll grep harder, then :)
> 
> Aah, I think I found it: KVM_ENABLE_CAP, currently only on ppc and s390.
> I'd have to port this over to x86 before I could use it to enable mwait
> on demand.
> 
> Would that be useful/desirable for any other use cases ?
> 
> > >>I think it's perfectly fine to leave mwait always implemented as NOP -
> > >>it's valid behavior.
> > >NOP is valid MWAIT behavior, *unless* MWAIT should generate an invalid
> > >opcode (i.e., if CPUID says mwait not supported). In that respect,
> > >we're cheating only to hook up guests which misbehave. I'd feel less
> > >"dirty" if I could explicitly tell KVM "ok, just this once is OK, but
> > >don't make a habit of it" :)
> > 
> > We don't limit instructions the guest can execute properly anyway. If CPUID
> > doesn't expose AVX, but the host CPU supports AVX, the guest can still call
> > AVX instructions.
> > 
> > So I think we're safe to always handle MWAIT :).
> > 
> > >
> > >>As for the CPUID exposure, that should be a pure QEMU thing. If overriding
> > >>CPUID bits the kernel mask tells us doesn't work today, we should just
> > >>make it possible :).
> > >>
> > >>Eventually I really think that -cpu foo,+mwait,+monitor or whatever the
> > >>bits are should override any safety net that KVM gives us on features it
> > >>thinks are safe to use.
> > >I need to look at the qemu source, doing what you said
> > >(+monitor,+mwait,+whatever) right now "works", doesn't generate an error,
> > >but silently ignores you if it's not implemented. So I'd actually have to
> > >generate a patch to make something happen when they're present on the
> > >command line.
> > >
> > >The part I'm unsure about is "how bad is it to cheat the way we do right
> > >now", vs. "how much is it worth to be pedantic and require explicitly
> > >enabling things, in both qemu and kvm"... I feel like I don't know
> > >enough to 1. have a strong opinion either way, and 2. have my opinion
> > >be *right* :) Which is why I won't let it go already (and thanks for
> > >all your patience, BTW) :)
> > 
> > I think it's sane behavior to not expose the MWAIT capability in the default
> > CPUID mask (which comes from KVM) unless we can actually emulate it properly
> > ;).
> > 
> > However, I think it's very important to be able to force CPUID bits to on
> > from QEMU even when KVM says it doesn't support them. I actually thought we
> > could do that already, but that code got refactored a number of times over
> > the years, so maybe that ability got lost.
> > 
> > Basically KVM gives QEMU 2 ioctls:
> > 
> >   * get list of KVM supported CPUIDs
> >   * set guest exposed CPUIDs
> 
> Ah, so kvm_vcpu_ioctl_set_cpuid() and friends, morally similar to
> kvm_vcpu_ioctl_enable_cap() on ppc, except it turns on cpuid flags
> instead of entire kvm capabilities.
> 
> So we either have
> 
>       1 always-on but masked-by-default monitor/mwait as
>         nop, and enable just the cpuid flag on demand via the
>         existing ioctl_set_cpuid() mechanism (and I have to
>         check out the qemu parser for cpuid command-line flags),
> 
> or
> 
>       2 off-by-default monitor/mwait/cpuid-flag, enabled via
>         ioctl_enable_cap(), which would have to first be ported
>         to x86, and would require somewhat more extensive qemu
>         hackery to take advantage of.
> 
> I think I sense a "path of least resistance" here, even though IMHO
> #2 is still "The Right Thing To Do (TM)"  :) :)
> 
> Thanks,
> --Gabriel

I think it's wrong.
We really can't emulate mwait at the moment.
All we manage to do is a work-around for broken guests.

So let's not pretend that we can, just enable nop
unconditionally and be done with it.
Paolo already said it's OK with him, and I'll ack too.

Otherwise you are giving bad information to well-behaved guests:
Linux, for example, will then try to use mwait. You don't want this.

The advantage is that if CPUs can at some point
actually support mwait in VMs, we will then
enable the CPUID bit, and userspace and guests
will be able to detect that and rely on that bit
to mean "mwait works and is efficient".
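
As a rough sketch of what "rely on that bit" looks like from the guest
side: MONITOR/MWAIT support is reported in CPUID leaf 1, ECX bit 3. The
helper names below are hypothetical (they are not taken from Linux or
KVM), and the "hlt" fallback is only an assumption about what a
well-behaved guest would do when the bit is clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 1, ECX bit 3: MONITOR/MWAIT supported. */
#define CPUID_1_ECX_MONITOR (UINT32_C(1) << 3)

/*
 * Hypothetical helper: takes the ECX value returned by CPUID leaf 1
 * and reports whether the guest is allowed to use mwait.
 */
bool guest_may_use_mwait(uint32_t cpuid_1_ecx)
{
    return (cpuid_1_ecx & CPUID_1_ECX_MONITOR) != 0;
}

/*
 * Hypothetical helper: picks an idle method. Falling back to "hlt"
 * when the bit is clear is an assumption made for illustration.
 */
const char *idle_method(uint32_t cpuid_1_ecx)
{
    return guest_may_use_mwait(cpuid_1_ecx) ? "mwait" : "hlt";
}
```

With the bit left clear, such a guest never reaches the mwait path, so
the unconditional nop only matters for guests that skip this check.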

-- 
MST