Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Michael S. Tsirkin
On Thu, Sep 18, 2014 at 09:18:37PM +0200, Stefan Fritsch wrote:
 On Monday 01 September 2014 09:37:30, Michael S. Tsirkin wrote:
  Why do we need INT#x?
  How about setting IRQF_SHARED for the config interrupt
  while using MSI-X? You'd have to read ISR to check that the
  interrupt was intended for your device.
 
 The virtio 0.9.5 spec says that ISR is unused when in MSI-X mode. I 
 don't think that you can depend on the device to set the configuration 
 changed bit.
 The virtio 1.0 spec seems to have fixed that.

Yes, virtio 0.9.5 has this bug. But in practice qemu always sets this
bit, so for qemu we could do that unconditionally.  Unfortunately, Pekka's
lkvm tool doesn't.  It's easy to fix that, but it would be nicer to
additionally probe for old versions of the tool, and disable IRQF_SHARED
in that case.
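
As a rough illustration of the ISR check discussed here, a shared config
interrupt handler would read the ISR status and bail out when the config
change bit is not set. The sketch below is a user-space model, not driver
code: struct fake_vdev, read_isr() and the irq_result values are
hypothetical names made up for the example, and only the bit value 0x2
for the config change flag comes from the legacy virtio-pci layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Config change bit in the legacy virtio-pci ISR status byte. */
#define VIRTIO_PCI_ISR_CONFIG 0x2

/* Hypothetical result codes for this sketch only. */
enum irq_result { IRQ_NOT_MINE = 0, IRQ_HANDLED_CONFIG = 1 };

/* Hypothetical device model: isr stands in for the ISR status register. */
struct fake_vdev {
    uint8_t isr;
    unsigned config_changes;
};

/* Model of a read-to-clear register: return the value, then reset it. */
static uint8_t read_isr(struct fake_vdev *dev)
{
    uint8_t v = dev->isr;
    dev->isr = 0;
    return v;
}

/*
 * Shared handler: with IRQF_SHARED the interrupt may belong to another
 * device, so only claim it when this device reports a config change.
 */
static enum irq_result shared_config_irq(struct fake_vdev *dev)
{
    assert(dev != NULL);
    uint8_t isr = read_isr(dev);

    if (!(isr & VIRTIO_PCI_ISR_CONFIG))
        return IRQ_NOT_MINE;

    dev->config_changes++;
    return IRQ_HANDLED_CONFIG;
}
```

A device that never sets the bit under MSI-X (as described for old lkvm)
would always take the IRQ_NOT_MINE path, which is why sharing cannot be
enabled unconditionally.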

To complicate things, lkvm does not use a distinct subsystem vendor ID,
in spite of the fact the virtio spec always required this explicitly.

After poking at things, we could probably try to distinguish old lkvm
based on BAR sizes. I think lkvm has:
#define IOPORT_SIZE 0x400
This is the size of the IO BAR (bar0), correct?
Qemu's BAR is smaller.

So if
1. new versions of lkvm are fixed to always set ISR on config change
  even when msi is enabled
2. lkvm folks can promise not to make bar0 size smaller *before*
  fixing (1)

then we could use the heuristic:
bar size == 0x400
to clear IRQF_SHARED.

Cc some lkvm folks for all of the above: would you guys be
happier with some other heuristic?

I'd like to note that lkvm really should get some vendor to request and
then donate a subsystem vendor id (registered with pci sig) for their
use, instead of pretending they are qemu.

AFAIK a subsystem vendor id does not cost money to register, but
only pci sig members can do this, and membership costs $3000.
Maybe we should combine all this with checking subsystem vendor id,
and only implement the optimization if it matches qemu, for now.
This needs some thought.
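
Combining the two checks could look roughly like the sketch below. It is
only a model under stated assumptions: struct vdev_ids and
may_share_config_irq() are hypothetical names, the 0x400 value is the
lkvm IOPORT_SIZE mentioned above, and 0x1af4 is the Red Hat/Qumranet ID
that qemu uses as subsystem vendor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QEMU_SUBSYS_VENDOR_ID 0x1af4 /* Red Hat / Qumranet, used by qemu */
#define LKVM_LEGACY_BAR0_SIZE 0x400  /* IOPORT_SIZE as quoted for lkvm */

/* Hypothetical summary of what the driver would read from PCI config. */
struct vdev_ids {
    uint16_t subsys_vendor;
    uint32_t bar0_size;
};

/*
 * Return true when sharing the config interrupt looks safe:
 * the device claims to be qemu and does not have the old lkvm BAR size.
 */
static bool may_share_config_irq(const struct vdev_ids *ids)
{
    assert(ids != NULL);

    /* Unknown environment: stay conservative, do not share. */
    if (ids->subsys_vendor != QEMU_SUBSYS_VENDOR_ID)
        return false;

    /* lkvm reuses qemu's ID, so fall back to the BAR size heuristic. */
    if (ids->bar0_size == LKVM_LEGACY_BAR0_SIZE)
        return false;

    return true;
}
```

The heuristic only holds as long as condition (2) above does, that is,
lkvm keeps bar0 at 0x400 until it sets ISR on config change.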

-- 
MST
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH RFC 1/2] virtio: support for urgent descriptors

2014-09-21 Thread Michael S. Tsirkin
On Wed, Jul 09, 2014 at 09:58:43AM +0930, Rusty Russell wrote:
 Michael S. Tsirkin <m...@redhat.com> writes:
  Below should be useful for some experiments Jason is doing.
  I thought I'd send it out for early review/feedback.
 
  Compiled-only at this point.
 
 It's not a terrible idea, but it will come down to how effective it is
 in practice.
 
 I'm tempted to make it v1.0 only though.
 
 Cheers,
 Rusty.

Me too.


Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Stefan Fritsch
On Sunday 21 September 2014 11:09:14, Michael S. Tsirkin wrote:
 On Thu, Sep 18, 2014 at 09:18:37PM +0200, Stefan Fritsch wrote:
  On Monday 01 September 2014 09:37:30, Michael S. Tsirkin wrote:
   Why do we need INT#x?
   How about setting IRQF_SHARED for the config interrupt
   while using MSI-X? You'd have to read ISR to check that the
   interrupt was intended for your device.
 
  
 
  The virtio 0.9.5 spec says that ISR is unused when in MSI-X
  mode. I  don't think that you can depend on the device to set the
  configuration changed bit.
  The virtio 1.0 spec seems to have fixed that.
 
 Yes, virtio 0.9.5 has this bug. But in practice qemu always set this
 bit, so for qemu we could do that unconditionally.  Pekka's lkvm
 tool doesn't unfortunately.  It's easy to fix that, but it would be
 nicer to additionally probe for old versions of the tool, and
 disable IRQF_SHARED in that case.

What about other implementations? I think Linux should try to conform 
to the spec so that all device implementations which conform to the 
spec just work.

One implementation that comes to mind is virtualbox. But from a quick 
look at the source, it seems that it sets the ISR bit always, too. And 
it uses qemu's subsystem vendor id.

But there are other implementations. For example bhyve.


 AFAIK a subsystem vendor id does not cost money to register, but
 only pci sig members can do this, and membership costs $3000.
 Maybe we should combine all this with checking subsystem vendor id,
 and only implement the optimization if it matches qemu, for now.
 This needs some thought.

Maybe the virtio spec should include a way to query the vendor that 
does not involve the pci sig. Maybe use a string? Then no registry 
would be necessary.


Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Michael S. Tsirkin
On Sun, Sep 21, 2014 at 11:36:44AM +0200, Stefan Fritsch wrote:
 On Sunday 21 September 2014 11:09:14, Michael S. Tsirkin wrote:
  On Thu, Sep 18, 2014 at 09:18:37PM +0200, Stefan Fritsch wrote:
   On Monday 01 September 2014 09:37:30, Michael S. Tsirkin wrote:
Why do we need INT#x?
How about setting IRQF_SHARED for the config interrupt
while using MSI-X? You'd have to read ISR to check that the
interrupt was intended for your device.
  
   
  
   The virtio 0.9.5 spec says that ISR is unused when in MSI-X
   mode. I  don't think that you can depend on the device to set the
   configuration changed bit.
   The virtio 1.0 spec seems to have fixed that.
  
  Yes, virtio 0.9.5 has this bug. But in practice qemu always set this
  bit, so for qemu we could do that unconditionally.  Pekka's lkvm
  tool doesn't unfortunately.  It's easy to fix that, but it would be
  nicer to additionally probe for old versions of the tool, and
  disable IRQF_SHARED in that case.
 
 What about other implementations? I think Linux should try to conform 
 to the spec so that all device implementations which conform to the 
 spec just work.
 
 One implementation that comes to mind is virtualbox. But from a quick 
 look at the source, it seems that it sets the ISR bit always, too. And 
 it uses qemu's subsystem vendor id.
 
 But there are other implementations. For example bhyve.

I couldn't find any code in bhyve that sets VTCFG_ISR_CONF_CHANGED.
Maybe it doesn't generate config changed interrupts?

bhyve sets subsystem vendor to 0 apparently?
We could use that to detect it.

But maybe we should just make it a 1.0 only feature.

 
  AFAIK a subsystem vendor id does not cost money to register, but
  only pci sig members can do this, and membership costs $3000.
  Maybe we should combine all this with checking subsystem vendor id,
  and only implement the optimization if it matches qemu, for now.
  This needs some thought.
 
 Maybe the virtio spec should include a way to query the vendor that 
 does not involve the pci sig. Maybe use a string? Then no registry 
 would be necessary.

We can make the requirement for the vendor specific ID stronger in 1.0,
SHOULD instead of MAY.

But it seems that people will still copy-paste working code
across hypervisors, I'm not sure this can be helped.

-- 
MST


Re: Standardizing an MSR or other hypercall to get an RNG seed?

2014-09-21 Thread Paolo Bonzini
Il 19/09/2014 22:46, Andy Lutomirski ha scritto:
 
  However, it sounds to me that at least for KVM, it is very easy just to 
  emulate the RDRAND instruction. The hypervisor would report to the guest 
  that RDRAND is supported in CPUID and then emulate the instruction when 
  guest executes it. KVM already traps guest #UD (which would occur if 
  RDRAND executed while it is not supported) - so this scheme wouldn’t 
  introduce additional overhead over RDMSR.
 Because then guest user code will think that rdrand is there and will
 try to use it, resulting in abysmal performance.

KVM could expose a CPUID leaf that says RDRAND is not there, but if you
execute it the hypervisor will try to do something slow but sane.

Paolo


Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Sasha Levin
On 09/21/2014 04:09 AM, Michael S. Tsirkin wrote:
 The virtio 0.9.5 spec says that ISR is unused when in MSI-X mode. I 
  don't think that you can depend on the device to set the configuration 
  changed bit.
  The virtio 1.0 spec seems to have fixed that.
 Yes, virtio 0.9.5 has this bug. But in practice qemu always set this
 bit, so for qemu we could do that unconditionally.  Pekka's lkvm tool
 doesn't unfortunately.  It's easy to fix that, but it would be nicer to
 additionally probe for old versions of the tool, and disable IRQF_SHARED
 in that case.

 To complicate things, lkvm does not use a distinct subsystem vendor ID,
 in spite of the fact the virtio spec always required this explicitly.

I think I may be a bit confused here, but AFAIK we do set subsystem vendor
ID properly for our virtio-pci devices?

    vpci->pci_hdr = (struct pci_device_header) {
        .vendor_id          = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
        .device_id          = cpu_to_le16(device_id),
        [...]
        .subsys_vendor_id   = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET),


Thanks,
Sasha


VM lazy restore

2014-09-21 Thread Prateek Sharma
Hi all,
I was wondering if there is a way to restore a VM lazily (load
pages from disk on demand, instead of loading all pages at once).
   Alternatively, is there a way to get a trace of pages accessed by
the VM? MMU_notifiers, qemu-tracepoints?

Thanks!
--Prateek


Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Michael S. Tsirkin
On Sun, Sep 21, 2014 at 09:47:51AM -0400, Sasha Levin wrote:
 On 09/21/2014 04:09 AM, Michael S. Tsirkin wrote:
  The virtio 0.9.5 spec says that ISR is unused when in MSI-X mode. I 
   don't think that you can depend on the device to set the configuration 
   changed bit.
   The virtio 1.0 spec seems to have fixed that.
  Yes, virtio 0.9.5 has this bug. But in practice qemu always set this
  bit, so for qemu we could do that unconditionally.  Pekka's lkvm tool
  doesn't unfortunately.  It's easy to fix that, but it would be nicer to
  additionally probe for old versions of the tool, and disable IRQF_SHARED
  in that case.
 
  To complicate things, lkvm does not use a distinct subsystem vendor ID,
  in spite of the fact the virtio spec always required this explicitly.
 
 I think I may be a bit confused here, but AFAIK we do set subsystem vendor
 ID properly for our virtio-pci devices?
 
     vpci->pci_hdr = (struct pci_device_header) {
         .vendor_id          = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
         .device_id          = cpu_to_le16(device_id),
         [...]
         .subsys_vendor_id   = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET),
 
 
 Thanks,
 Sasha


Yes but the spec says:
The Subsystem Vendor ID should reflect the PCI Vendor ID of the 
environment.

IOW lkvm shouldn't reuse the ID from qemu, it should have its own
(qemu and lkvm hypervisors being a different environment).

virtio 1.0 has weakened this requirement:
The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
reflect the PCI Vendor and Device
ID of the environment (for informational purposes by the driver).

I reasoned that since it's for informational purposes only, there's no
reason to make it a SHOULD.

It might or might not be a good idea to change it back, worth
considering.


-- 
MST


Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Sasha Levin
On 09/21/2014 11:02 AM, Michael S. Tsirkin wrote:
 On Sun, Sep 21, 2014 at 09:47:51AM -0400, Sasha Levin wrote:
  On 09/21/2014 04:09 AM, Michael S. Tsirkin wrote:
   The virtio 0.9.5 spec says that ISR is unused when in MSI-X mode. I 
don't think that you can depend on the device to set the 
configuration 
changed bit.
The virtio 1.0 spec seems to have fixed that.
   Yes, virtio 0.9.5 has this bug. But in practice qemu always set this
   bit, so for qemu we could do that unconditionally.  Pekka's lkvm tool
   doesn't unfortunately.  It's easy to fix that, but it would be nicer to
   additionally probe for old versions of the tool, and disable IRQF_SHARED
   in that case.
  
   To complicate things, lkvm does not use a distinct subsystem vendor ID,
   in spite of the fact the virtio spec always required this explicitly.
  
  I think I may be a bit confused here, but AFAIK we do set subsystem vendor
  ID properly for our virtio-pci devices?
  
      vpci->pci_hdr = (struct pci_device_header) {
          .vendor_id          = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
          .device_id          = cpu_to_le16(device_id),
          [...]
          .subsys_vendor_id   = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET),
  
  
  Thanks,
  Sasha
 
 Yes but the spec says:
   The Subsystem Vendor ID should reflect the PCI Vendor ID of the 
 environment.
 
 IOW lkvm shouldn't reuse the ID from qemu, it should have its own
 (qemu and lkvm hypervisors being a different environment).
 
 virtio 1.0 have weakened this requirement:
   The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
   reflect the PCI Vendor and Device
   ID of the environment (for informational purposes by the driver).
 
 I reasoned that since it's for informational purposes only, there's no
 reason to make it a SHOULD.
 
 It might or might not be a good idea to change it back, worth
 considering.

Ow. The 0.9.5 spec also says:

(it's currently only used for informational purposes by the guest).

That, combined with the use of "should" rather than "must" (recommended rather
than required), prompted us to just put something that works in there and leave
it be.


Thanks,
Sasha


Re: [PATCH RFC] virtio-pci: share config interrupt between virtio devices

2014-09-21 Thread Michael S. Tsirkin
On Sun, Sep 21, 2014 at 11:19:36AM -0400, Sasha Levin wrote:
 On 09/21/2014 11:02 AM, Michael S. Tsirkin wrote:
  On Sun, Sep 21, 2014 at 09:47:51AM -0400, Sasha Levin wrote:
   On 09/21/2014 04:09 AM, Michael S. Tsirkin wrote:
The virtio 0.9.5 spec says that ISR is unused when in MSI-X mode. 
I 
 don't think that you can depend on the device to set the 
 configuration 
 changed bit.
 The virtio 1.0 spec seems to have fixed that.
Yes, virtio 0.9.5 has this bug. But in practice qemu always set this
bit, so for qemu we could do that unconditionally.  Pekka's lkvm tool
doesn't unfortunately.  It's easy to fix that, but it would be nicer 
to
additionally probe for old versions of the tool, and disable 
IRQF_SHARED
in that case.
   
To complicate things, lkvm does not use a distinct subsystem vendor 
ID,
in spite of the fact the virtio spec always required this explicitly.
   
   I think I may be a bit confused here, but AFAIK we do set subsystem 
   vendor
   ID properly for our virtio-pci devices?
   
       vpci->pci_hdr = (struct pci_device_header) {
           .vendor_id          = cpu_to_le16(PCI_VENDOR_ID_REDHAT_QUMRANET),
           .device_id          = cpu_to_le16(device_id),
           [...]
           .subsys_vendor_id   = cpu_to_le16(PCI_SUBSYSTEM_VENDOR_ID_REDHAT_QUMRANET),
   
   
   Thanks,
   Sasha
  
  Yes but the spec says:
  The Subsystem Vendor ID should reflect the PCI Vendor ID of the 
  environment.
  
  IOW lkvm shouldn't reuse the ID from qemu, it should have its own
  (qemu and lkvm hypervisors being a different environment).
  
  virtio 1.0 have weakened this requirement:
  The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY
  reflect the PCI Vendor and Device
  ID of the environment (for informational purposes by the driver).
  
  I reasoned that since it's for informational purposes only, there's no
  reason to make it a SHOULD.
  
  It might or might not be a good idea to change it back, worth
  considering.
 
 Ow. The 0.9.5 spec also says:
 
   (it's currently only used for informational purposes by the guest).
 
 That and the combination of should rather then must (recommended rather 
 than
 required) prompted us to just put something that works in there and leave it 
 be.
 
 
 Thanks,
 Sasha

Note the word "currently", as well as "should", which means: before you
decide not to follow it, make sure you understand the implications.


-- 
MST


[PATCH] x86:kvm: fix two typos in comment

2014-09-21 Thread Tiejun Chen
s/drity/dirty and s/vmsc01/vmcs01

Signed-off-by: Tiejun Chen <tiejun.c...@intel.com>
---
 arch/x86/kvm/mmu.c | 2 +-
 arch/x86/kvm/vmx.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 76398fe..f76bc19 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1174,7 +1174,7 @@ static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep)
  * Write-protect on the specified @sptep, @pt_protect indicates whether
  * spte write-protection is caused by protecting shadow page table.
  *
- * Note: write protection is difference between drity logging and spte
+ * Note: write protection is difference between dirty logging and spte
  * protection:
  * - for dirty logging, the spte can be set to writable at anytime if
  *   its dirty bitmap is properly set.
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6ffd643..305e667 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8008,7 +8008,7 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu)
 /*
  * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested
  * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function merges it
- * with L0's requirements for its guest (a.k.a. vmsc01), so we can run the L2
+ * with L0's requirements for its guest (a.k.a. vmcs01), so we can run the L2
  * guest in a way that will both be appropriate to L1's requests, and our
  * needs. In addition to modifying the active vmcs (which is vmcs02), this
  * function also has additional necessary side-effects, like setting various
-- 
1.9.1



Re: [PATCH RFC 2/2] vhost: support urgent descriptors

2014-09-21 Thread Jason Wang
On 09/20/2014 06:00 PM, Paolo Bonzini wrote:
 Il 19/09/2014 09:10, Jason Wang ha scritto:
  
 -  if (!vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) {
 +  if (vq->urgent || !vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) {
 So the urgent descriptor only works when event index is not enabled?
 This seems suboptimal, we may still want to benefit from event index
 even if urgent descriptor is used. Looks like we need to return true here
 when vq->urgent is true?
 It's ||, not &&.

 Without event index, all descriptors are treated as urgent.

 Paolo


The problem is that if vq->urgent is true, the patch checks the
VRING_AVAIL_F_NO_INTERRUPT bit. This bit is cleared unconditionally in
virtqueue_enable_cb(), regardless of the event index feature, and set
unconditionally in virtqueue_disable_cb(). So virtqueue_enable_cb() is
used not only to publish a new event index but also to enable the urgent
descriptor, and virtqueue_disable_cb() disables all interrupts, including
those for urgent descriptors. The guest won't get urgent interrupts just by
adding virtqueue_add_outbuf_urgent(), since what it needs is to enable and
disable interrupts for !urgent descriptors.
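
To make the interaction concrete, here is a small stand-alone model of
the signalling decision implied by the quoted hunk. It is a sketch, not
the vhost code: struct notify_state and should_signal() are hypothetical,
need_event() repeats the arithmetic of vring_need_event(), and the flag
value 1 for VRING_AVAIL_F_NO_INTERRUPT matches the virtio ring header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VRING_AVAIL_F_NO_INTERRUPT 1

/* Same wrap-around comparison as vring_need_event(). */
static bool need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Hypothetical snapshot of the state the host looks at before signalling. */
struct notify_state {
    bool urgent;            /* descriptor was added as urgent */
    bool event_idx_enabled; /* VIRTIO_RING_F_EVENT_IDX negotiated */
    uint16_t avail_flags;   /* flags written by the guest */
    uint16_t used_event;    /* event index published by the guest */
    uint16_t old_used;      /* used index before this batch */
    uint16_t new_used;      /* used index after this batch */
};

static bool should_signal(const struct notify_state *s)
{
    assert(s != NULL);

    /* Urgent descriptors fall back to the flag-based check. */
    if (s->urgent || !s->event_idx_enabled)
        return !(s->avail_flags & VRING_AVAIL_F_NO_INTERRUPT);

    return need_event(s->used_event, s->new_used, s->old_used);
}
```

With this model, an urgent descriptor is still suppressed whenever the
guest has VRING_AVAIL_F_NO_INTERRUPT set, which is the case described
above.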

Btw, I'm not sure "urgent" is a suitable name, since interrupts are often
slow in a kvm guest. In fact, virtio-net will probably use urgent
descriptors for packets (e.g. stream packets, which can be delayed a
little to batch more bytes from userspace) that are not urgent
compared to other packets.




Re: Standardizing an MSR or other hypercall to get an RNG seed?

2014-09-21 Thread Alok Kataria
Hi Andy,

On Fri, 2014-09-19 at 11:20 -0700, Andy Lutomirski wrote:
 [cc: Alok Kataria at VMware]
 
 On Fri, Sep 19, 2014 at 11:12 AM, Gleb Natapov g...@kernel.org wrote:
  On Fri, Sep 19, 2014 at 11:02:38AM -0700, Andy Lutomirski wrote:
  On Fri, Sep 19, 2014 at 10:49 AM, Gleb Natapov g...@kernel.org wrote:
   On Fri, Sep 19, 2014 at 10:18:37AM -0700, H. Peter Anvin wrote:
   On 09/19/2014 10:15 AM, Gleb Natapov wrote:
On Fri, Sep 19, 2014 at 10:08:20AM -0700, H. Peter Anvin wrote:
On 09/19/2014 09:53 AM, Gleb Natapov wrote:
On Fri, Sep 19, 2014 at 09:40:07AM -0700, H. Peter Anvin wrote:
On 09/19/2014 09:37 AM, Gleb Natapov wrote:
   
Linux detects what hypervisor it runs on very early
   
Not anywhere close to early enough.  We're talking for uses like kASLR.
   
Still too early to do:
   
   h = cpuid(HYPERVISOR_SIGNATURE)
   if (h == KVMKVMKVM) {
      if (cpuid(kvm_features) & kvm_rnd)
         rdmsr(kvm_rnd)
   } else if (h == HyperV) {
      if (cpuid(hv_features) & hv_rnd)
         rdmsr(hv_rnd)
   } else if (h == XenXenXen) {
      if (cpuid(xen_features) & xen_rnd)
         rdmsr(xen_rnd)
   }
   
   
If we need to do chase loops, especially not so...
   
What loops exactly? As a non native English speaker I fail to 
understand
if your answer is yes or no ;)
   
  
   The above isn't actually the full algorithm used.
  
   What part of the actual algorithm cannot be implemented? The loop that searches
   for the KVM leaf in case KVM pretends to be HyperV (is this what you called
   chase loops?)? First of all, there is no need to implement it: if KVM
   pretends to be HyperV, use HyperV's way to obtain the RNG. But what is the
   problem with the loop?
  
 
  It can be implemented, and I've done it.  But it's a mess.  Almost the
  very first thing we do in boot (even before decompressing the kernel)
  will be to scan a bunch of cpuid leaves looking for a hypervisor with
  an rng source that we can use for kASLR.  And we'll have to update
  that code and make it bigger every time another hypervisor adds
  exactly the same feature.
  IMO implementing this feature is in the hypervisor's best interest, so the task
  of updating the code will scale by virtue of each hypervisor's developers
  adding it for the hypervisor they care about.
 
 I assume that you mean guest, not hypervisor.
 
 
 
  And then we have another copy of almost exactly the same code in the
  normal post-boot part of the kernel.
 
  We can certainly do this, but I'd much rather solve the problem once
  and let all of the hypervisors and guests opt in and immediately be
  compatible with each other.
 
   I forgot VMware because I do not see any VMware people CCed. They may
   be even less excited about being told _how_ this feature needs to be
   implemented (e.g. implement HyperV leaves for the feature detection). I
   do not want to and cannot speak for VMware, but my guess is that for
   them it would be much easier to add an else clause for VMware in the above
   if than to coordinate with all hypervisor developers about MSR/cpuid
   details. And since this is a security feature, implementing it for Linux
   is in their best interest.
 
  Do you know any of them who should be cc'd?
 
  No, not anyone in particular. git log arch/x86/kernel/cpu/vmware.c may help.
 
  But VMware is an elephant in the room here. There are other hypervisors out 
  there.
  VirtualBox, bhyve...
 
 Exactly.  The amount of effort to get everything to be compatible with
 everything scales quadratically in the number of hypervisors, and the
 probability that some combination is broken also increases.
 
 If we can get everyone to back something common here then this problem
 goes away.
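
For reference, the signature-matching step of the per-hypervisor chain
quoted earlier can be sketched in plain C as below. This is only an
illustration: classify_hypervisor() and enum hv_kind are hypothetical
names, the input is assumed to be the 12-byte signature returned in
EBX/ECX/EDX by CPUID leaf 0x40000000, and the per-hypervisor RNG feature
bits and MSRs from the pseudocode are deliberately left out because no
such interface is defined in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only in this sketch. */
enum hv_kind { HV_NONE, HV_KVM, HV_HYPERV, HV_XEN, HV_VMWARE };

/*
 * sig: 12 bytes as returned in EBX, ECX, EDX by CPUID leaf 0x40000000.
 * Every new hypervisor needs another row here, which is the scaling
 * problem raised in the discussion.
 */
static enum hv_kind classify_hypervisor(const char sig[12])
{
    static const struct {
        const char *sig;
        enum hv_kind kind;
    } table[] = {
        { "KVMKVMKVM\0\0\0", HV_KVM },
        { "Microsoft Hv",    HV_HYPERV },
        { "XenVMMXenVMM",    HV_XEN },
        { "VMwareVMware",    HV_VMWARE },
    };

    assert(sig != NULL);

    for (size_t i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (memcmp(sig, table[i].sig, 12) == 0)
            return table[i].kind;
    }
    return HV_NONE;
}
```

A hypervisor that reports another vendor's signature (the "KVM pretends
to be HyperV" case) would be classified by the signature it exposes, so
any leaf search for the real identity would have to be added on top.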

There was a similar attempt a few years back [1] to standardize on the
hypervisor cpuid space. Though a few of them were interested, getting
all hypervisor vendors to agree (actually, even to discuss this) turned out
to be a futile exercise. I don't mean to discourage you, but what I
learned from that attempt was that it's very difficult to standardize
unless the hardware vendors are proposing it.

In any case, can you point me to a mail which discusses the specifics of
the interface you are proposing?

Alok

[1] - http://thread.gmane.org/gmane.comp.emulators.kvm.devel/22643
  https://lkml.org/lkml/2008/9/26/351

