Re: cpu hotplug

2010-09-21 Thread Gleb Natapov
On Mon, Sep 20, 2010 at 09:07:16PM -0400, Kevin O'Connor wrote:
 On Mon, Sep 20, 2010 at 08:50:17AM +0200, Gleb Natapov wrote:
  On Sun, Sep 19, 2010 at 06:03:31PM -0400, Kevin O'Connor wrote:
   I was wrong.  The cpu_set x offline does send an event to the guest
   OS.  SeaBIOS even forwards the event along - as far as I can tell a
   Notify(CPxx, 3) event is generated by SeaBIOS.
   
   My Windows 7 ultimate beta seems to receive the event okay (it pops up
   a dialog box which says you can't unplug cpus).
   
  It may react to Eject() method.
 
 The eject method is called by the OS to notify the host.  Right now
 SeaBIOS's eject method doesn't do anything.
 
Yes. What I meant is that it may react to the presence of an Eject()
method. In my experience Windows considers all devices with an Eject()
method to be hot-pluggable. But actually, IIRC, Windows 7 gave me this
dialog box with the BOCHS BIOS too, and there we didn't have an Eject()
method.

   Unfortunately, my test linux guest OS (FC13) doesn't seem to do
   anything with the unplug Notify event.  I've tried with the original
   FC13 and with a fully updated version - no luck.
   
   So, I'm guessing this has something to do with the guest OS.
   
  Can you verify that _STA() returns zero after cpu unplug?
 
 I've verified that.  I've also verified that Linux doesn't call the
 _STA method after Notify(CPxx, 3).  It does call _STA on startup and
 after a Notify(CPxx, 1) event.  So, the Linux kernel in my FC13 guest
 just seems to be ignoring Notify(3) events.  (According to ACPI spec,
 the guest should shutdown the cpu and then call the eject method.)
 
In older kernels _STA was called on Notify(3), but cpu hot-plug in
Linux was changed recently. Can you check what happens if you call
Notify(1) on unplug? The spec says the value means:

 Device Check. Used to notify OSPM that the device either
 appeared or disappeared.

so maybe it should be called on both hot-plug and hot-unplug.

--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] vhost-net: add dhclient work-around from userspace

2010-09-21 Thread xming
 Newer versions of dhclient should also be OK: they detect
 that checksum is missing in the packet. Try it e.g. with
 a recent fedora guest as a client.

I don't have fedora, but with the latest release (4.1.1-P1) on isc.org
it still behaves the same (see output at the bottom).

 To solve the problem for old clients, recent kernels and iptables have
 support for CHECKSUM target.

 You can use this target to compute and fill in the checksum in
 a packet that lacks a checksum.

 Typical expected use:
 iptables -A POSTROUTING -t mangle -p udp --dport bootpc \
 -j CHECKSUM --checksum-fill

Nice trick :D

 libvirt will program these for you if it sets up the server,
 maybe there needs to be a flag to tell it that the server is local.

I don't use libvirt.

My point is, there don't seem to be many working clients: the only
working client is a very old one (pump), and newer clients do not work,
contrary to what you explained.

To repeat myself, here is the situation:

- DHCP server with vhost_net: all clients w/o vhost_net work; clients
with vhost_net do not work, except pump
- DHCP server w/o vhost_net: all clients work
- physical DHCP server: clients with vhost *do* work.


--- output of the latest DHCP client ---

Internet Systems Consortium DHCP Client 4.1.1-P1
Copyright 2004-2010 Internet Systems Consortium.
All rights reserved.
For info, please visit https://www.isc.org/software/dhcp/

Listening on LPF/eth0/00:16:3e:00:07:01
Sending on   LPF/eth0/00:16:3e:00:07:01
Sending on   Socket/fallback
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 6
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 13
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 14
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 10
DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 18
5 bad udp checksums in 5 packets
No DHCPOFFERS received.
No working leases in persistent database - sleeping.


Re: KVM call agenda for Sept 21

2010-09-21 Thread Avi Kivity

 On 09/21/2010 05:37 AM, Nakajima, Jun wrote:

Avi Kivity wrote on Mon, 20 Sep 2010 at 09:50:55:

On 09/20/2010 06:44 PM, Chris Wright wrote:
  Please send in any agenda items you are interested in covering.

   nested vmx: the resurrection.  Nice to see it progressing again, but
  there's still a lot of ground to cover.  Perhaps we can involve Intel to
  speed things up?

Hi, Avi

What are you looking for?




Help in getting the patchset in.  Reviewing is always appreciated (while 
it tends to increase the time, the result is usually better).  If we can 
find a way to share the work, even better.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



Re: [PATCH 3/4] KVM: VMX: Emulated real mode interrupt injection

2010-09-21 Thread Avi Kivity

 On 09/20/2010 07:30 PM, Marcelo Tosatti wrote:

   static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
u32 idt_vectoring_info,
int instr_len_field,
  @@ -3864,9 +3814,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
int type;
bool idtv_info_valid;

  - if (vmx->rmode.irq.pending)
  - fixup_rmode_irq(vmx, &idt_vectoring_info);
  -

Don't you have to undo kvm_inject_realmode_interrupt if injection fails?




Injection cannot fail (at least, in the same sense as the vmx 
injections).  It's actually not about failures, it's about guest entry 
being cancelled due to a signal or some KVM_REQ that needs attention.  
For vmx style injections, we need to undo the injection to keep things 
in a consistent state.  For realmode emulated injection, everything is in 
a consistent state already, so there is no need to undo anything (it's also 
impossible, since we overwrote memory on the stack).



--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.



[PATCH] support piix PAM registers in KVM

2010-09-21 Thread Gleb Natapov
Without this, the BIOS fails to remap the 0xf segment from ROM to RAM,
so writes to the F-segment modify ROM content instead of the RAM copy.
Since QEMU does not reload ROMs during reset, the modified copy of the
BIOS is used on the next boot.

Signed-off-by: Gleb Natapov g...@redhat.com
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 933ad86..0bf435d 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -99,10 +99,6 @@ static void i440fx_update_memory_mappings(PCII440FXState *d)
 int i, r;
 uint32_t smram, addr;
 
-if (kvm_enabled()) {
-/* FIXME: Support remappings and protection changes. */
-return;
-}
 update_pam(d, 0xf0000, 0x100000, (d->dev.config[I440FX_PAM] >> 4) & 3);
 for(i = 0; i < 12; i++) {
 r = (d->dev.config[(i >> 1) + (I440FX_PAM + 1)] >> ((i & 1) * 4)) & 3;
--
Gleb.


Re: Tracing KVM with Systemtap

2010-09-21 Thread Rayson Ho
On Mon, 2010-09-20 at 14:36 +0100, Stefan Hajnoczi wrote:
 Right now there are few pre-defined probes (trace events in QEMU
 tracing speak).  As I develop I try to be mindful of new ones I create
 and whether they would be generally useful.  I intend to contribute
 more probes and hope others will too!

I am still looking at/hacking the QEMU code. I have looked at the
following places in the code that I think can be useful to have
statistics gathered:

net.c qemu_deliver_packet(), etc - network statistics

CPU Arch/op_helper.c global_cpu_lock(), tlb_fill() - lock & unlock,
and TLB refill statistics

balloon.c, hw/virtio-balloon.c - ballooning information.

Besides the ballooning part (I know what it is but don't fully
understand how it works), the other parts can be implemented as
Systemtap tapsets (~ DTrace scripts) in the initial stage.

I will see what other probes are useful for the end users. Also, is
there developer documentation for KVM? (I googled but found a lot of
presentations about KVM but not a lot of info about the internals.)

Rayson



 
 Prerna is also looking at adding useful probes.
 
 Stefan




Re: [RFC PATCH v9 12/16] Add mp(mediate passthru) device.

2010-09-21 Thread Michael S. Tsirkin
On Tue, Sep 21, 2010 at 09:39:31AM +0800, Xin, Xiaohui wrote:
 From: Michael S. Tsirkin [mailto:m...@redhat.com]
 Sent: Monday, September 20, 2010 7:37 PM
 To: Xin, Xiaohui
 Cc: net...@vger.kernel.org; kvm@vger.kernel.org; 
 linux-ker...@vger.kernel.org;
 mi...@elte.hu; da...@davemloft.net; herb...@gondor.hengli.com.au;
 jd...@linux.intel.com
 Subject: Re: [RFC PATCH v9 12/16] Add mp(mediate passthru) device.
 
 On Mon, Sep 20, 2010 at 04:08:48PM +0800, xiaohui@intel.com wrote:
  From: Xin Xiaohui xiaohui@intel.com
 
  ---
  Michael,
  I have move the ioctl to configure the locked memory to vhost
 
 It's ok to move this to vhost but vhost does not
 know how much memory is needed by the backend.
 
 I think the backend here you mean is mp device.
 Actually, the memory needed is related to vq->num to run zero-copy
 smoothly.
 That means mp device did not know it but vhost did.

Well, this might be so if you insist on locking
all posted buffers immediately. However, let's assume I have a
very large ring and prepost a ton of RX buffers:
there's no need to lock all of them up front:

if we have buffers A and B, we can lock A, pass it
to hardware, and when A is consumed, unlock A, lock B,
and pass it to hardware.


It's not really critical. But note we can always have userspace
tell MP device all it wants to know, after all.

 And the rlimit stuff is per process, we use the current pointer to set
 and check the rlimit, the operations should be in the same process.

Well no, the ring is handled from the kernel thread: we switch the mm to
point to the owner task so copy from/to user and friends work, but you
can't access the rlimit etc.

 Now the check operations are in vhost process, as mp_recvmsg() or
 mp_sendmsg() are called by vhost.

Hmm, what do you mean by the check operations?
send/recv are data path operations, they shouldn't
do any checks, should they?

 So set operations should be in
 vhost process too, it's natural.
 
 So I think we'll need another ioctl in the backend
 to tell userspace how much memory is needed?
 
 Except vhost tells it to mp device, mp did not know
 how much memory is needed to run zero-copy smoothly.
 Is userspace interested about the memory mp is needed?

Couldn't parse this last question.
I think userspace generally does want control over
how much memory we'll lock. We should not just lock
as much as we can.

-- 
MST


Re: Tracing KVM with Systemtap

2010-09-21 Thread Stefan Hajnoczi
On Tue, Sep 21, 2010 at 1:58 PM, Rayson Ho r...@redhat.com wrote:
 On Mon, 2010-09-20 at 14:36 +0100, Stefan Hajnoczi wrote:
 Right now there are few pre-defined probes (trace events in QEMU
 tracing speak).  As I develop I try to be mindful of new ones I create
 and whether they would be generally useful.  I intend to contribute
 more probes and hope others will too!

 I am still looking at/hacking the QEMU code. I have looked at the
 following places in the code that I think can be useful to have
 statistics gathered:

 net.c qemu_deliver_packet(), etc - network statistics

Yes.

 CPU Arch/op_helper.c global_cpu_lock(), tlb_fill() - lock & unlock,
 and TLB refill statistics

These are not relevant to KVM, they are only used when running with
KVM disabled (TCG mode).

 balloon.c, hw/virtio-balloon.c - ballooning information.

Prerna added a balloon event which is in qemu.git trace-events.  Does
that one do what you need?

 I will see what other probes are useful for the end users. Also, are
 there developer documentations for KVM? (I googled but found a lot of
 presentations about KVM but not a lot of info about the internals.)

Not really.  I suggest grabbing the source and following vl.c:main()
to the main KVM execution code.

Stefan


how CPU hot-plug is supposed to work on Linux?

2010-09-21 Thread Gleb Natapov
Hello,

We are trying to add CPU hot-plug/unplug capability to KVM. We want to
be able to initiate hot-plug/unplug from a host. Our current schema
works like this:

We have a Processor object in the DSDT for each potentially available
CPU. Each Processor object has _MAT, _STA and _EJ0. _MAT of a present
CPU returns an enabled LAPIC structure and its _STA returns 0xf. _MAT
of a non-present CPU returns a disabled LAPIC and its _STA returns 0x0.
_EJ0 does nothing.

When CPU is hot plugged:

 1. Bit is set in sts register of gpe
 2. acpi interrupt is sent
 3. Linux ACPI evaluates the corresponding GPE's _L() method
 4. The _L() method determines which CPU's status has changed
 5. For each CPU that changed status from not present to present,
    call Notify(1) on the corresponding Processor() object.

When CPU is hot unplugged:

 1. Bit is set in sts register of gpe
 2. acpi interrupt is sent
 3. Linux ACPI evaluates the corresponding GPE's _L() method
 4. The _L() method determines which CPU's status has changed
 5. For each CPU that changed status from present to not present,
    call Notify(3) on the corresponding Processor() object.

Now, CPU hot-plug appears to be working, but CPU hot-unplug does
nothing. I expect that Linux will offline the CPU and eject it after
evaluating Notify(3) and seeing that _STA of the ejected CPU now
returns 0x0.

Any ideas how it is supposed to work?

--
Gleb.


Re: [PATCH] device-assignment: register a reset function

2010-09-21 Thread Bernhard Kohl

Am 17.09.2010 18:16, schrieb ext Alex Williamson:

On Fri, 2010-09-17 at 17:27 +0200, Bernhard Kohl wrote:
   

This is necessary because during reboot of a VM the assigned devices
continue DMA transfers which causes memory corruption.

Signed-off-by: Thomas Ostler thomas.ost...@nsn.com
Signed-off-by: Bernhard Kohl bernhard.k...@nsn.com
---
  hw/device-assignment.c |   14 ++
  1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 87f7418..fb47813 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -1450,6 +1450,17 @@ static void assigned_dev_unregister_msix_mmio(AssignedDevice *dev)
   dev->msix_table_page = NULL;
   }

+static void reset_assigned_device(void *opaque)
+{
+PCIDevice *d = (PCIDevice *)opaque;
+uint32_t conf;
+
+/* reset the bus master bit to avoid further DMA transfers */
+conf = assigned_dev_pci_read_config(d, PCI_COMMAND, 2);
+conf &= ~PCI_COMMAND_MASTER;
+assigned_dev_pci_write_config(d, PCI_COMMAND, conf, 2);
+}
+
  static int assigned_initfn(struct PCIDevice *pci_dev)
  {
  AssignedDevice *dev = DO_UPCAST(AssignedDevice, dev, pci_dev);
@@ -1499,6 +1510,9 @@ static int assigned_initfn(struct PCIDevice *pci_dev)
  if (r < 0)
  goto assigned_out;

+/* register reset function for the device */
+qemu_register_reset(reset_assigned_device, pci_dev);
+
  /* intercept MSI-X entry page in the MMIO */
  if (dev->cap.available & ASSIGNED_DEVICE_CAP_MSIX)
  if (assigned_dev_register_msix_mmio(dev))
 

Hmm, at a minimum, we need a qemu_unregister_reset() in the exitfn, but
upon further inspection, we should probably just do it the qdev way.
That would mean simply setting qdev.reset to reset_assigned_device() in
assign_info, then we can leave the registration/de-registration to qdev.
Does that work?  Sorry I missed that the first time.  Thanks,

Alex
   


OK, we will rework the patch for qdev.
This might take 2 weeks because of vacation.

Thanks
Bernhard


Re: [PATCH] KVM: x86: mmu: fix counting of rmap entries in rmap_add()

2010-09-21 Thread Marcelo Tosatti
On Sat, Sep 18, 2010 at 08:41:02AM +0800, Hillf Danton wrote:
 It seems that rmap entries are undercounted.
 
 Signed-off-by: Hillf Danton dhi...@gmail.com
 ---

Applied, thanks.



Re: [PATCH 3/4] KVM: VMX: Emulated real mode interrupt injection

2010-09-21 Thread Marcelo Tosatti
On Tue, Sep 21, 2010 at 01:56:50PM +0200, Avi Kivity wrote:
  On 09/20/2010 07:30 PM, Marcelo Tosatti wrote:
static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
 u32 idt_vectoring_info,
 int instr_len_field,
    @@ -3864,9 +3814,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
 int type;
 bool idtv_info_valid;
 
   - if (vmx->rmode.irq.pending)
   - fixup_rmode_irq(vmx, &idt_vectoring_info);
   -
 
 Don't you have to undo kvm_inject_realmode_interrupt if injection fails?
 
 
 
 Injection cannot fail (at least, in the same sense as the vmx
 injections).  It's actually not about failures, it's about guest
 entry being cancelled due to a signal or some KVM_REQ that needs
 attention.  For vmx style injections, we need to undo the injection
 to keep things in a consistent state.  For realmode emulated
 injection, everything is in a consistent state already, so no need
 to undo anything (it's also impossible, since we overwrote memory on
 the stack).

Aren't you going to push EFLAGS,CS,EIP on the stack twice if that
occurs?

Yes, can't undo it...


Re: [PATCH 3/4] KVM: VMX: Emulated real mode interrupt injection

2010-09-21 Thread Avi Kivity

 On 09/21/2010 05:36 PM, Marcelo Tosatti wrote:

On Tue, Sep 21, 2010 at 01:56:50PM +0200, Avi Kivity wrote:
   On 09/20/2010 07:30 PM, Marcelo Tosatti wrote:
  static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
u32 idt_vectoring_info,
int instr_len_field,
  @@ -3864,9 +3814,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
int type;
bool idtv_info_valid;
  
 -  if (vmx->rmode.irq.pending)
 -  fixup_rmode_irq(vmx, &idt_vectoring_info);
 -
  
  Don't you have to undo kvm_inject_realmode_interrupt if injection fails?
  
  

  Injection cannot fail (at least, in the same sense as the vmx
  injections).  It's actually not about failures, it's about guest
  entry being cancelled due to a signal or some KVM_REQ that needs
  attention.  For vmx style injections, we need to undo the injection
  to keep things in a consistent state.  For realmode emulated
  injection, everything is in a consistent state already, so no need
  to undo anything (it's also impossible, since we overwrote memory on
  the stack).

Aren't you going to push EFLAGS,CS,EIP on the stack twice if that
occurs?



No, since we clear the pending flag (we do that even for vmx-injected 
interrupts; then cancel or injection failure re-sets the flag).


--
error compiling committee.c: too many arguments to function



Re: [PATCH 4/9] msix: move definitions from msix.c to msix.h

2010-09-21 Thread Avi Kivity

 On 09/20/2010 06:56 PM, Michael S. Tsirkin wrote:

On Mon, Sep 20, 2010 at 05:06:45PM +0200, Avi Kivity wrote:
  This allows us to reuse them from the kvm support code.

  Signed-off-by: Avi Kivitya...@redhat.com

I would rather all dealings with the MSI-X table stayed in one place. All we
need is just the entry, so let's add APIs to retrieve MSIX address and
data:

uint64_t msix_get_address(dev, vector)
uint32_t msix_get_data(dev, vector)

and that will be enough for KVM.



Ok, will do.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 8/9] Protect qemu-kvm.h declarations with NEED_CPU_H

2010-09-21 Thread Avi Kivity

 On 09/20/2010 07:05 PM, Michael S. Tsirkin wrote:

On Mon, Sep 20, 2010 at 05:06:49PM +0200, Avi Kivity wrote:
  Target-specific definitions need to be qualified with NEED_CPU_H so kvm.h
  can be included from non-target-specific files.

  Signed-off-by: Avi Kivitya...@redhat.com

Long term, would be cleaner to split this into two files ...


Yes, this is a pain to deal with.


  ---
   kvm-stub.c |1 +
   qemu-kvm.h |   21 -
   2 files changed, 21 insertions(+), 1 deletions(-)

  diff --git a/kvm-stub.c b/kvm-stub.c
  index 37d2b7a..2e4bf00 100644
  --- a/kvm-stub.c
  +++ b/kvm-stub.c
  @@ -169,3 +169,4 @@ bool kvm_msix_notify(PCIDevice *dev, unsigned vector)
   {
   return false;
   }
  +

intentional?



Nope.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 0/9] msix/kvm integration cleanups

2010-09-21 Thread Avi Kivity

 On 09/20/2010 07:02 PM, Michael S. Tsirkin wrote:

On Mon, Sep 20, 2010 at 05:06:41PM +0200, Avi Kivity wrote:
  This cleans up msix/kvm integration a bit.  The really important patch is the
  last one, which allows msix.o to be part of non-target-specific build.

I actually thought this latter move should be done in a different way:
- add all functions msix uses to kvm-stub.c


Isn't that what I did?


- kvm_irq_routing_entry should also have a stub

I sent some minor comments in case you have a reason
to prefer this way.


My motivation is really the last patch.  If you explain what you'd like 
to see I'll try to do it.


--
error compiling committee.c: too many arguments to function



kvm networking todo wiki

2010-09-21 Thread Michael S. Tsirkin
I've put up a wiki page with a kvm networking todo list,
mainly to avoid effort duplication, but also in the hope
of drawing attention to what I think we should try addressing
in KVM:

http://www.linux-kvm.org/page/NetworkingTodo

This page could cover all networking related activity in KVM,
currently most info is related to virtio-net.

Note: if there's no developer listed for an item,
this just means I don't know of anyone actively working
on an issue at the moment, not that no one intends to.

I would appreciate it if others working on one of the items on this list
would add their names so we can communicate better.  If others like this
wiki page, please go ahead and add stuff you are working on if any.

It would be especially nice to add autotest projects:
there is just a short test matrix and a catch-all
'Cover test matrix with autotest', currently.

Currently there are some links to Red Hat bugzilla entries,
feel free to add links to other bugzillas.

Thanks!

-- 
MST


Re: [PATCH 0/9] msix/kvm integration cleanups

2010-09-21 Thread Michael S. Tsirkin
On Tue, Sep 21, 2010 at 06:05:10PM +0200, Avi Kivity wrote:
  On 09/20/2010 07:02 PM, Michael S. Tsirkin wrote:
 On Mon, Sep 20, 2010 at 05:06:41PM +0200, Avi Kivity wrote:
   This cleans up msix/kvm integration a bit.  The really important patch is 
  the
   last one, which allows msix.o to be part of non-target-specific build.
 
 I actually thought this latter move should be done in a different way:
 - add all functions msix uses to kvm-stub.c
 
 Isn't that what I did?
 
 - kvm_irq_routing_entry should also have a stub
 
 I sent some minor comments in case you have a reason
 to prefer this way.
 
 My motivation is really the last patch.  If you explain what you'd
 like to see I'll try to do it.

Basically my idea was to avoid all ifdefs in msix.c
*without changing it*, by stubbing out kvm APIs
and structures we use there.

-- 
MST


[PATCH 2/2] KVM: cpu_relax() during spin waiting for reboot

2010-09-21 Thread Avi Kivity
It doesn't really matter, but if we spin, we should spin in a more relaxed
manner.  This way, if something goes wrong at least it won't contribute to
global warming.

Signed-off-by: Avi Kivity a...@redhat.com
---
 virt/kvm/kvm_main.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c7a57b4..b8499f5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2022,7 +2022,7 @@ asmlinkage void kvm_handle_fault_on_reboot(void)
/* spin while reset goes on */
local_irq_enable();
while (true)
-   ;
+   cpu_relax();
}
/* Fault while not rebooting.  We want the trace. */
BUG();
-- 
1.7.2.3



[PATCH 0/2] Fix reboot on Intel hosts

2010-09-21 Thread Avi Kivity
For a while now (how long?), reboots with active guests have been broken
on Intel hosts. This patch set fixes the problem.

Avi Kivity (2):
  KVM: Fix reboot on Intel hosts
  KVM: cpu_relax() during spin waiting for reboot

 virt/kvm/kvm_main.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

-- 
1.7.2.3



[PATCH 1/2] KVM: Fix reboot on Intel hosts

2010-09-21 Thread Avi Kivity
When we reboot, we disable vmx extensions or otherwise INIT gets blocked.
If a task on another cpu hits a vmx instruction, it will fault if vmx is
disabled.  We trap that to avoid a nasty oops and spin until the reboot
completes.

Problem is, we sleep with interrupts disabled.  This blocks smp_send_stop()
from running, and the reboot process halts.

Fix by enabling interrupts before spinning.

KVM-Stable-Tag.
Signed-off-by: Avi Kivity a...@redhat.com
---
 virt/kvm/kvm_main.c |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9a73b98..c7a57b4 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2018,10 +2018,12 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
 
 asmlinkage void kvm_handle_fault_on_reboot(void)
 {
-   if (kvm_rebooting)
+   if (kvm_rebooting) {
/* spin while reset goes on */
+   local_irq_enable();
while (true)
;
+   }
/* Fault while not rebooting.  We want the trace. */
BUG();
 }
-- 
1.7.2.3



KVM call minutes for Sept 21

2010-09-21 Thread Chris Wright
Nested VMX
- looking for forward progress and better collaboration between the
  Intel and IBM teams
- needs more review (not a new issue)
- use cases
- work todo
  - merge baseline patch
- looks pretty good
- review is finding mostly small things at this point
- need some correctness verification (both review from Intel and testing)
  - need a test suite
- test suite harness will help here
  - a few dozen nested SVM tests are there, can follow for nested VMX
  - nested EPT
  - optimize (reduce vmreads and vmwrites)
- has long-term maintenance implications

Hotplug
- command...guest may or may not respond
- guest can't be trusted to be direct part of request/response loop
- solve at QMP level
- human monitor issues (multiple successive commands to complete a
  single unplug)
  - should be a GUI interface design decision, human monitor is not a
good design point
- digression into GUI interface

Drive caching
- need to formalize the meanings in terms of data integrity guarantees
- guest write cache (does it directly reflect the host write cache?)
  - live migration, underlying block dev changes, so need to decouple the two
- O_DIRECT + O_DSYNC
  - O_DSYNC needed based on whether disk cache is available
  - also issues with sparse files (e.g. O_DIRECT to unallocated extent)
  - how to manage w/out needing to flush every write, slow
- perhaps start with O_DIRECT on raw, non-sparse files only?
- backend needs to open backing store matching to guests disk cache state
- O_DIRECT itself has inconsistent integrity guarantees
  - works well with fully allocated file, dependent on disk cache disable
(or fs specific flushing)
- filesystem specific warnings (ext4 w/ barriers on, btrfs)
- need to be able to open w/ O_DSYNC depending on guest's write cache mode
- make write cache visible to guest (need a knob for this)
- qemu default is cache=writethrough, do we need to revisit that?
- just present user with option whether or not to use host page cache
- allow guest OS to choose disk write cache setting
  - set up host backend accordingly
- would be nice to preserve write cache settings over boot (outgrowing cmos storage)
- maybe some host fs-level optimization possible
  - e.g. O_DSYNC to allocated O_DIRECT extent becomes no-op
- conclusion
  - one direct user tunable, use host page cache or not
  - one guest OS tunable, enable disk cache


Re: [KVM timekeeping fixes 4/4] TSC catchup mode

2010-09-21 Thread Marcelo Tosatti
On Mon, Sep 20, 2010 at 03:11:30PM -1000, Zachary Amsden wrote:
 On 09/20/2010 05:38 AM, Marcelo Tosatti wrote:
 On Sat, Sep 18, 2010 at 02:38:15PM -1000, Zachary Amsden wrote:
 Negate the effects of AN TYM spell while kvm thread is preempted by tracking
 conversion factor to the highest TSC rate and catching the TSC up when it 
 has
 fallen behind the kernel view of time.  Note that once triggered, we don't
 turn off catchup mode.
 
 A slightly more clever version of this is possible, which only does catchup
 when TSC rate drops, and which specifically targets only CPUs with broken
 TSC, but since these all are considered unstable_tsc(), this patch covers
 all necessary cases.
 
 Signed-off-by: Zachary Amsdenzams...@redhat.com
 ---
   arch/x86/include/asm/kvm_host.h |6 +++
   arch/x86/kvm/x86.c  |   87 
  +-
   2 files changed, 72 insertions(+), 21 deletions(-)
 
 diff --git a/arch/x86/include/asm/kvm_host.h 
 b/arch/x86/include/asm/kvm_host.h
 index 8c5779d..e209078 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -384,6 +384,9 @@ struct kvm_vcpu_arch {
 u64 last_host_tsc;
 u64 last_guest_tsc;
 u64 last_kernel_ns;
 +   u64 last_tsc_nsec;
 +   u64 last_tsc_write;
 +   bool tsc_catchup;
 
 bool nmi_pending;
 bool nmi_injected;
 @@ -444,6 +447,9 @@ struct kvm_arch {
 u64 last_tsc_nsec;
 u64 last_tsc_offset;
 u64 last_tsc_write;
 +   u32 virtual_tsc_khz;
 +   u32 virtual_tsc_mult;
 +   s8 virtual_tsc_shift;
 
 struct kvm_xen_hvm_config xen_hvm_config;
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 09f468a..9152156 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -962,6 +962,7 @@ static inline u64 get_kernel_ns(void)
   }
 
   static DEFINE_PER_CPU(unsigned long, cpu_tsc_khz);
 +unsigned long max_tsc_khz;
 
   static inline int kvm_tsc_changes_freq(void)
   {
 @@ -985,6 +986,24 @@ static inline u64 nsec_to_cycles(u64 nsec)
 return ret;
   }
 
 +static void kvm_arch_set_tsc_khz(struct kvm *kvm, u32 this_tsc_khz)
 +{
 +	/* Compute a scale to convert nanoseconds in TSC cycles */
 +	kvm_get_time_scale(this_tsc_khz, NSEC_PER_SEC / 1000,
 +			   &kvm->arch.virtual_tsc_shift,
 +			   &kvm->arch.virtual_tsc_mult);
 +	kvm->arch.virtual_tsc_khz = this_tsc_khz;
 +}
 +
 +static u64 compute_guest_tsc(struct kvm_vcpu *vcpu, s64 kernel_ns)
 +{
 +	u64 tsc = pvclock_scale_delta(kernel_ns - vcpu->arch.last_tsc_nsec,
 +				      vcpu->kvm->arch.virtual_tsc_mult,
 +				      vcpu->kvm->arch.virtual_tsc_shift);
 +	tsc += vcpu->arch.last_tsc_write;
 +	return tsc;
 +}
 +
   void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
   {
 	struct kvm *kvm = vcpu->kvm;
 @@ -1029,6 +1048,8 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
 
 /* Reset of TSC must disable overshoot protection below */
 	vcpu->arch.hv_clock.tsc_timestamp = 0;
 +	vcpu->arch.last_tsc_write = data;
 +	vcpu->arch.last_tsc_nsec = ns;
   }
   EXPORT_SYMBOL_GPL(kvm_write_tsc);
 
 @@ -1041,22 +1062,42 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 s64 kernel_ns, max_kernel_ns;
 u64 tsc_timestamp;
 
 -	if ((!vcpu->time_page))
 -   return 0;
 -
 /* Keep irq disabled to prevent changes to the clock */
 local_irq_save(flags);
 	kvm_get_msr(v, MSR_IA32_TSC, &tsc_timestamp);
 kernel_ns = get_kernel_ns();
 this_tsc_khz = __get_cpu_var(cpu_tsc_khz);
 -   local_irq_restore(flags);
 
 if (unlikely(this_tsc_khz == 0)) {
 +   local_irq_restore(flags);
 kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
 return 1;
 }
 
 /*
 +* We may have to catch up the TSC to match elapsed wall clock
 +* time for two reasons, even if kvmclock is used.
 +*   1) CPU could have been running below the maximum TSC rate
 kvmclock handles frequency changes?
 
 +*   2) Broken TSC compensation resets the base at each VCPU
 +*  entry to avoid unknown leaps of TSC even when running
 +*  again on the same CPU.  This may cause apparent elapsed
 +*  time to disappear, and the guest to stand still or run
 +*  very slowly.
 I don't get this. Please explain.
 
 This compensation in arch_vcpu_load, for unstable TSC case, causes
 time while preempted to disappear from the TSC by adjusting the TSC
 back to match the last observed TSC.
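
For reference, the fixed-point arithmetic in the patch's compute_guest_tsc()
can be sketched in user space. This is a sketch of the kernel's 32.32
pvclock scaling as I understand it, not the exact kernel code; the 1 GHz
mult/shift values below are purely illustrative:

```python
def pvclock_scale_delta(delta, mul_frac, shift):
    """Scale a delta by a 32.32 fixed-point fraction, like the kernel's
    pvclock_scale_delta(): apply the shift first, then multiply and drop
    the fractional 32 bits."""
    if shift < 0:
        delta >>= -shift
    else:
        delta <<= shift
    return (delta * mul_frac) >> 32

def compute_guest_tsc(kernel_ns, last_tsc_nsec, last_tsc_write, mult, shift):
    """Project the guest TSC forward from the last observed TSC write,
    at the virtual TSC rate (mirrors the patch's compute_guest_tsc())."""
    return pvclock_scale_delta(kernel_ns - last_tsc_nsec, mult, shift) \
           + last_tsc_write

# A 1 GHz virtual TSC is one cycle per nanosecond: mult = 2**31, shift = 1.
# 3000 ns elapsed since the last TSC write of 100 -> guest TSC 3100.
print(compute_guest_tsc(kernel_ns=5000, last_tsc_nsec=2000,
                        last_tsc_write=100, mult=2**31, shift=1))  # prints 3100
```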
 
 	if (unlikely(vcpu->cpu != cpu) || check_tsc_unstable()) {
 		/* Make sure TSC doesn't go backwards */
 		s64 tsc_delta = !vcpu->arch.last_host_tsc ? 0 :
 				native_read_tsc() -
 				vcpu->arch.last_host_tsc;
 		if (tsc_delta < 0)
 			mark_tsc_unstable("KVM discovered backwards TSC");
 		if (check_tsc_unstable())
 			kvm_x86_ops->adjust_tsc_offset(vcpu, -tsc_delta);
 
 Note that this is the 

Re: KVM call minutes for Sept 21

2010-09-21 Thread Anthony Liguori

On 09/21/2010 01:05 PM, Chris Wright wrote:

Nested VMX
- looking for forward progress and better collaboration between the
   Intel and IBM teams
- needs more review (not a new issue)
- use cases
- work todo
   - merge baseline patch
 - looks pretty good
 - review is finding mostly small things at this point
 - need some correctness verification (both review from Intel and testing)
   - need a test suite
 - test suite harness will help here
   - a few dozen nested SVM tests are there, can follow for nested VMX
   - nested EPT
   - optimize (reduce vmreads and vmwrites)
- has long term maintenance

Hotplug
- command...guest may or may not respond
- guest can't be trusted to be direct part of request/response loop
- solve at QMP level
- human monitor issues (multiple successive commands to complete a
   single unplug)
   - should be a GUI interface design decision, human monitor is not a
 good design point
 - digression into GUI interface
   


The way this works IRL is:

1) Administrator presses a physical button.  This sends an ACPI 
notification to the guest.


2) The guest makes a decision about how to handle the ACPI notification.

3) To initiate unplug, the guest disables the device and performs an 
operation to indicate to the PCI bus that the device is unloaded.


4) Step (3) causes an LED (usually near the button in 1) to change colors

5) Administrator then physically removes the device.

So we need at least a QMP command to perform step (1).  Since (3) can 
occur independently of (1), it should be an async notification.  
device_del should only perform step (5).


A management tool needs to:

pci_unplug_request slot
/* wait for PCI_UNPLUGGED event */
device_del slot
netdev_del backend
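
As a sketch, the sequence the management tool drives might look like the
following. Note that pci_unplug_request and the PCI_UNPLUGGED event are
names proposed in this thread, not existing QMP commands, and the slot and
backend ids are made up:

```python
import json

def qmp(execute, **arguments):
    """Build a QMP command in wire format: one JSON object per command."""
    cmd = {"execute": execute}
    if arguments:
        cmd["arguments"] = arguments
    return json.dumps(cmd)

# Step 1: "press the button" - ask the guest (via ACPI) to release the device
print(qmp("pci_unplug_request", slot="slot3"))

# ... wait for the async PCI_UNPLUGGED event, i.e. the guest has completed
# step 3 and the virtual "LED" has changed ...

# Step 5: physically remove the device, then tear down the host backend
print(qmp("device_del", id="slot3"))
print(qmp("netdev_del", id="net0"))
```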


Drive caching
- need to formalize the meanings in terms of data integrity guarantees
- guest write cache (does it directly reflect the host write cache?)
   - live migration, underlying block dev changes, so need to decouple the two
- O_DIRECT + O_DSYNC
   - O_DSYNC needed based on whether disk cache is available
   - also issues with sparse files (e.g. O_DIRECT to unallocated extent)
   - how to manage w/out needing to flush every write, slow
- perhaps start with O_DIRECT on raw, non-sparse files only?
- backend needs to open backing store matching the guest's disk cache state
- O_DIRECT itself has inconsistent integrity guarantees
   - works well with fully allocated file, dependent on disk cache disable
 (or fs specific flushing)
- filesystem specific warnings (ext4 w/ barriers on, btrfs)
- need to be able to open w/ O_DSYNC depending on guest's write cache mode
- make write cache visible to guest (need a knob for this)
- qemu default is cache=writethrough, do we need to revisit that?
- just present user with option whether or not to use host page cache
- allow guest OS to choose disk write cache setting
   - set up host backend accordingly
- be nice to preserve write cache settings over boot (outgrowing cmos storage)
- maybe some host fs-level optimization possible
   - e.g. O_DSYNC to allocated O_DIRECT extent becomes no-op
- conclusion
   - one direct user tunable, use host page cache or not
   - one guest OS tunable, enable disk cache
   


IOW, a qdev 'write-cache=on|off' property and a blockdev 'direct=on|off' 
property.  For completeness, a blockdev 'unsafe=on|off' property.


Open flags are:

write-cache=on,  direct=on     O_DIRECT
write-cache=off, direct=on     O_DIRECT | O_DSYNC
write-cache=on,  direct=off    0
write-cache=off, direct=off    O_DSYNC

It's still unclear what our default mode will be.
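
A minimal sketch of that mapping, using symbolic flag names so it runs
anywhere (a real backend would OR together the fcntl O_* constants when
opening the image):

```python
def open_flags(write_cache, direct):
    """Map the two tunables to open(2) flags, per the table above."""
    flags = []
    if direct:
        flags.append("O_DIRECT")   # bypass the host page cache
    if not write_cache:
        flags.append("O_DSYNC")    # no guest-visible write cache: every
                                   # write must reach stable storage
    return flags or ["0"]

for wc in (True, False):
    for d in (True, False):
        print("write-cache=%s, direct=%s: %s"
              % ("on" if wc else "off", "on" if d else "off",
                 " | ".join(open_flags(wc, d))))
```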

The problem is, O_DSYNC has terrible performance on ext4 when barrier=1.

write-cache=on,direct=off is a bad default because if you do a simple 
performance test, you'll get better than native and that upsets people.


write-cache=off,direct=off is a bad default because ext4's default 
config sucks with this.


likewise, write-cache=off, direct=on is a bad default for the same reason.

Regards,

Anthony Liguori




Re: [PATCH] virtio-blk: put request that was created to retrieve the device id

2010-09-21 Thread Christoph Hellwig
On Fri, Sep 17, 2010 at 09:58:48AM -0500, Ryan Harper wrote:
 Since __bio_map_kern() sets up bio-bi_end_io = bio_map_kern_endio
 (which does a bio_put(bio)) doesn't that ensure we don't leak?

Indeed, that should take care of it.



USB Host Passthrough BSOD on Windows XP

2010-09-21 Thread mattia.martine...@gmail.com
Hi.
I installed a KVM virtual machine with Windows XP SP3 installed on it
with all updates from Windows Update.
I set up a USB device from the host machine to be used on the
virtual machine with the command

qm set 107 -hostusb 2040:7070

The USB device is a Hauppauge WinTV Nova-T Stick DVB-T USB adapter.

Windows recognises the hardware and correctly installs its drivers, but
when I try to use it (for example tuning some channels) I get the
following Blue Screen Of Death:

DRIVER_IRQL_NOT_LESS_OR_EQUAL
*** STOP: 0x00D1 (0x048C4C04, 0x0002, 0x0001, 0xBA392FD3)
*** usbuhci.sys - Address BA392FD3 base at BA39, DateStamp 480254ce

Windows' Minidump files indicate that the problem is in the usbuhci.sys driver.

I'm using Proxmox VE 1.6 (the latest version) with the 2.6.32-2-pve
kernel version.

Do you have any hint?

Thank you very much for your help!
Bye.


Re: KVM call minutes for Sept 21

2010-09-21 Thread Nadav Har'El
Hi, thanks for the summary.
I also listened-in on the call. I'm glad these issues are being discussed.

On Tue, Sep 21, 2010, Chris Wright wrote about KVM call minutes for Sept 21:
 Nested VMX
 - looking for forward progress and better collaboration between the
   Intel and IBM teams

I'll be very happy if anyone, be it from Intel or somewhere else, would like
to help me work on nested VMX.

Somebody (I don't recognize your voices yet, sorry...) mentioned on the call
that there might not be much point in cooperation before I finish getting
nested VMX merged into KVM. I agree, but my conclusion is different from what
I think the speaker implied: My conclusion is that it is important that we
merge the nested VMX code into KVM as soon as possible, because if nested VMX
is part of KVM (and not a set of patches which becomes stale the moment after
I release it) this will make it much easier for people to test it, use it,
and cooperate in developing it.

 - needs more review (not a new issue)

I think the reviews that nested VMX has received over the past year (thanks
to Avi Kivity, Gleb Natapov, Eddie Dong and sometimes others), have been
fantastic. You guys have shown deep understanding of the code, and found
numerous bugs, oversights, missing features, and also a fair share of ugly
code, and we (first Orit and Abel, and then I) have done our best to fix all
of these issues. I've personally learned a lot from the latest round of
reviews, and the discussions with you.

So I don't think there has been any lack of reviews. I don't think that
getting more reviews is the most important task ahead of us.

Surely, if more people review the code, more potential bugs will be spotted.
But this is always the case, with any software. I think the question now
is, what would it take to finally declare the code as good enough to be
merged, with the understanding that even after being merged it will still be
considered an experimental feature, disabled by default and documented as
experimental. Nested SVM was also merged before it was perfect, and also
KVM itself was released before being perfect :-)

 - use cases

I don't kid myself that as soon as nested VMX is available in KVM, millions
of users worldwide will flock to use it. Definitely, many KVM users will never
find a need for nested virtualization. But I do believe that there are many
use cases. We outlined some of them in our paper (to be presented in a couple
of weeks in OSDI):

  1. Hosting one of the new breed of operating systems which have a hypervisor
 as part of them. Windows 7 with XP mode is one example. Linux with KVM
 is another.

  2. Platforms with embedded hypervisors in firmware need nested virt to
 run any workload - which can itself be a hypervisor with guests.

  3. Clouds users could put in their virtual machine a hypervisor with
 sub-guests, and run multiple virtual machines on the one virtual machine
 which they get.

  4. Enable live migration of entire hypervisors with their guests - for
 load balancing, disaster recovery, and so on.

  5. Honeypots and protection against hypervisor-level rootkits

  6. Make it easier to test, demonstrate, benchmark and debug hypervisors,
 and also entire virtualization setups. An entire virtualization setup
 (hypervisor and all its guests) could be run as one virtual machine,
 allowing testing many such setups on one physical machine.

By the way, I find the question of why do we need nested VMX a bit odd,
seeing that KVM already supports nested virtualization (for SVM). Is it the
case that nested virtualization was found useful on AMD processors, but for
Intel processors, it isn't? Of course not :-) I think KVM should support
nested virtualization on neither architecture, or on both - and of course
I think it should be on both :-)

 - work todo
   - merge baseline patch
 - looks pretty good
 - review is finding mostly small things at this point
 - need some correctness verification (both review from Intel and testing)
   - need a test suite
 - test suite harness will help here
   - a few dozen nested SVM tests are there, can follow for nested VMX
   - nested EPT

I've been keeping track of the issues remaining from the last review, and
indeed only a few remain. Only 8 of the 24 patches have any outstanding
issue, and I'm working on those that remain, as you could see on the mailing
list in the last couple of weeks. If there's interest, I can even summarize
these remaining issues.

But since I'm working on these patches alone, I think we need to define our
priorities. Most of the outstanding review comments, while absolutely correct
(and I was amazed by the quality of the reviewer's comments), deal with
re-writing code that already works (to improve its style) or fixing relatively
rare cases. It is not clear that these issues are more important than the
other things listed in the summary above (test suite, nested EPT), but as
long as I continue to rewrite 

buildbot failure in qemu-kvm on default_i386_debian_5_0

2010-09-21 Thread qemu-kvm
The Buildbot has detected a new failure of default_i386_debian_5_0 on qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_debian_5_0/builds/573

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason: The Nightly scheduler named 'nightly_default' triggered this build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



buildbot failure in qemu-kvm on default_i386_out_of_tree

2010-09-21 Thread qemu-kvm
The Buildbot has detected a new failure of default_i386_out_of_tree on qemu-kvm.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_out_of_tree/builds/510

Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/

Buildslave for this Build: b1_qemu_kvm_2

Build Reason: The Nightly scheduler named 'nightly_default' triggered this build
Build Source Stamp: [branch master] HEAD
Blamelist: 

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



Re: KVM call minutes for Sept 21

2010-09-21 Thread Chris Wright
* Nadav Har'El (n...@math.technion.ac.il) wrote:
 On Tue, Sep 21, 2010, Chris Wright wrote about KVM call minutes for Sept 21:
  Nested VMX
  - looking for forward progress and better collaboration between the
Intel and IBM teams
 
 I'll be very happy if anyone, be it from Intel or somewhere else, would like
 to help me work on nested VMX.
 
 Somebody (I don't recognize your voices yet, sorry...) mentioned on the call
 that there might not be much point in cooperation before I finish getting
 nested VMX merged into KVM.

My recollection...it was Avi.

 I agree, but my conclusion is different from what
 I think the speaker implied: My conclusion is that it is important that we
 merge the nested VMX code into KVM as soon as possible, because if nested VMX
 is part of KVM (and not a set of patches which becomes stale the moment after
 I release it) this will make it much easier for people to test it, use it,
 and cooperate in developing it.

Yup.  And especially for follow-on work (like nested EPT).  Makes sense
to merge and build from merged base rather than have out-of-tree patchset
continue to grow and grow.

  - needs more review (not a new issue)
 
 I think the reviews that nested VMX has received over the past year (thanks
 to Avi Kivity, Gleb Natapov, Eddie Dong and sometimes others), have been
 fantastic. You guys have shown deep understanding of the code, and found
 numerous bugs, oversights, missing features, and also a fair share of ugly
 code, and we (first Orit and Abel, and then I) have done our best to fix all
 of these issues. I've personally learned a lot from the latest round of
 reviews, and the discussions with you.
 
 So I don't think there has been any lack of reviews. I don't think that
 getting more reviews is the most important task ahead of us.

At earlier points of review there were issues considered fundamental
that needed to be fixed before merging (SMP and proper VMPTRLD emulation
springs to mind).  Now it seems it's down to smaller, more targeted
issues.  Some hesitancy is based on the complexity of the patches.
So more review helps...test harness does too.  Anything to build Avi's
confidence to merging the code ;)

 Surely, if more people review the code, more potential bugs will be spotted.
 But this is always the case, with any software. I think the question now
 is, what would it take to finally declare the code as good enough to be
 merged, with the understanding that even after being merged it will still be
 considered an experimental feature, disabled by default and documented as
 experimental. Nested SVM was also merged before it was perfect, and also
 KVM itself was released before being perfect :-)

;)

  - use cases
 
 I don't kid myself that as soon as nested VMX is available in KVM, millions
 of users worldwide will flock to use it. Definitely, many KVM users will never
 find a need for nested virtualization. But I do believe that there are many
 use cases. We outlined some of them in our paper (to be presented in a couple
 of weeks in OSDI):
 
   1. Hosting one of the new breed of operating systems which have a hypervisor
  as part of them. Windows 7 with XP mode is one example. Linux with KVM
  is another.
 
   2. Platforms with embedded hypervisors in firmware need nested virt to
  run any workload - which can itself be a hypervisor with guests.
 
   3. Clouds users could put in their virtual machine a hypervisor with
  sub-guests, and run multiple virtual machines on the one virtual machine
  which they get.
 
   4. Enable live migration of entire hypervisors with their guests - for
  load balancing, disaster recovery, and so on.
 
   5. Honeypots and protection against hypervisor-level rootkits
 
   6. Make it easier to test, demonstrate, benchmark and debug hypervisors,
  and also entire virtualization setups. An entire virtualization setup
  (hypervisor and all its guests) could be run as one virtual machine,
  allowing testing many such setups on one physical machine.
 
 By the way, I find the question of why do we need nested VMX a bit odd,
 seeing that KVM already supports nested virtualization (for SVM). Is it the
 case that nested virtualization was found useful on AMD processors, but for
 Intel processors, it isn't? Of course not :-) I think KVM should support
 nested virtualization on neither architecture, or on both - and of course
 I think it should be on both :-)

People keep looking for reasons to justify the cost of the effort, dunno
why because it's cool isn't good enough ;)  At any rate, that was mainly
a question of how it might be useful for production kind of environments.

  - work todo
- merge baseline patch
  - looks pretty good
  - review is finding mostly small things at this point
  - need some correctness verification (both review from Intel and 
  testing)
- need a test suite
  - test suite harness will help here
- a few dozen nested SVM tests are there, can