Re: [Qemu-devel] [PATCH] Add definitions for current cpu models..

2010-01-25 Thread Dor Laor

On 01/21/2010 05:05 PM, Anthony Liguori wrote:

On 01/20/2010 07:18 PM, john cooper wrote:

Chris Wright wrote:

* Daniel P. Berrange (berra...@redhat.com) wrote:

To be honest all possible naming schemes for '-cpuname' are just as
unfriendly as each other. The only user friendly option is '-cpu host'.

IMHO, we should just pick a concise naming scheme and document it. Given
they are all equally unfriendly, the one that has consistency with vmware
naming seems like a mild winner.

Heh, I completely agree, and was just saying the same thing to John
earlier today. May as well be -cpu {foo,bar,baz} since the meaning for
those command line options must be well-documented in the man page.

I can appreciate the concern of wanting to get this
as correct as possible.


This is the root of the trouble. At the qemu layer, we try to focus on
being correct.

Management tools are typically the layer that deals with being correct.

A good compromise is making things user tunable which means that a
downstream can make correctness decisions without forcing those
decisions on upstream.

In this case, the idea would be to introduce a new option, say something
like -cpu-def. The syntax would be:

-cpu-def name=coreduo,level=10,family=6,model=14,stepping=8,features=+vme+mtrr+clflush+mca+sse3+monitor,xlevel=0x8008,model_id=Genuine Intel(R) CPU T2600 @ 2.16GHz

Which is not that exciting since it just lets you do -cpu coreduo in a
much more complex way. However, if we take advantage of the current
config support, you can have:

[cpu-def]
name=coreduo
level=10
family=6
model=14
stepping=8
features=+vme+mtrr+clflush+mca+sse3..
model_id=Genuine Intel...

And that can be stored in a config file. We should then parse
/etc/qemu/target-targetname.conf by default. We'll move the current
x86_defs table into this config file and then downstreams/users can
define whatever compatibility classes they want.

With this feature, I'd be inclined to take correct compatibility
classes like Nehalem as part of the default qemurc that we install
because it's easily overridden by a user. It then becomes just a
suggestion on our part versus a guarantee.

It should just be a matter of adding qemu_cpudefs_opts to
qemu-config.[ch], taking a new command line that parses the argument via
QemuOpts, then passing the parsed options to a target-specific function
that then builds the table of supported cpus.
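
The parsing step described above can be sketched in a few lines of Python. This is only an illustration, not QEMU's QemuOpts code: the function name `parse_cpu_defs`, the `INT_KEYS` set, and the decision to split `features` on `+` are assumptions made for the example. Only the `[cpu-def]` section layout is taken from the message.

```python
# Hypothetical illustration only: this is not QEMU's QemuOpts parser.
INT_KEYS = {"level", "family", "model", "stepping", "xlevel"}


def parse_cpu_defs(text):
    """Parse repeated [cpu-def] sections into a dict keyed by CPU name."""
    defs = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line == "[cpu-def]":
            current = {}            # start a new definition
            continue
        if current is None:
            raise ValueError("option outside a [cpu-def] section: %r" % line)
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % line)
        key, value = key.strip(), value.strip()
        if key in INT_KEYS:
            value = int(value, 0)   # base 0 accepts both 10 and 0x8008
        elif key == "features":
            value = [f for f in value.split("+") if f]
        current[key] = value
        if key == "name":
            defs[value] = current   # later keys still land in the same dict
    return defs


SAMPLE = """
[cpu-def]
name=coreduo
level=10
family=6
model=14
stepping=8
features=+vme+mtrr+clflush+mca+sse3
"""

cpus = parse_cpu_defs(SAMPLE)
```

A section without a `name` key is silently dropped in this sketch; a real implementation would probably want to reject it.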


Won't the outcome of John's patches and these configs be exactly
the same? Since these cpu models won't ever change, there is no reason
not to hard-code them. Adding configs or command lines is a good
idea, but it is friendlier to have basic support for the common cpus.

This is why qemu today offers: -cpu ?
x86   qemu64
x86   phenom
x86 core2duo
x86    kvm64
x86   qemu32
x86  coreduo
x86      486
x86  pentium
x86 pentium2
x86 pentium3
x86   athlon
x86     n270

So bottom line, my point is to have John's base + your configs. We also
need to keep the check verb and the migration support for sending those.


btw: IMO we should deal with this complexity ourselves and save 99.9% of 
the users the need to define such models, don't ask this from a java 
programmer, he is running on a JVM :-)





Regards,

Anthony Liguori
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html




[ kvm-Bugs-2939216 ] Yellow bang on Win2008 when loading a network virtual func

2010-01-25 Thread SourceForge.net
Bugs item #2939216, was opened at 2010-01-25 11:48
Message generated for change (Tracker Item Submitted) made by llevy
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2939216&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Lior Levy (llevy)
Assigned to: Nobody/Anonymous (nobody)
Summary: Yellow bang on Win2008 when loading a network virtual func

Initial Comment:
A network virtual device cannot load on Windows 2008. The network card is:
Intel Corporation 82576 Gigabit Network Connection
The error reported is: "The device cannot find enough free resources that it
can use (Code 12)."

System information:
- CPU - Intel(R) Xeon(R) CPU (2 sockets of quad core)
- Host OS - RHEL 5.4 - Linux 2.6.18-160.el5 #1 SMP Mon Jul 27 17:28:29 EDT 2009 
x86_64 x86_64 x86_64 GNU/Linux
- The issue happens also on a newer kernel - 2.6.30.2
- Guest OS - Windows 2008 - Windows Server(R) Enterprise SP1 - 32 bit
- KVM info 
filename:   /lib/modules/2.6.18-160.el5/weak-updates/kmod-kvm/kvm.ko
version:    kvm-83-101.el5
srcversion: 45FED1544C648ADF0C59E7E
vermagic:   2.6.18-159.el5 SMP mod_unload gcc-4.1
- QEMU info - QEMU PC emulator version 0.12.1 (qemu-kvm-0.12.1), Copyright (c) 
2003-2008 Fabrice Bellard

Steps to reproduce:
(1) Put a network card -- Intel Corporation 82576 Gigabit Network Connection -- 
into the system
(2) Load igb driver with flag to load a virtual function as well.
 insmod ./igb.ko max_vfs=2,2
(3) Bring the interface up
 ifconfig ethX up
(4) Ensure that a VF exists in PCI enumeration
 lspci | grep Eth
You should see a new virtual device like: 01:10.0 Ethernet controller: Intel 
Corporation 82576 Virtual Function (rev 01)
(5) Load the guest VM with this VF
 qemu-system-x86_64 win2008.qcow2 -m 2048 -pcidevice host=01:10.0 -net none 
(6) Install the latest Intel drivers for Win2008
Install application can be found here: 
http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&DwnldID=12197&lang=eng
(7) A network device, Intel(R) Virtual Network Connection, will appear in
Device Manager, but with a yellow bang.

Notes:
When loading RHEL5.4 as a guest OS with the same flow, the device works as 
expected.
The problem still exists when using the -no-kvm-irqchip or -no-kvm-pit switch.
When using -no-kvm switch, the device is not seen in Windows device manager

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2939216&group_id=180599


[PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Gleb Natapov
On HW task switch newly loaded segments should be marked as accessed.

Reported-by: Lorenzo Martignoni martig...@gmail.com
Signed-off-by: Gleb Natapov g...@redhat.com
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 47c6e23..b5a2a88 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4708,18 +4708,6 @@ static u16 get_segment_selector(struct kvm_vcpu *vcpu, int seg)
 	return kvm_seg.selector;
 }
 
-static int load_segment_descriptor_to_kvm_desct(struct kvm_vcpu *vcpu,
-						u16 selector,
-						struct kvm_segment *kvm_seg)
-{
-	struct desc_struct seg_desc;
-
-	if (load_guest_segment_descriptor(vcpu, selector, &seg_desc))
-		return 1;
-	seg_desct_to_kvm_desct(&seg_desc, selector, kvm_seg);
-	return 0;
-}
-
 static int kvm_load_realmode_segment(struct kvm_vcpu *vcpu, u16 selector, int seg)
 {
 	struct kvm_segment segvar = {
@@ -4760,11 +4748,14 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
 				int type_bits, int seg)
 {
 	struct kvm_segment kvm_seg;
+	struct desc_struct seg_desc;
 
 	if (is_vm86_segment(vcpu, seg) || !(kvm_read_cr0_bits(vcpu, X86_CR0_PE)))
 		return kvm_load_realmode_segment(vcpu, selector, seg);
-	if (load_segment_descriptor_to_kvm_desct(vcpu, selector, &kvm_seg))
+
+	if (load_guest_segment_descriptor(vcpu, selector, &seg_desc))
 		return 1;
+	seg_desct_to_kvm_desct(&seg_desc, selector, &kvm_seg);
 
 	kvm_check_segment_descriptor(vcpu, seg, selector);
 	kvm_seg.type |= type_bits;
@@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
 			kvm_seg.unusable = 1;
 
 	kvm_set_segment(vcpu, &kvm_seg, seg);
+	if (selector && !kvm_seg.unusable && kvm_seg.s) {
+		/* mark segment as accessed */
+		seg_desc.type |= 1;
+		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
+	}
 	return 0;
 }
 
--
Gleb.
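
For readers less familiar with the descriptor layout, here is a small Python model of what "mark segment as accessed" means at the bit level. It is a sketch, not the kernel code: `mark_accessed` and its parameters are hypothetical names, and the descriptor is treated as a plain 64-bit integer. The bit positions follow the x86 descriptor format, where the access byte occupies bits 40-47, the accessed flag is the lowest bit of the type field, and the S flag separates code/data segments from system descriptors.

```python
# Bit positions within a 64-bit x86 segment descriptor (access byte = bits 40-47).
ACCESSED_BIT = 1 << 40   # lowest bit of the 4-bit type field
S_BIT = 1 << 44          # 1 = code/data segment, 0 = system descriptor


def mark_accessed(selector, descriptor, unusable=False):
    """Return the descriptor with the accessed bit set.

    Mirrors the condition used in the patch: a non-null selector,
    a usable segment, and a code/data (S=1) descriptor.
    """
    if selector and not unusable and (descriptor & S_BIT):
        return descriptor | ACCESSED_BIT
    return descriptor


# A flat 4 GiB code segment with access byte 0x9A (present, S=1, not accessed).
flat_code = 0x00CF9A000000FFFF
# A 32-bit available TSS descriptor with access byte 0x89 (S=0, system type).
tss = 0x0000890000000067
```

In this model the write-back step corresponds to storing the returned value into the guest's descriptor table, which is what `save_guest_segment_descriptor()` does in the patch.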


Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Avi Kivity

On 01/25/2010 12:01 PM, Gleb Natapov wrote:

On HW task switch newly loaded segments should be marked as accessed.

@@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
 			kvm_seg.unusable = 1;
 
 	kvm_set_segment(vcpu, &kvm_seg, seg);
+	if (selector && !kvm_seg.unusable && kvm_seg.s) {
+		/* mark segment as accessed */
+		seg_desc.type |= 1;
+		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
+	}
 	return 0;
 }
   


What about an error return from s_g_s_d?

--
error compiling committee.c: too many arguments to function



Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Gleb Natapov
On Mon, Jan 25, 2010 at 01:08:13PM +0200, Avi Kivity wrote:
 On 01/25/2010 12:01 PM, Gleb Natapov wrote:
 On HW task switch newly loaded segments should be marked as accessed.
 
 @@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
  			kvm_seg.unusable = 1;
 
  	kvm_set_segment(vcpu, &kvm_seg, seg);
 +	if (selector && !kvm_seg.unusable && kvm_seg.s) {
 +		/* mark segment as accessed */
 +		seg_desc.type |= 1;
 +		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
 +	}
  	return 0;
  }
 
 What about an error return from s_g_s_d?
 
What can or should we do about it?

--
Gleb.


Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Avi Kivity

On 01/25/2010 01:11 PM, Gleb Natapov wrote:

On Mon, Jan 25, 2010 at 01:08:13PM +0200, Avi Kivity wrote:
   

On 01/25/2010 12:01 PM, Gleb Natapov wrote:
 

On HW task switch newly loaded segments should be marked as accessed.

@@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
 			kvm_seg.unusable = 1;
 
 	kvm_set_segment(vcpu, &kvm_seg, seg);
+	if (selector && !kvm_seg.unusable && kvm_seg.s) {
+		/* mark segment as accessed */
+		seg_desc.type |= 1;
+		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
+	}
 	return 0;
 }
   

What about an error return from s_g_s_d?

 

What can or should we do about it?

   


If -EFAULT, propagate to userspace.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Add definitions for current cpu models..

2010-01-25 Thread Jamie Lokier
Dor Laor wrote:
 x86   qemu64
 x86   phenom
 x86 core2duo
 x86    kvm64
 x86   qemu32
 x86  coreduo
 x86      486
 x86  pentium
 x86 pentium2
 x86 pentium3
 x86   athlon
 x86     n270

I wonder if kvm32 would be good, for symmetry if nothing else.

-- Jamie


RE: [PATCH] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest

2010-01-25 Thread Liu Yu-B13201
 

 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de] 
 Sent: Friday, January 22, 2010 7:33 PM
 To: Liu Yu-B13201
 Cc: hol...@penguinppc.org; kvm-...@vger.kernel.org; 
 kvm@vger.kernel.org
 Subject: Re: [PATCH] kvmppc/booke: Set ESR and DEAR when 
 inject interrupt to guest
 
 
 On 22.01.2010, at 12:27, Liu Yu-B13201 wrote:
 
  
  
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org 
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Friday, January 22, 2010 7:13 PM
  To: Liu Yu-B13201
  Cc: hol...@penguinppc.org; kvm-...@vger.kernel.org; 
  kvm@vger.kernel.org
  Subject: Re: [PATCH] kvmppc/booke: Set ESR and DEAR when 
  inject interrupt to guest
  
  
  On 22.01.2010, at 11:54, Liu Yu wrote:
  
  Old method prematurely sets ESR and DEAR.
  Move this part after we decide to inject interrupt,
  and make it behave more like hardware.
  
  Signed-off-by: Liu Yu yu@freescale.com
  ---
  arch/powerpc/kvm/booke.c   |   24 ++--
  arch/powerpc/kvm/emulate.c |2 --
  2 files changed, 14 insertions(+), 12 deletions(-)
  
  @@ -286,15 +295,12 @@ int kvmppc_handle_exit(struct kvm_run 
  *run, struct kvm_vcpu *vcpu,
break;
  
case BOOKE_INTERRUPT_DATA_STORAGE:
  - vcpu->arch.dear = vcpu->arch.fault_dear;
  - vcpu->arch.esr = vcpu->arch.fault_esr;
    kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_DATA_STORAGE);
  
  kvmppc_booke_queue_data_storage(vcpu, vcpu->arch.fault_esr, vcpu->arch.fault_dear);
  
    kvmppc_account_exit(vcpu, DSI_EXITS);
    r = RESUME_GUEST;
    break;
  
    case BOOKE_INTERRUPT_INST_STORAGE:
  - vcpu->arch.esr = vcpu->arch.fault_esr;
    kvmppc_booke_queue_irqprio(vcpu, BOOKE_IRQPRIO_INST_STORAGE);
  
  kvmppc_booke_queue_inst_storage(vcpu, vcpu->arch.fault_esr);
  
  
  Not sure if this is redundant, as we already have fault_esr.
  Or should we ignore what the hardware reports and create a new esr for the guest?
 
 On Book3S I take the SRR1 we get from the host as 
 inspiration of what to pass to the guest as SRR1. I think 
 we should definitely be able to inject a fault that we didn't 
 get in that exact form from the exit path.
 
 I'm also not sure if something could clobber fault_esr if 
 another interrupt takes precedence. Say a #MC.

Not as far as I know.
And if yes, the clobber could as well happen before we copy it.
Hollis, what do you think we should do here?
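
To make the ordering question concrete, here is a toy Python model of the approach under discussion: fault values are queued at exit time and copied into the guest-visible ESR and DEAR only when the interrupt is actually delivered. The class and method names (`Vcpu`, `queue_data_storage`, `deliver`) and the `can_deliver` flag are hypothetical and do not correspond to the kvmppc code.

```python
class Vcpu:
    """Toy model: guest-visible ESR/DEAR are updated only at delivery time."""

    def __init__(self):
        self.esr = 0          # guest-visible exception syndrome register
        self.dear = 0         # guest-visible data exception address register
        self.queued = None    # pending (fault_esr, fault_dear) pair, if any

    def queue_data_storage(self, fault_esr, fault_dear):
        # Remember the fault details without touching guest-visible state.
        self.queued = (fault_esr, fault_dear)

    def deliver(self, can_deliver):
        # Copy the saved values only once the interrupt is really injected.
        if self.queued is None or not can_deliver:
            return False
        self.esr, self.dear = self.queued
        self.queued = None
        return True


vcpu = Vcpu()
vcpu.queue_data_storage(fault_esr=0x00800000, fault_dear=0xC0001000)
```

Because the queued pair is a private copy, later changes to the fault registers would not affect what the guest eventually sees. Whether that is the desired behaviour is exactly the open question in this thread.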




Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Gleb Natapov
On Mon, Jan 25, 2010 at 01:12:36PM +0200, Avi Kivity wrote:
 On 01/25/2010 01:11 PM, Gleb Natapov wrote:
 On Mon, Jan 25, 2010 at 01:08:13PM +0200, Avi Kivity wrote:
 On 01/25/2010 12:01 PM, Gleb Natapov wrote:
 On HW task switch newly loaded segments should be marked as accessed.
 
 @@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
  			kvm_seg.unusable = 1;
 
  	kvm_set_segment(vcpu, &kvm_seg, seg);
 +	if (selector && !kvm_seg.unusable && kvm_seg.s) {
 +		/* mark segment as accessed */
 +		seg_desc.type |= 1;
 +		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
 +	}
  	return 0;
  }
 What about an error return from s_g_s_d?
 
 What can or should we do about it?
 
 
 If -EFAULT, propagate to userspace.
 
We don't handle it anywhere in task switch emulation. Separate patch?

--
Gleb.


Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Avi Kivity

On 01/25/2010 02:24 PM, Gleb Natapov wrote:

On Mon, Jan 25, 2010 at 01:12:36PM +0200, Avi Kivity wrote:
   

On 01/25/2010 01:11 PM, Gleb Natapov wrote:
 

On Mon, Jan 25, 2010 at 01:08:13PM +0200, Avi Kivity wrote:
   

On 01/25/2010 12:01 PM, Gleb Natapov wrote:
 

On HW task switch newly loaded segments should be marked as accessed.

@@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
 			kvm_seg.unusable = 1;
 
 	kvm_set_segment(vcpu, &kvm_seg, seg);
+	if (selector && !kvm_seg.unusable && kvm_seg.s) {
+		/* mark segment as accessed */
+		seg_desc.type |= 1;
+		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
+	}
 	return 0;
 }
   

What about an error return from s_g_s_d?

 

What can or should we do about it?

   

If -EFAULT, propagate to userspace.

 

We don't handle it anywhere in task switch emulation. Separate patch?

   


Things like 'return kvm_write_guest_virt()' do handle it.

--
error compiling committee.c: too many arguments to function



Re: [PATCH] KVM: VMX: Pass cr0.mp through to the guest when the fpu is active

2010-01-25 Thread Marcelo Tosatti
On Sun, Jan 24, 2010 at 04:26:40PM +0200, Avi Kivity wrote:
 When cr0.mp is clear, the guest doesn't expect a #NM in response to
 a WAIT instruction.  Because we always keep cr0.mp set, it will get
 a #NM, and potentially be confused.
 
 Fix by keeping cr0.mp set only when the fpu is inactive, and passing
 it through when active.
 
 Reported-by: Lorenzo Martignoni martig...@gmail.com
 Analyzed-by: Gleb Natapov g...@redhat.com
 Signed-off-by: Avi Kivity a...@redhat.com

Applied, thanks.
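
The effect of the fix can be illustrated with a small Python model. It is deliberately simplified and is not the VMX code: `hardware_cr0` and `wait_raises_nm` are hypothetical helpers. The bit positions are the architectural ones (CR0.MP is bit 1, CR0.TS is bit 3), and WAIT raises #NM only when both are set.

```python
CR0_MP = 1 << 1   # monitor coprocessor
CR0_TS = 1 << 3   # task switched


def hardware_cr0(guest_cr0, fpu_active):
    """CR0 value the guest effectively runs with in this simplified model."""
    if fpu_active:
        return guest_cr0                    # pass the guest's MP (and TS) through
    return guest_cr0 | CR0_TS | CR0_MP      # force traps while the fpu is inactive


def wait_raises_nm(cr0):
    """WAIT/FWAIT faults with #NM only when both MP and TS are set."""
    return bool(cr0 & CR0_MP) and bool(cr0 & CR0_TS)


# Guest that has TS set but MP clear: it does not expect WAIT to fault.
guest = CR0_TS
old_behaviour = guest | CR0_MP              # MP always forced on
new_behaviour = hardware_cr0(guest, fpu_active=True)
```

With the old behaviour the guest sees an unexpected #NM on WAIT; with the pass-through it does not.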



Re: [PATCH] KVM: VMX: Fix clts

2010-01-25 Thread Marcelo Tosatti
On Sun, Jan 24, 2010 at 12:17:23PM +0200, Avi Kivity wrote:
 The simplistic clts implementation has a couple of flaws:
 - kvm_read_cr0_bits() is temporarily unsynchronized when vcpu->arch.cr0
   changes
 - if the fpu is active, we need to clear GUEST_CR0.TS, not just
   CR_READ_SHADOW.TS, so that we don't send the guest an unexpected #NM.
 
 Fix by replacing custom logic with a call to vmx_set_cr0(), which does the
 right thing, albeit less efficiently.
 
 Signed-off-by: Avi Kivity a...@redhat.com

Applied, thanks.



[PATCH] RFC: alias rework

2010-01-25 Thread Izik Eidus
From f94dcd1ccabbcdb51ed7c37c5f58f00a5c1b7eec Mon Sep 17 00:00:00 2001
From: Izik Eidus iei...@redhat.com
Date: Mon, 25 Jan 2010 15:49:41 +0200
Subject: [PATCH] RFC: alias rework

This patch removes the old way of aliasing inside kvm
and moves to aliasing with the same virtual addresses.

This patch is really just an early RFC to find out whether you guys
like this direction; I need to clean up some parts of it
and test it more before I feel it is ready to be merged...

Comments are more than welcome.

Thanks.

Signed-off-by: Izik Eidus iei...@redhat.com
---
 arch/ia64/include/asm/kvm_host.h |1 +
 arch/ia64/kvm/kvm-ia64.c |5 --
 arch/powerpc/kvm/powerpc.c   |5 --
 arch/s390/include/asm/kvm_host.h |1 +
 arch/s390/kvm/kvm-s390.c |5 --
 arch/x86/include/asm/kvm_host.h  |   19 --
 arch/x86/include/asm/vmx.h   |6 +-
 arch/x86/kvm/mmu.c   |   19 ++-
 arch/x86/kvm/x86.c   |  114 +++--
 include/linux/kvm_host.h |   11 +--
 virt/kvm/kvm_main.c  |   80 +++---
 11 files changed, 107 insertions(+), 159 deletions(-)

diff --git a/arch/ia64/include/asm/kvm_host.h b/arch/ia64/include/asm/kvm_host.h
index a362e67..d5377c2 100644
--- a/arch/ia64/include/asm/kvm_host.h
+++ b/arch/ia64/include/asm/kvm_host.h
@@ -24,6 +24,7 @@
 #define __ASM_KVM_HOST_H
 
 #define KVM_MEMORY_SLOTS 32
+#define KVM_ALIAS_SLOTS 0
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
 
diff --git a/arch/ia64/kvm/kvm-ia64.c b/arch/ia64/kvm/kvm-ia64.c
index 0618898..3d2559e 100644
--- a/arch/ia64/kvm/kvm-ia64.c
+++ b/arch/ia64/kvm/kvm-ia64.c
@@ -1947,11 +1947,6 @@ int kvm_cpu_has_pending_timer(struct kvm_vcpu *vcpu)
 	return vcpu->arch.timer_fired;
 }
 
-gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
-{
-	return gfn;
-}
-
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
 	return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE) ||
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 51aedd7..50b7d5f 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -35,11 +35,6 @@
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
-gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
-{
-	return gfn;
-}
-
 int kvm_arch_vcpu_runnable(struct kvm_vcpu *v)
 {
 	return !(v->arch.msr & MSR_WE) || !!(v->arch.pending_exceptions);
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 27605b6..6a2112e 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -21,6 +21,7 @@
 
 #define KVM_MAX_VCPUS 64
 #define KVM_MEMORY_SLOTS 32
+#define KVM_ALIAS_SLOTS 0
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
 
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 8f09959..5d63f6b 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -741,11 +741,6 @@ void kvm_arch_flush_shadow(struct kvm *kvm)
 {
 }
 
-gfn_t unalias_gfn(struct kvm *kvm, gfn_t gfn)
-{
-   return gfn;
-}
-
 static int __init kvm_s390_init(void)
 {
int ret;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a1f0b5d..2d2509f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -367,24 +367,7 @@ struct kvm_vcpu_arch {
u64 hv_vapic;
 };
 
-struct kvm_mem_alias {
-   gfn_t base_gfn;
-   unsigned long npages;
-   gfn_t target_gfn;
-#define KVM_ALIAS_INVALID 1UL
-   unsigned long flags;
-};
-
-#define KVM_ARCH_HAS_UNALIAS_INSTANTIATION
-
-struct kvm_mem_aliases {
-   struct kvm_mem_alias aliases[KVM_ALIAS_SLOTS];
-   int naliases;
-};
-
 struct kvm_arch {
-   struct kvm_mem_aliases *aliases;
-
unsigned int n_free_mmu_pages;
unsigned int n_requested_mmu_pages;
unsigned int n_alloc_mmu_pages;
@@ -674,8 +657,6 @@ void kvm_disable_tdp(void);
 int load_pdptrs(struct kvm_vcpu *vcpu, unsigned long cr3);
 int complete_pio(struct kvm_vcpu *vcpu);
 
-struct kvm_memory_slot *gfn_to_memslot_unaliased(struct kvm *kvm, gfn_t gfn);
-
 static inline struct kvm_mmu_page *page_header(hpa_t shadow_page)
 {
 	struct page *page = pfn_to_page(shadow_page >> PAGE_SHIFT);
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 43f1e9b..bf52a32 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -347,9 +347,9 @@ enum vmcs_field {
 
 #define AR_RESERVD_MASK 0xfffe0f00
 
-#define TSS_PRIVATE_MEMSLOT			(KVM_MEMORY_SLOTS + 0)
-#define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT	(KVM_MEMORY_SLOTS + 1)
-#define IDENTITY_PAGETABLE_PRIVATE_MEMSLOT	(KVM_MEMORY_SLOTS + 2)
+#define TSS_PRIVATE_MEMSLOT			(KVM_MEMORY_SLOTS + KVM_ALIAS_SLOTS + 0)
+#define APIC_ACCESS_PAGE_PRIVATE_MEMSLOT	(KVM_MEMORY_SLOTS + KVM_ALIAS_SLOTS + 1)
+#define 

Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Gleb Natapov
On Mon, Jan 25, 2010 at 02:53:09PM +0200, Avi Kivity wrote:
 On 01/25/2010 02:24 PM, Gleb Natapov wrote:
 On Mon, Jan 25, 2010 at 01:12:36PM +0200, Avi Kivity wrote:
 On 01/25/2010 01:11 PM, Gleb Natapov wrote:
 On Mon, Jan 25, 2010 at 01:08:13PM +0200, Avi Kivity wrote:
 On 01/25/2010 12:01 PM, Gleb Natapov wrote:
 On HW task switch newly loaded segments should be marked as accessed.
 
 @@ -4775,6 +4766,11 @@ int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector,
  			kvm_seg.unusable = 1;
 
  	kvm_set_segment(vcpu, &kvm_seg, seg);
 +	if (selector && !kvm_seg.unusable && kvm_seg.s) {
 +		/* mark segment as accessed */
 +		seg_desc.type |= 1;
 +		save_guest_segment_descriptor(vcpu, selector, &seg_desc);
 +	}
  	return 0;
  }
 What about an error return from s_g_s_d?
 
 What can or should we do about it?
 
 If -EFAULT, propagate to userspace.
 
 We don't handle it anywhere in task switch emulation. Separate patch?
 
 
 Things like 'return kvm_write_guest_virt()' do handle it.
 
That is what save_guest_segment_descriptor() calls, but the error is not
propagated to userspace anywhere in the task switch code. Let's apply this
patch and I'll send a follow-up with fixes for error handling in the task
switch code.

--
Gleb.
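
The error-handling follow-up discussed in this thread amounts to returning the write's status up the call chain instead of dropping it. A rough Python model of that shape, with hypothetical function names and a bytearray standing in for guest memory (none of this is the kernel API):

```python
import errno


def write_guest(memory, addr, data):
    """Stand-in for a guest memory write: returns 0 or -EFAULT."""
    if addr < 0 or addr + len(data) > len(memory):
        return -errno.EFAULT
    memory[addr:addr + len(data)] = data
    return 0


def save_descriptor(memory, table_base, selector, desc_bytes):
    """Write an 8-byte descriptor to its slot and hand back the write's result."""
    index = selector >> 3          # drop the RPL and table-indicator bits
    return write_guest(memory, table_base + index * 8, desc_bytes)


def load_segment(memory, table_base, selector, desc_bytes):
    """Caller that propagates the error instead of discarding it."""
    ret = save_descriptor(memory, table_base, selector, desc_bytes)
    if ret < 0:
        return ret                 # e.g. -EFAULT travels up to the caller
    return 0


guest_memory = bytearray(32)       # room for four descriptors
ok = load_segment(guest_memory, 0, 0x08, b"\x01" * 8)
bad = load_segment(guest_memory, 0, 0x40, b"\x01" * 8)
```

Here the second call targets a slot beyond the table, so the failure surfaces to the caller rather than being silently ignored.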


Re: [Qemu-devel] [PATCH] Add definitions for current cpu models..

2010-01-25 Thread Anthony Liguori

On 01/25/2010 03:08 AM, Dor Laor wrote:

It should just be a matter of adding qemu_cpudefs_opts to
qemu-config.[ch], taking a new command line that parses the argument via
QemuOpts, then passing the parsed options to a target-specific function
that then builds the table of supported cpus.

Won't the outcome of John's patches and these configs be exactly
the same? Since these cpu models won't ever change, there is no reason
not to hard-code them. Adding configs or command lines is a good
idea, but it is friendlier to have basic support for the common cpus.

This is why qemu today offers: -cpu ?
x86   qemu64
x86   phenom
x86 core2duo
x86    kvm64
x86   qemu32
x86  coreduo
x86      486
x86  pentium
x86 pentium2
x86 pentium3
x86   athlon
x86     n270

So bottom line, my point is to have John's base + your configs. We
also need to keep the check verb and the migration support for sending
those.


btw: IMO we should deal with this complexity ourselves and save 99.9% 
of the users the need to define such models, don't ask this from a 
java programmer, he is running on a JVM :-)


I'm suggesting John's base should be implemented as a default config
that gets installed by default in QEMU.  The point is that a smart user
(or a downstream) can modify this to suit their needs more appropriately.


Another way to look at this is that implementing a somewhat arbitrary 
policy within QEMU's .c files is something we should try to avoid.  
Implementing arbitrary policy in our default config file is a fine thing 
to do.  Default configs are suggested configurations that are modifiable 
by a user.  Something baked into QEMU is something that ought to work 
for everyone in all circumstances.
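
The "default config that a user can override" idea reduces to layering two tables, with user entries winning. A minimal Python sketch, where `effective_cpu_table` and the sample entries are invented for illustration and say nothing about how QEMU would store them:

```python
def effective_cpu_table(default_defs, user_defs):
    """Combine shipped defaults with user definitions; user entries win."""
    table = dict(default_defs)     # copy so the shipped defaults stay untouched
    table.update(user_defs)        # same name: the user's definition replaces ours
    return table


# Hypothetical contents, for illustration only.
shipped = {"coreduo": {"family": 6, "model": 14},
           "Nehalem": {"family": 6, "model": 26}}
local = {"Nehalem": {"family": 6, "model": 26, "features": ["sse4.2"]},
         "mycpu": {"family": 6, "model": 15}}

table = effective_cpu_table(shipped, local)
```

The shipped definitions act as suggestions: anything a downstream disagrees with is replaced wholesale by its own entry.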


Regards,

Anthony Liguori


Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Avi Kivity

On 01/25/2010 04:22 PM, Gleb Natapov wrote:


Things like 'return kvm_write_guest_virt()' do handle it.

 

That is what save_guest_segment_descriptor() calls, but the error is not
propagated to userspace anywhere in the task switch code. Let's apply this
patch and I'll send a follow-up with fixes for error handling in the task
switch code.
   


Okay.

--
error compiling committee.c: too many arguments to function



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Mark Cave-Ayland

Avi Kivity wrote:


Hi Avi,

I've just done a quick test re-enabling processor.sys on my WinXP 
guest and then did the following:


virsh stop winxp
rmmod kvm_intel
rmmod kvm
modprobe kvm ignore_msrs=1
modprobe kvm_intel
virsh start winxp

Unfortunately it still crashes with the same 
DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS BSOD :(


Well, don't do that then.  Is there any specific functionality in 
processor.sys that you're missing?


No, not at all. My only concern was that the VM had been running 
absolutely fine under older KVM and VirtualBox until the upgrade from 
0.12.1.1 to 0.12.1.2 which made me think there had been a regression 
somewhere along the line.


I appreciate from tracking both the qemu and kvm mailing lists that there is
currently a lot of rapid development occurring across both QEMU and KVM,
and hence sometimes things can break. It would be interesting to find
out exactly *why* this doesn't work in KVM, so I can provide
debugging assistance if you can point me in the right direction.


At the moment, I'm just happy that I can run the VM under KVM even with 
the processor.sys driver disabled. At least by bringing up the problem 
and solution on this mailing list thread then the solution is documented 
for other people who find themselves in the same situation.



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Avi Kivity

On 01/25/2010 05:15 PM, Mark Cave-Ayland wrote:
Unfortunately it still crashes with the same 
DRIVER_UNLOADED_WITHOUT_CANCELING_PENDING_OPERATIONS BSOD :(


Well, don't do that then.  Is there any specific functionality in 
processor.sys that you're missing?



No, not at all. My only concern was that the VM had been running 
absolutely fine under older KVM and VirtualBox until the upgrade from 
0.12.1.1 to 0.12.1.2 which made me think there had been a regression 
somewhere along the line.


Well, there was a regression, but it was in 0.12.1.1.

There were two bugs involved, a serious one (that caused the cpuid to 
show up as AMD) hiding the less serious one (that causes processor.sys 
to BSOD).




I appreciate from tracking both the qemu and kvm mailing lists that there
is currently a lot of rapid development occurring across both QEMU and
KVM, and hence sometimes things can break. It would be interesting to
find out exactly *why* this doesn't work in KVM, so I can provide
debugging assistance if you can point me in the right direction.


At the moment, I'm just happy that I can run the VM under KVM even 
with the processor.sys driver disabled. At least by bringing up the 
problem and solution on this mailing list thread then the solution is 
documented for other people who find themselves in the same situation.


I'd like to find out why processor.sys fails, but the .1-.2 change 
isn't any help unfortunately.  It looks like here too there are two bugs 
involved: one in kvm which doesn't act like processor.sys expects it, 
and one in processor.sys which causes it to 
UNLOAD_ITSELF_WITHOUT_CANCELLING_PENDING_OPERATIONS.  You might try 
running with kvm trace enabled and look at msr and cpuid accesses just 
prior to the crash.
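
Once a trace has been captured as text, narrowing it down to the last few msr and cpuid events can be done with a short filter like the Python sketch below. The helper name and the sample lines are made up, and the exact layout of a trace line is an assumption. The filter only relies on the event names `kvm_msr` and `kvm_cpuid` appearing somewhere in each line.

```python
from collections import deque


def last_msr_cpuid_events(lines, keep=20):
    """Return the final `keep` trace lines that mention kvm_msr or kvm_cpuid."""
    recent = deque(maxlen=keep)    # older matches fall off the front
    for line in lines:
        if "kvm_msr" in line or "kvm_cpuid" in line:
            recent.append(line.rstrip("\n"))
    return list(recent)


# Hypothetical trace excerpt; real output will differ in layout.
sample = [
    "vcpu0 100.001: kvm_exit: reason CPUID\n",
    "vcpu0 100.002: kvm_cpuid: func 1\n",
    "vcpu0 100.003: kvm_entry: vcpu 0\n",
    "vcpu0 100.004: kvm_msr: msr_read 199\n",
    "vcpu0 100.005: kvm_msr: msr_write 199\n",
]

tail = last_msr_cpuid_events(sample, keep=2)
```

Feeding it the lines of a saved trace file (for example via `open(path)`) gives the accesses closest to the crash.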


--
error compiling committee.c: too many arguments to function



[PATCH] KVM Test: Utils - Handle git commit without tags

2010-01-25 Thread Lucas Meneghel Rodrigues
Sometimes the latest commit might not have any tags
associated, leading to a failure on git describe.
Let's handle this failure appropriately.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/kvm_utils.py |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/kvm_utils.py b/client/tests/kvm/kvm_utils.py
index df26a77..a6da832 100644
--- a/client/tests/kvm/kvm_utils.py
+++ b/client/tests/kvm/kvm_utils.py
@@ -358,7 +358,11 @@ def get_git_branch(repository, branch, srcdir, commit=None, lbranch=None):
         utils.system("git checkout %s" % commit)
 
     h = utils.system_output('git log --pretty=format:%H -1')
-    desc = utils.system_output("git describe")
+    try:
+        desc = "tag %s" % utils.system_output("git describe")
+    except CmdError:
+        desc = "no tag found"
+
     logging.info("Commit hash for %s is %s (%s)" % (repository, h.strip(),
                                                      desc))
     return srcdir
-- 
1.6.6
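
For readers who want to try the same fallback outside the autotest tree, here is a minimal standalone sketch. It uses the standard `subprocess` module instead of autotest's `utils.system_output`; the function name `describe_commit` and the injectable `run` parameter are assumptions made for illustration and are not part of the patch.

```python
import subprocess


def describe_commit(srcdir=".", run=subprocess.check_output):
    """Return 'tag <description>' for HEAD, or 'no tag found'.

    `run` is injectable so the fallback path can be exercised without a
    real repository (hypothetical helper, not part of kvm_utils.py).
    """
    try:
        out = run(["git", "describe"], cwd=srcdir, stderr=subprocess.DEVNULL)
    except (subprocess.CalledProcessError, OSError):
        # git describe exits non-zero when no suitable tag is reachable
        return "no tag found"
    return "tag %s" % out.decode().strip()
```

An alternative that avoids the exception path is `git describe --always`, which falls back to an abbreviated commit hash when no tag is found.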



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Mark Cave-Ayland

Avi Kivity wrote:


Well, there was a regression, but it was in 0.12.1.1.

There were two bugs involved, a serious one (that caused the cpuid to 
show up as AMD) hiding the less serious one (that causes processor.sys 
to BSOD).


Okay, understood. I must have just got lucky when upgrading from older 
KVM to 0.12.1.1.


I'd like to find out why processor.sys fails, but the .1-.2 change 
isn't any help unfortunately.  It looks like here too there are two bugs 
involved: one in kvm which doesn't act like processor.sys expects it, 
and one in processor.sys which causes it to 
UNLOAD_ITSELF_WITHOUT_CANCELLING_PENDING_OPERATIONS.  You might try 
running with kvm trace enabled and look at msr and cpuid accesses just 
prior to the crash.


I've had a quick look at this, and I can't work out how to enable kvm 
trace in Kconfig. Is there any documentation on how to build with kvm 
trace enabled for debugging?



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: [PATCH] kvm: Flush coalesced MMIO buffer periodly

2010-01-25 Thread Marcelo Tosatti
On Mon, Jan 25, 2010 at 03:46:44PM +0800, Sheng Yang wrote:
 By default, coalesced MMIO caches writes in a buffer until:
 1. The buffer is full.
 2. Or the vcpu exits to QEMU for some other reason.
 
 But this can delay a write for a long time when:
 1. Each write to the MMIO region is small.
 2. The interval between writes is long.
 3. The guest rarely needs input or accesses other devices.
 
 This issue was observed in an experimental embedded system. The test image
 simply prints "test" every second. The output in QEMU meets expectations,
 but the output in KVM is delayed for seconds.
 
 Per Avi's suggestion, I hooked the flushing of the coalesced MMIO buffer into
 the VGA update handler. This way, we don't need an explicit vcpu exit to QEMU
 to handle this issue.

Sheng,

Can you send this to QEMU upstream first, since the feature is present
there?

 Signed-off-by: Sheng Yang sh...@linux.intel.com
 ---
  cpu-all.h  |2 ++
  exec.c |6 ++
  kvm-all.c  |   20 
  qemu-kvm.c |9 +++--
  qemu-kvm.h |2 ++
  vl.c   |2 ++
  6 files changed, 39 insertions(+), 2 deletions(-)
 
 diff --git a/cpu-all.h b/cpu-all.h
 index 8ed76c7..51effc0 100644
 --- a/cpu-all.h
 +++ b/cpu-all.h
 @@ -916,6 +916,8 @@ void qemu_register_coalesced_mmio(target_phys_addr_t 
 addr, ram_addr_t size);
  
  void qemu_unregister_coalesced_mmio(target_phys_addr_t addr, ram_addr_t 
 size);
  
 +void qemu_flush_coalesced_mmio_buffer(void);
 +
  /***/
  /* host CPU ticks (if available) */
  
 diff --git a/exec.c b/exec.c
 index 99e88e1..40c01a1 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -2424,6 +2424,12 @@ void qemu_unregister_coalesced_mmio(target_phys_addr_t 
 addr, ram_addr_t size)
  kvm_uncoalesce_mmio_region(addr, size);
  }
  
 +void qemu_flush_coalesced_mmio_buffer(void)
 +{
 +if (kvm_enabled())
 +kvm_flush_coalesced_mmio_buffer();
 +}
 +
  #ifdef __linux__
  
  #include <sys/vfs.h>
 diff --git a/kvm-all.c b/kvm-all.c
 index 0423fff..3d9fcc0 100644
 --- a/kvm-all.c
 +++ b/kvm-all.c
 @@ -25,6 +25,9 @@
  #include "hw/hw.h"
  #include "gdbstub.h"
  #include "kvm.h"
 +#ifndef KVM_UPSTREAM
 +#include "libkvm.h"
 +#endif
  
  #ifdef KVM_UPSTREAM
  /* KVM uses PAGE_SIZE in it's definition of COALESCED_MMIO_MAX */
 @@ -385,6 +388,23 @@ int kvm_uncoalesce_mmio_region(target_phys_addr_t start, 
 ram_addr_t size)
  return ret;
  }
  
 +void kvm_flush_coalesced_mmio_buffer(void)
 +{
 +#ifdef KVM_CAP_COALESCED_MMIO
 +if (kvm_state->coalesced_mmio_ring) {
 +struct kvm_coalesced_mmio_ring *ring =
 +kvm_state->coalesced_mmio_ring;
 +while (ring->first != ring->last) {
 +cpu_physical_memory_rw(ring->coalesced_mmio[ring->first].phys_addr,
 +   &ring->coalesced_mmio[ring->first].data[0],
 +   ring->coalesced_mmio[ring->first].len, 1);
 + smp_wmb();

Tab breakage.
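
The flush loop in the quoted hunk can be modelled in a few lines. This is a toy sketch, not the kernel or QEMU data structure: the class `CoalescedRing`, the `MmioEntry` tuple and the `write` callback are names invented here, and because the quoted hunk is cut off after `smp_wmb()`, the modulo advance of `first` is an assumption about how the loop continues.

```python
from collections import namedtuple

# One buffered write: guest-physical address plus the bytes to store.
MmioEntry = namedtuple("MmioEntry", "phys_addr data")


class CoalescedRing:
    """Toy model of a first/last ring buffer of pending MMIO writes."""

    def __init__(self, size):
        self.slots = [None] * size
        self.first = 0  # index of the oldest entry not yet flushed
        self.last = 0   # index of the next free slot

    def is_full(self):
        # One slot is always left empty so that first == last means "empty".
        return (self.last + 1) % len(self.slots) == self.first

    def push(self, entry):
        if self.is_full():
            raise BufferError("ring full: a flush is required first")
        self.slots[self.last] = entry
        self.last = (self.last + 1) % len(self.slots)

    def flush(self, write):
        """Drain every pending entry through `write`, oldest first."""
        flushed = 0
        while self.first != self.last:
            entry = self.slots[self.first]
            write(entry.phys_addr, entry.data)
            self.first = (self.first + 1) % len(self.slots)
            flushed += 1
        return flushed
```

Calling `flush` from a periodic handler (the VGA update handler in the patch) bounds how long a small write can sit in the ring when the guest never fills it and never exits for another reason.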


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Avi Kivity

On 01/25/2010 06:06 PM, Mark Cave-Ayland wrote:


I'd like to find out why processor.sys fails, but the .1-.2 change 
isn't any help unfortunately.  It looks like here too there are two 
bugs involved: one in kvm which doesn't act like processor.sys 
expects it, and one in processor.sys which causes it to 
UNLOAD_ITSELF_WITHOUT_CANCELLING_PENDING_OPERATIONS.  You might try 
running with kvm trace enabled and look at msr and cpuid accesses 
just prior to the crash.


I've had a quick look at this, and I can't work out how to enable kvm 
trace in Kconfig. Is there any documentation on how to build with kvm 
trace enabled for debugging?




CONFIG_FTRACE, CONFIG_TRACEPOINTS should be sufficient, I think.  2.6.32 
or later IIRC.


--
error compiling committee.c: too many arguments to function



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Mark Cave-Ayland

Avi Kivity wrote:

I've had a quick look at this, and I can't work out how to enable kvm 
trace in Kconfig. Is there any documentation on how to build with kvm 
trace enabled for debugging?


CONFIG_FTRACE, CONFIG_TRACEPOINTS should be sufficient, I think.  2.6.32 
or later IIRC.


Hmmm these already look as if they have been enabled. Does the output go 
to syslog or does it appear on a device somewhere in /proc?



ATB,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Avi Kivity

On 01/25/2010 06:18 PM, Mark Cave-Ayland wrote:

Avi Kivity wrote:

I've had a quick look at this, and I can't work out how to enable 
kvm trace in Kconfig. Is there any documentation on how to build 
with kvm trace enabled for debugging?


CONFIG_FTRACE, CONFIG_TRACEPOINTS should be sufficient, I think.  
2.6.32 or later IIRC.


Hmmm these already look as if they have been enabled. Does the output 
go to syslog or does it appear on a device somewhere in /proc?




echo kvm > /sys/kernel/debug/tracing/set_event
(you can also enable just the msr and cpuid events, see 
/sys/kernel/debug/tracing/events)


cat /sys/kernel/debug/tracing/trace
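
The same two steps can be scripted. The sketch below is only an illustration: the helper names `enable_trace_events` and `read_trace` are made up here, it assumes debugfs is mounted at /sys/kernel/debug and that the caller is root, and the event name `kvm:kvm_cpuid` is an assumption (check the events/kvm directory for the names your kernel provides).

```python
import os

TRACING_DIR = "/sys/kernel/debug/tracing"  # assumes debugfs is mounted here


def enable_trace_events(events, tracing_dir=TRACING_DIR):
    """Replace the enabled event set with `events` (e.g. ['kvm'])."""
    path = os.path.join(tracing_dir, "set_event")
    with open(path, "w") as f:  # opening for write clears the old set
        for event in events:
            f.write(event + "\n")
            f.flush()  # hand each event name to the kernel separately


def read_trace(tracing_dir=TRACING_DIR):
    """Return the trace buffer as a list of lines, without '#' headers."""
    with open(os.path.join(tracing_dir, "trace")) as f:
        return [line.rstrip("\n") for line in f if not line.startswith("#")]
```

For example, `enable_trace_events(["kvm:kvm_msr", "kvm:kvm_cpuid"])` would limit tracing to the two event types of interest here instead of the whole kvm subsystem.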

--
error compiling committee.c: too many arguments to function



Re: virtio bonding bandwidth problem

2010-01-25 Thread Didier Moens
On 22/01/10 23:33, Brian Jackson wrote:


 1. I am experiencing a 40% performance hit (600 Mb/s) on each individual
 virtio guest connection ;
 
 I don't know what all features RHEL5.4 enables for kvm, but that doesn't seem 
 outside the realm of possibility. Especially depending on what OS is running 
 in the guest. I think RHEL5.4 has an older version of virtio, but I won't 
 swear to it. Fwiw, I get ~1.5Gbps guest to host on a Ubuntu 9.10 guest, 
 ~850mbit/s guest to host on a Windows 7 guest. To get those speeds, I have to 
 up the window sizes a good bit (the default is 8K, those numbers are at 1M). 
 At the default Windows 7 gets ~250mbit/s. 
   

Thank you, Brian ; your reply is much appreciated.



I took a shortcut, and installed RHEL's ktune package (on both hosts and
guests), which tunes the following parameters :

net.core.rmem_default = 262144
net.core.wmem_default = 262144
net.core.rmem_max = 8388608
net.core.wmem_max = 8388608
net.core.netdev_max_backlog = 1
net.ipv4.tcp_rmem = 8192 87380 8388608
net.ipv4.tcp_wmem = 8192 65536 8388608
net.ipv4.udp_rmem_min = 16384
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_mem = 8388608 12582912 16777216
net.ipv4.udp_mem = 8388608 12582912 16777216
vm.swappiness = 30
vm.dirty_ratio = 50
vm.pagecache = 90


Thanks to these changes, external host to guest is up from ~600 Mb/s to
the expected ~950 Mb/s (guest to bare metal is still ~1.2Gb/s).



 2. Total simultaneous bandwidth to all guests seems to be capped at 1
 Gb/s ; quite problematic, as this renders my server consolidation almost
 useless.
 
 I don't know about 802.3ad bonding, but I know the other linux bonding 
 techniques are very hard to benchmark due to the way the mac's are handled. I 
 would start by examining/describing your testing a little more. At the very 
 least what tools you're using to test, etc. would be helpful.
   

- I am using iperf-2.0.4 for bandwidth testing :
iperf -s on server side (TCP window size: 85.3 KB)
iperf -c on client side (TCP window size: 27.5 or 64.0 KB)

- Network :
* 2x interconnected Allied Telesis AT-x908 (with 60 Gb/s switching
backplane)
* hostA , 3x 1 Gb 802.3ad , with bridged virtio guests :
virtA1-virtA2-virtA3
* hostB , 3x 1 Gb 802.3ad
* hostC , 1 Gb
* hostD , 1 Gb

- Tests (simultaneous connections from clients -> servers, in Mb/s) :
1. B,C,D -> A   :   990,600,700 Mb/s = 2.3 Gb/s , which confirms a
successful 802.3ad setup for hostA
2. B,B,B -> A   :   450,300,250 Mb/s = 1 Gb/s : due to MAC handling,
maximum bandwidth is maybe limited to 1 Gb/s per host-host connection ?
3. B -> A1 : 980 Mb/s
4. C -> A2 : 650 Mb/s
5. B,B -> A1,A2 : 340,650 Mb/s (limited host-host ?)
6. B,C -> A1,A2 : 900,100 Mb/s
7. B,C,D -> A1,A2,A3 : 750,150,100 Mb/s
8. A1,A2 -> B,B : 500,500 Mb/s (limited host-host ?)
9. A1,A2 -> B,C : 980,730 Mb/s

- Results [2], [5] and [8]  seem to imply a 1 Gb bandwidth limit for a
single physical client-server connection ;
- Result [9] indicates > 1 Gb/s bandwidth for outgoing virtio client
guest connections (to different servers) ;

- Results [6] and [7] illustrate the problem : incoming bandwidth to all
virtio guests is capped at 1 Gb/s.
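
To make the aggregate figures easy to check, the per-stream numbers from the multi-stream tests above can be summed with a short script. The dictionary below only copies values from this message; the helper name `total_mbps` is made up for the example.

```python
# Per-stream throughput in Mb/s, copied from the numbered tests above.
TESTS = {
    1: [990, 600, 700],   # B,C,D -> A
    2: [450, 300, 250],   # B,B,B -> A
    5: [340, 650],        # B,B -> A1,A2
    6: [900, 100],        # B,C -> A1,A2
    7: [750, 150, 100],   # B,C,D -> A1,A2,A3
    8: [500, 500],        # A1,A2 -> B,B
    9: [980, 730],        # A1,A2 -> B,C
}


def total_mbps(streams):
    """Aggregate bandwidth of simultaneous streams, in Mb/s."""
    return sum(streams)


for number, streams in sorted(TESTS.items()):
    print("test %d: %.2f Gb/s total" % (number, total_mbps(streams) / 1000.0))
```

Tests 6 and 7 both add up to 1000 Mb/s, while test 1 (bare metal) reaches 2290 Mb/s and test 9 (outgoing from guests) 1710 Mb/s, which is the asymmetry described above.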



Best regards,
Didier
-- 

Didier Moens , IT services
Department for Molecular Biomedical Research (DMBR)
VIB - Ghent University




[PATCH] Seabios - read e820 reserve from qemu_cfg

2010-01-25 Thread Jes Sorensen

Hi,

Right now KVM/QEMU relies on hard coded values in Seabios for the
reserved area for the TSS pages and the EPT page.

I'd like to suggest we change this to pass the value from QEMU via
qemu-cfg making it possible to move it around dynamically in the future.

Attached is a patch to Seabios for this, which defaults to the current
hard coded value if no value is provided by qemu-cfg. We can remove
the backwards compatibility later.

I'll post the QEMU patches for upstream QEMU and QEMU-KVM in a minute.

Comments most welcome!

Cheers,
Jes

Read location and size of KVM switch area from qemu-cfg

Read location of KVM's switch area (VMX TSS pages and EPT) from QEMU
via qemu-cfg instead of relying on hard coded values.

For now, fall back to the old hard coded values for compatibility
reasons. Compatibility code can be removed in the future.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com

---
 src/paravirt.c |9 +
 src/paravirt.h |7 +++
 src/post.c |   14 ++
 3 files changed, 26 insertions(+), 4 deletions(-)

Index: seabios/src/paravirt.c
===
--- seabios.orig/src/paravirt.c
+++ seabios/src/paravirt.c
@@ -132,6 +132,15 @@ u16 qemu_cfg_smbios_entries(void)
 return cnt;
 }
 
+int qemu_cfg_read_e820_reserve(struct qemu_cfg_e820_reserve *reserve)
+{
+if (!qemu_cfg_present)
+return 0;
+
+qemu_cfg_read((void *)reserve, sizeof(*reserve));
+return reserve->length;
+}
+
 struct smbios_header {
 u16 length;
 u8 type;
Index: seabios/src/paravirt.h
===
--- seabios.orig/src/paravirt.h
+++ seabios/src/paravirt.h
@@ -36,6 +36,7 @@ static inline int kvm_para_available(voi
 #define QEMU_CFG_ACPI_TABLES   (QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES(QEMU_CFG_ARCH_LOCAL + 1)
 #define QEMU_CFG_IRQ0_OVERRIDE (QEMU_CFG_ARCH_LOCAL + 2)
+#define QEMU_CFG_E820_RESERVE  (QEMU_CFG_ARCH_LOCAL + 3)
 
 extern int qemu_cfg_present;
 
@@ -61,8 +62,14 @@ typedef struct QemuCfgFile {
 char name[56];
 } QemuCfgFile;
 
+struct qemu_cfg_e820_reserve {
+u32 addr;
+u32 length;
+};
+
 u16 qemu_cfg_first_file(QemuCfgFile *entry);
 u16 qemu_cfg_next_file(QemuCfgFile *entry);
 u32 qemu_cfg_read_file(QemuCfgFile *entry, void *dst, u32 maxlen);
+int qemu_cfg_read_e820_reserve(struct qemu_cfg_e820_reserve *reserve);
 
 #endif
Index: seabios/src/post.c
===
--- seabios.orig/src/post.c
+++ seabios/src/post.c
@@ -135,10 +135,16 @@ ram_probe(void)
  , E820_RESERVED);
 add_e820(BUILD_BIOS_ADDR, BUILD_BIOS_SIZE, E820_RESERVED);
 
-if (kvm_para_available())
-// 4 pages before the bios, 3 pages for vmx tss pages, the
-// other page for EPT real mode pagetable
-add_e820(0xfffbc000, 4*4096, E820_RESERVED);
+if (kvm_para_available()) {
+struct qemu_cfg_e820_reserve e820_reserve;
+if (qemu_cfg_read_e820_reserve(&e820_reserve))
+add_e820(e820_reserve.addr, e820_reserve.length, E820_RESERVED);
+else {
+// 4 pages before the bios, 3 pages for vmx tss pages, the
+// other page for EPT real mode pagetable
+add_e820(0xfffbc000, 4*4096, E820_RESERVED);
+}
+}
 
 dprintf(1, "Ram Size=0x%08x (0x%08x%08x high)\n"
 , RamSize, (u32)(RamSizeOver4G >> 32), (u32)RamSizeOver4G);


[PATCH] QEMU-KVM - provide e820 reserve through qemu_cfg

2010-01-25 Thread Jes Sorensen

Hi,

These are the QEMU-KVM bits for providing the e820-reserve space through
qemu-cfg.

Cheers,
Jes

Use qemu-cfg to notify the BIOS of the location of the TSS range to
reserve in the e820 table, to avoid relying on hard coded values.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com

---
 hw/fw_cfg.h   |5 +
 hw/pc.c   |4 
 kvm.h |2 ++
 qemu-kvm-x86.c|6 ++
 target-i386/kvm.c |7 +++
 5 files changed, 24 insertions(+)

Index: qemu-kvm/hw/fw_cfg.h
===
--- qemu-kvm.orig/hw/fw_cfg.h
+++ qemu-kvm/hw/fw_cfg.h
@@ -67,4 +67,9 @@ FWCfgState *fw_cfg_init(uint32_t ctl_por
 
 #endif /* NO_QEMU_PROTOS */
 
+struct fw_cfg_e820_reserve {
+uint32_t addr;
+uint32_t length;
+};
+
 #endif
Index: qemu-kvm/hw/pc.c
===
--- qemu-kvm.orig/hw/pc.c
+++ qemu-kvm/hw/pc.c
@@ -66,6 +66,7 @@
 #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
 #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
 #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
+#define FW_CFG_E820_RESERVE (FW_CFG_ARCH_LOCAL + 3)
 
 #define MAX_IDE_BUS 2
 
@@ -73,6 +74,7 @@ static fdctrl_t *floppy_controller;
 static RTCState *rtc_state;
 static PITState *pit;
 static PCII440FXState *i440fx_state;
+struct fw_cfg_e820_reserve e820_reserve;
 
 qemu_irq *ioapic_irq_hack;
 
@@ -475,6 +477,8 @@ static void *bochs_bios_init(void)
 if (smbios_table)
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_table, smbios_len);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_RESERVE, (uint8_t *)&e820_reserve,
+ sizeof(struct fw_cfg_e820_reserve));
 
 /* allocate memory for the NUMA channel: one (64bit) word for the number
  * of nodes, one word for each VCPU-node and one word for each node to
Index: qemu-kvm/kvm.h
===
--- qemu-kvm.orig/kvm.h
+++ qemu-kvm/kvm.h
@@ -101,6 +101,8 @@ void kvm_arch_reset_vcpu(CPUState *env);
 struct kvm_guest_debug;
 struct kvm_debug_exit_arch;
 
+extern struct fw_cfg_e820_reserve e820_reserve;
+
 struct kvm_sw_breakpoint {
 target_ulong pc;
 target_ulong saved_insn;
Index: qemu-kvm/qemu-kvm-x86.c
===
--- qemu-kvm.orig/qemu-kvm-x86.c
+++ qemu-kvm/qemu-kvm-x86.c
@@ -23,6 +23,7 @@
 
 #include "kvm.h"
 #include "hw/pc.h"
+#include "hw/fw_cfg.h"
 
 #define MSR_IA32_TSC   0x10
 
@@ -37,6 +38,11 @@ int kvm_set_tss_addr(kvm_context_t kvm, 
 {
 #ifdef KVM_CAP_SET_TSS_ADDR
int r;
+/*
+ * Tell fw_cfg to notify the BIOS to reserve the range.
+ */
+e820_reserve.addr = addr;
+e820_reserve.length = 0x4000;
 
r = kvm_ioctl(kvm_state, KVM_CHECK_EXTENSION, KVM_CAP_SET_TSS_ADDR);
   if (r > 0) {
Index: qemu-kvm/target-i386/kvm.c
===
--- qemu-kvm.orig/target-i386/kvm.c
+++ qemu-kvm/target-i386/kvm.c
@@ -25,6 +25,8 @@
 #include "gdbstub.h"
 #include "host-utils.h"
 
+extern struct fw_cfg_e820_reserve e820_reserve;
+
 #ifdef KVM_UPSTREAM
 //#define DEBUG_KVM
 
@@ -298,6 +300,11 @@ int kvm_arch_init(KVMState *s, int smp_c
  * as unavaible memory.  FIXME, need to ensure the e820 map deals with
  * this?
  */
+/*
+ * Tell fw_cfg to notify the BIOS to reserve the range.
+ */
+e820_reserve.addr = 0xfffbc000;
+e820_reserve.length = 0x4000;
 return kvm_vm_ioctl(s, KVM_SET_TSS_ADDR, 0xfffbd000);
 }
 


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Mark Cave-Ayland

Avi Kivity wrote:


echo kvm > /sys/kernel/debug/tracing/set_event
(you can also enable just the msr and cpuid events, see 
/sys/kernel/debug/tracing/events)


cat /sys/kernel/debug/tracing/trace


(goes on a kernel debugging crash course)

Okay I think I've got the information you need for msr and cpuid:

http://pastebin.com/m39e26e6e

This is from a fresh start of the VM up to just after the point where it 
reaches the BSOD.



HTH,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Alexander Graf

On 25.01.2010, at 17:52, Jes Sorensen wrote:

 Hi,
 
 This is the QEMU patch for providing the e820-reserve space through
 qemu-cfg.

Howdy. Congratulations to the new mail address - looks neat ;-).


Two comments:

1) I don't see how passing a single region is any help. I'd rather like to see 
a device tree like table structure
You'd get one variable for len of the table, one with the contents. So for a 
universal reserved region specifier you'd get:

u64 base    u64 len

Then have len=2 and put data in the table:

u64 base1    u64 len1    u64 base2    u64 len2

That way we'd get 2 entries and the chance to enhance them later on. In fact, 
it might even make sense to pass the whole table in such a form. That way qemu 
generates all of the e820 tables and we can declare whatever we want. Just add 
a type field in the table.
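
As a rough illustration of the table idea (a count followed by base/length/type entries), here is a sketch using Python's `struct` module. The exact layout (a little-endian u32 count, then u64 base, u64 length, u32 type per entry) and the function names are assumptions chosen for the example; they are not the interface from the patch or from any released QEMU.

```python
import struct

E820_RAM = 1       # usable memory
E820_RESERVED = 2  # reserved, not usable by the OS

_COUNT = struct.Struct("<I")    # number of entries (assumed u32)
_ENTRY = struct.Struct("<QQI")  # base, length, type (assumed layout)


def pack_e820_table(entries):
    """Serialise [(base, length, type), ...] as count + packed entries."""
    blob = _COUNT.pack(len(entries))
    for base, length, etype in entries:
        blob += _ENTRY.pack(base, length, etype)
    return blob


def unpack_e820_table(blob):
    """Inverse of pack_e820_table(); returns a list of tuples."""
    (count,) = _COUNT.unpack_from(blob, 0)
    return [
        _ENTRY.unpack_from(blob, _COUNT.size + i * _ENTRY.size)
        for i in range(count)
    ]
```

With this shape, the single TSS/EPT region from the patch becomes one entry, `(0xfffbc000, 0x4000, E820_RESERVED)`, and further regions can be added later without changing the interface.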

2) Please inline patches. They showed up as attachments here, making them 
really hard to comment on.


Alex


[GIT PULL] KVM updates for 2.6.33-rc5

2010-01-25 Thread Marcelo Tosatti

Linus, please pull from

git://git.kernel.org/pub/scm/virt/kvm/kvm.git kvm-updates/2.6.33

For the following KVM fixes.

Alexander Graf (1):
  KVM: powerpc: Show timing option only on embedded

Avi Kivity (1):
  KVM: Fix race between APIC TMR and IRR

Christian Borntraeger (1):
  KVM: S390: fix potential array overrun in intercept handling

Davide Libenzi (1):
  eventfd - allow atomic read and waitqueue remove

Marcelo Tosatti (2):
  KVM: properly check max PIC pin in irq route setup
  KVM: MMU: bail out pagewalk on kvm_read_guest error

Michael S. Tsirkin (2):
  KVM: only allow one gsi per fd
  KVM: fix spurious interrupt with irqfd

Sheng Yang (1):
  KVM: x86: Fix host_mapping_level()

Wei Yongjun (2):
  KVM: x86: Fix probable memory leak of vcpu->arch.mce_banks
  KVM: x86: Fix leak of free lapic date in kvm_arch_vcpu_init()


 arch/powerpc/kvm/Kconfig   |2 +-
 arch/s390/kvm/intercept.c  |4 +-
 arch/x86/kvm/lapic.c   |   11 +++--
 arch/x86/kvm/mmu.c |6 +--
 arch/x86/kvm/paging_tmpl.h |4 +-
 arch/x86/kvm/x86.c |6 ++-
 fs/eventfd.c   |   89 ---
 include/linux/eventfd.h|   16 
 virt/kvm/eventfd.c |   18 +++-
 virt/kvm/irq_comm.c|6 ++-
 10 files changed, 128 insertions(+), 34 deletions(-)



Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Alexander Graf


Am 25.01.2010 um 18:13 schrieb Jes Sorensen jes.soren...@redhat.com:


On 01/25/10 17:58, Alexander Graf wrote:

Howdy. Congratulations to the new mail address - looks neat ;-).


:-)


Two comments:

1) I don't see how passing a single region is any help. I'd rather  
like to see a device tree like table structure
You'd get one variable for len of the table, one with the contents.  
So for a universal reserved region specifier you'd get:


u64 base    u64 len

Then have len=2 and put data in the table:

u64 base1    u64 len1    u64 base2    u64 len2

That way we'd get 2 entries and the chance to enhance them later  
on. In fact, it might even make sense to pass the whole table in  
such a form. That way qemu generates all of the e820 tables and we  
can declare whatever we want. Just add a type field in the table.


I am fine with having QEMU build the e820 tables completely if there is
a consensus to take that path.


I agree. We better get this right :-). I don't want to maintain 5  
versions of an e820 fw_cfg interface.




2) Please inline patches. They showed up as attachments here,  
making them really hard to comment on.


Sorry Thunderbug doesn't do that well, but they should be attached as
txt?


I suppose so, but Apple Mail still doesn't show it and definitely  
doesn't let me comment on it.


The easiest way for me so far has been to use git-send-email and a  
bash alias to fill in login information.



Alex





[PATCH 0/2] Fix failed msr tracing

2010-01-25 Thread Avi Kivity
We don't trace failed msr access (wrmsr or rdmsr which end up generating a
#GP), which loses important data.

Avi Kivity (2):
  KVM: Fix msr trace
  KVM: Trace failed msr reads and writes

 arch/x86/kvm/svm.c   |   13 -
 arch/x86/kvm/trace.h |   27 ---
 arch/x86/kvm/vmx.c   |5 +++--
 3 files changed, 27 insertions(+), 18 deletions(-)



[PATCH 1/2] KVM: Fix msr trace

2010-01-25 Thread Avi Kivity
- data is 64 bits wide, not unsigned long
- rw is confusingly named

Signed-off-by: Avi Kivity a...@redhat.com
---
 arch/x86/kvm/trace.h |   16 
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 1cb3d0e..45903a9 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -246,23 +246,23 @@ TRACE_EVENT(kvm_page_fault,
  * Tracepoint for guest MSR access.
  */
 TRACE_EVENT(kvm_msr,
-   TP_PROTO(unsigned int rw, unsigned int ecx, unsigned long data),
-   TP_ARGS(rw, ecx, data),
+   TP_PROTO(unsigned write, u32 ecx, u64 data),
+   TP_ARGS(write, ecx, data),
 
TP_STRUCT__entry(
-   __field(unsigned int,   rw  )
-   __field(unsigned int,   ecx )
-   __field(unsigned long,  data)
+   __field(unsigned,   write   )
+   __field(u32,ecx )
+   __field(u64,data)
),
 
TP_fast_assign(
-   __entry->rw = rw;
+   __entry->write  = write;
__entry->ecx= ecx;
__entry->data   = data;
),
 
-   TP_printk("msr_%s %x = 0x%lx",
- __entry->rw ? "write" : "read",
+   TP_printk("msr_%s %x = 0x%llx",
+ __entry->write ? "write" : "read",
  __entry->ecx, __entry->data)
 );
 
-- 
1.6.5.3



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Avi Kivity

On 01/25/2010 06:54 PM, Mark Cave-Ayland wrote:

Avi Kivity wrote:


echo kvm > /sys/kernel/debug/tracing/set_event
(you can also enable just the msr and cpuid events, see 
/sys/kernel/debug/tracing/events)


cat /sys/kernel/debug/tracing/trace


(goes on a kernel debugging crash course)

Okay I think I've got the information you need for msr and cpuid:

http://pastebin.com/m39e26e6e

This is from a fresh start of the VM up to just after the point where 
it reaches the BSOD.





Unfortunately msr tracing fails to record some important information.  I 
just posted a patch to fix this.  Can you rerun from kvm.git branch 
msr-trace?  That contains the fix.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Jes Sorensen

On 01/25/10 18:28, Alexander Graf wrote:

That way we'd get 2 entries and the chance to enhance them later on.
In fact, it might even make sense to pass the whole table in such a
form. That way qemu generates all of the e820 tables and we can
declare whatever we want. Just add a type field in the table.


I am fine with having QEMU build the e820 tables completely if there is
a consensus to take that path.


I agree. We better get this right :-). I don't want to maintain 5
versions of an e820 fw_cfg interface.


Looking at the internals, some of the e820 entries are based on compile
time constants for the BIOS, so it will be hard to pass those from
QEMU, but we could do it in a way so we pass a number of additional
e820 entries. Ie. address, length, and type.

What do you think?

Cheers,
Jes


[PATCH 2/2 FIXED] KVM: Trace failed msr reads and writes

2010-01-25 Thread Avi Kivity
Record failed msr reads and writes, along with the fact that they failed.

Signed-off-by: Avi Kivity a...@redhat.com
---

(cosmetic indentation fix)

 arch/x86/kvm/svm.c   |   13 -
 arch/x86/kvm/trace.h |   17 +++--
 arch/x86/kvm/vmx.c   |5 +++--
 3 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 8d7cb62..a0da182 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2172,9 +2172,10 @@ static int rdmsr_interception(struct vcpu_svm *svm)
   u32 ecx = svm->vcpu.arch.regs[VCPU_REGS_RCX];
   u64 data;
 
-   if (svm_get_msr(&svm->vcpu, ecx, &data))
+   if (svm_get_msr(&svm->vcpu, ecx, &data)) {
+   trace_kvm_msr_read_ex(ecx);
kvm_inject_gp(&svm->vcpu, 0);
-   else {
+   } else {
trace_kvm_msr_read(ecx, data);
 
svm->vcpu.arch.regs[VCPU_REGS_RAX] = data & 0xffffffff;
@@ -2266,13 +2267,15 @@ static int wrmsr_interception(struct vcpu_svm *svm)
   u64 data = (svm->vcpu.arch.regs[VCPU_REGS_RAX] & -1u)
| ((u64)(svm->vcpu.arch.regs[VCPU_REGS_RDX] & -1u) << 32);
 
-   trace_kvm_msr_write(ecx, data);
 
   svm->next_rip = kvm_rip_read(&svm->vcpu) + 2;
-   if (svm_set_msr(&svm->vcpu, ecx, data))
+   if (svm_set_msr(&svm->vcpu, ecx, data)) {
+   trace_kvm_msr_write_ex(ecx, data);
kvm_inject_gp(&svm->vcpu, 0);
-   else
+   } else {
+   trace_kvm_msr_write(ecx, data);
skip_emulated_instruction(&svm->vcpu);
+   }
return 1;
 }
 
diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 45903a9..df55a66 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -246,28 +246,33 @@ TRACE_EVENT(kvm_page_fault,
  * Tracepoint for guest MSR access.
  */
 TRACE_EVENT(kvm_msr,
-   TP_PROTO(unsigned write, u32 ecx, u64 data),
-   TP_ARGS(write, ecx, data),
+   TP_PROTO(unsigned write, u32 ecx, u64 data, bool exception),
+   TP_ARGS(write, ecx, data, exception),
 
TP_STRUCT__entry(
__field(unsigned,   write   )
__field(u32,ecx )
__field(u64,data)
+   __field(u8, exception   )
),
 
TP_fast_assign(
__entry->write  = write;
__entry->ecx= ecx;
__entry->data   = data;
+   __entry->exception  = exception;
),
 
-   TP_printk("msr_%s %x = 0x%llx",
+   TP_printk("msr_%s %x = 0x%llx%s",
  __entry->write ? "write" : "read",
- __entry->ecx, __entry->data)
+ __entry->ecx, __entry->data,
+ __entry->exception ? " (#GP)" : "")
 );
 
-#define trace_kvm_msr_read(ecx, data)  trace_kvm_msr(0, ecx, data)
-#define trace_kvm_msr_write(ecx, data) trace_kvm_msr(1, ecx, data)
+#define trace_kvm_msr_read(ecx, data)  trace_kvm_msr(0, ecx, data, false)
+#define trace_kvm_msr_write(ecx, data) trace_kvm_msr(1, ecx, data, false)
+#define trace_kvm_msr_read_ex(ecx) trace_kvm_msr(0, ecx, 0, true)
+#define trace_kvm_msr_write_ex(ecx, data)  trace_kvm_msr(1, ecx, data, true)
 
 /*
  * Tracepoint for guest CR access.
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9f56110..0846e55 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3143,6 +3143,7 @@ static int handle_rdmsr(struct kvm_vcpu *vcpu)
u64 data;
 
   if (vmx_get_msr(vcpu, ecx, &data)) {
+   trace_kvm_msr_read_ex(ecx);
kvm_inject_gp(vcpu, 0);
return 1;
}
@@ -3162,13 +3163,13 @@ static int handle_wrmsr(struct kvm_vcpu *vcpu)
   u64 data = (vcpu->arch.regs[VCPU_REGS_RAX] & -1u)
| ((u64)(vcpu->arch.regs[VCPU_REGS_RDX] & -1u) << 32);
 
-   trace_kvm_msr_write(ecx, data);
-
if (vmx_set_msr(vcpu, ecx, data) != 0) {
+   trace_kvm_msr_write_ex(ecx, data);
kvm_inject_gp(vcpu, 0);
return 1;
}
 
+   trace_kvm_msr_write(ecx, data);
skip_emulated_instruction(vcpu);
return 1;
 }
-- 
1.6.5.3
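
Two details of this patch are easy to model outside the kernel: how the 64-bit MSR value is assembled from the low halves of RAX and RDX, and what a trace line looks like once the exception flag is appended. The helper names below are invented for this sketch; the format string follows the TP_printk() in the patch.

```python
def msr_value(rax, rdx):
    """Combine the low 32 bits of RAX and RDX into the 64-bit WRMSR operand."""
    return (rax & 0xFFFFFFFF) | ((rdx & 0xFFFFFFFF) << 32)


def format_kvm_msr(write, ecx, data, exception):
    """Mirror the 'msr_%s %x = 0x%llx%s' trace format from the patch."""
    return "msr_%s %x = 0x%x%s" % (
        "write" if write else "read",
        ecx,
        data,
        " (#GP)" if exception else "",
    )
```

With the patch applied, a failed access shows up in the trace with a trailing "(#GP)" marker, for example `format_kvm_msr(False, 0xc0010015, 0, True)` for a faulting read, instead of being absent from the trace.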



Re: How to properly turn off guest VM on server shutdown?

2010-01-25 Thread Michael Tokarev
Jernej Simončič wrote:
 On Sunday, January 24, 2010, 19:28:47, Jean-Philippe Menil wrote:
 
 Maybe the same can be done with windows guest.
 
 Should work with any Windows, and verified to work with Vista x64
 guest.

The only problem is that windows does not always shut down when you
need it to.

All versions of windows server require enabling something in the
registry before they notice power down events at all.

If a windows machine is in use by someone else (open files, logged in
user etc), it brings up a dialog box in response to the power down event
asking if you _really_ want to shut down, since this machine is
being used over the network.

And finally, quite often while the screensaver is running windows notices
the power down event only after some other event such as a mouse
move or a keypress.

In order to shut down my windows guests I came up with this version:

  {
# moving mouse helps windows (xp) to notice the powerdown event
echo mouse_move 1 1
sleep .1
echo system_powerdown
sleep 1
# also for windows, if it asks ok to shutdown if in use?
echo sendkey ret
sleep .1
  } | \
nc -U -w2 -q2 $run/$name/mon  /dev/null


That's netcat connecting to the guest's monitor which is a unix socket.

The script performs a similar task for all guests in the first cycle;
next it repeats the procedure but now waits for $max_guest_waittime,
which should be sufficient for any guest to shut down.  If the guest
did not shut down in time, the script simply kills the guest.

Note that sleep 1 in the above is not necessarily sufficient, as a
(windows) guest might be in swap and might need some time to
draw the dialog box.
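
For anyone who would rather drive the monitor from a script than through
nc, here is a minimal Python sketch of the same command sequence. It only
assumes what is said above, i.e. that the monitor is a unix stream socket;
the function names and the way the pauses are passed in are illustrative,
and the 1 second pause carries the same caveat as the shell version.

```python
import socket
import time

# Same monitor commands as the shell version, each paired with the
# pause (in seconds) to wait after sending it.
POWERDOWN_SEQUENCE = [
    ("mouse_move 1 1", 0.1),    # nudge windows so it notices the event
    ("system_powerdown", 1.0),  # the actual powerdown request
    ("sendkey ret", 0.1),       # confirm the "in use" dialog, if any
]


def send_powerdown(sock, sequence=POWERDOWN_SEQUENCE, sleep=time.sleep):
    """Write each monitor command to an already connected socket."""
    for command, pause in sequence:
        sock.sendall((command + "\n").encode("ascii"))
        sleep(pause)


def powerdown_guest(monitor_path):
    """Connect to the monitor's unix socket (the path is site-specific)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(monitor_path)
        send_powerdown(sock)
    finally:
        sock.close()
```

Like the nc version, this ignores whatever the monitor prints back; it
does not wait for the guest to actually go away, so the outer loop with
the timeout and the final kill is still needed.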

/mjt


Re: How to properly turn off guest VM on server shutdown?

2010-01-25 Thread Jan Kiszka
Michael Tokarev wrote:
 Jernej Simončič wrote:
 On Sunday, January 24, 2010, 19:28:47, Jean-Philippe Menil wrote:

 Maybe the same can be done with windows guest.
 Should work with any Windows, and verified to work with Vista x64
 guest.
 
 The only problem is that windows does not want to shut down when you
 need it.
 
 All versions of windows server requires enabling something in the
 registry - to notice the power down events to start with.
 
 If a windows machine is used by someone else (open files, logged in
 user etc), it brings a dialog box in response to power down event
 asking if you _really_ want to shut down since this machine is
 used over network.
 
 And finally, quite often during screensaver work windows notices
 the power down event only after some other event such as mouse
 move or a keypress.
 
 In order to shut down my windows guests I come to this version:
 
   {
 # moving mouse helps windows (xp) to notice the powerdown event
 echo mouse_move 1 1
 sleep .1
 echo system_powerdown
 sleep 1
 # also for windows, if it asks ok to shutdown if in use?
 echo sendkey ret
 sleep .1
   } | \
 nc -U -w2 -q2 $run/$name/mon > /dev/null
 
 
 That's netcat connecting to the guest's monitor which is a unix socket.
 
 The script performs similar task for all guests in first cycle,
 next it repeats the procedure but now waits for $max_guest_waittime,
 which should be sufficient for any guest to shut down.  If the guest
 did not shut down in time, the script simple kills the guest.
 
 Note that sleep 1 in the above is not necessary sufficient, as
 (windows) guest might be in swap and might need some time to
 draw the dialog box.

A cleaner alternative might be emulating the monitoring interface of
some standard UPS. It's somehow the same scenario: The virtual power is
about to vanish, let's inform the guest to shut down properly. And when
choosing a serial link, that should even be possible without modifying QEMU.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


Re: [PATCH] RFC: alias rework

2010-01-25 Thread Izik Eidus
On Mon, 25 Jan 2010 17:45:53 -0200
Marcelo Tosatti mtosa...@redhat.com wrote:

 Izik,
 
 On Mon, Jan 25, 2010 at 03:53:44PM +0200, Izik Eidus wrote:
  From f94dcd1ccabbcdb51ed7c37c5f58f00a5c1b7eec Mon Sep 17 00:00:00 2001
  From: Izik Eidus iei...@redhat.com
  Date: Mon, 25 Jan 2010 15:49:41 +0200
  Subject: [PATCH] RFC: alias rework
  
  This patch remove the old way of aliasing inside kvm
  and move into using aliasing with the same virtual addresses
  
  This patch is really just early RFC just to know if you guys
  like this direction, and I need to clean some parts of it
  and test it more before I feel it ready to be merged...
  
  Comments are more than welcome.
  
  Thanks.
  
  Signed-off-by: Izik Eidus iei...@redhat.com
  ---
   arch/ia64/include/asm/kvm_host.h |1 +
   arch/ia64/kvm/kvm-ia64.c |5 --
   arch/powerpc/kvm/powerpc.c   |5 --
   arch/s390/include/asm/kvm_host.h |1 +
   arch/s390/kvm/kvm-s390.c |5 --
   arch/x86/include/asm/kvm_host.h  |   19 --
   arch/x86/include/asm/vmx.h   |6 +-
   arch/x86/kvm/mmu.c   |   19 ++-
   arch/x86/kvm/x86.c   |  114 
  +++--
   include/linux/kvm_host.h |   11 +--
   virt/kvm/kvm_main.c  |   80 +++---
   11 files changed, 107 insertions(+), 159 deletions(-)
  
 
  @@ -2661,7 +2611,18 @@ int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm,
  struct kvm_memslots *slots, *old_slots;
   
  spin_lock(&kvm->mmu_lock);
  +   for (i = KVM_MEMORY_SLOTS; i < KVM_MEMORY_SLOTS +
  + KVM_ALIAS_SLOTS; ++i) {
 
 The plan is to kill KVM_ALIAS_SLOTS (aliases will share the 32 mem
 slots), right?

Hmm, I think we have to keep these additional 4 slots on top of
KVM_MEMORY_SLOTS to keep the same behavior with old userspaces,
because maybe some userspace apps already use 32 slots?

I don't mind removing it if you guys don't think this is the case.

 
  +#ifdef CONFIG_X86
  +
  +static void update_alias_slots(struct kvm *kvm, struct kvm_memory_slot *slot)
  +{
  +   int i;
  +
  +   for (i = KVM_MEMORY_SLOTS; i < KVM_MEMORY_SLOTS + KVM_ALIAS_SLOTS;
  +        ++i) {
  +       struct kvm_memory_slot *alias_memslot =
  +           &kvm->memslots->memslots[i];
  +       unsigned long size = slot->npages << PAGE_SHIFT;
  +
  +       if (alias_memslot->real_base_gfn >= slot->base_gfn &&
  +           alias_memslot->real_base_gfn < slot->base_gfn + size) {
  +           if (slot->dirty_bitmap) {
  +               unsigned long bitmap_addr;
  +               unsigned long dirty_offset;
  +               unsigned long offset_addr =
  +                   (alias_memslot->real_base_gfn -
  +                   slot->base_gfn) << PAGE_SHIFT;
  +               alias_memslot->userspace_addr =
  +                   slot->userspace_addr + offset_addr;
  +
  +               dirty_offset =
  +                   ALIGN(offset_addr, BITS_PER_LONG) / 8;
  +               bitmap_addr = (unsigned long) slot->dirty_bitmap;
  +               bitmap_addr += dirty_offset;
  +               alias_memslot->dirty_bitmap = (unsigned long *)bitmap_addr;
  +               alias_memslot->base_gfn = alias_memslot->real_base_gfn;
  +               alias_memslot->npages = alias_memslot->real_npages;
  +           } else if (!slot->rmap) {
  +               alias_memslot->base_gfn = 0;
  +               alias_memslot->npages = 0;
  +           }
  +       }
  +   }
  +}
  +
  +#endif
 
 Can't see why is this needed. What is the problem with nuking child
 aliases when deleting a real memslot?

The problem is that this memslot still points into the virtual address space
of the host. This means that gfn_to_memslot/page will still work on those gfns
and will return pages that are mapped at the virtual addresses that userspace
asked to remove from KVM.

Thanks.



Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Alexander Graf

On 25.01.2010, at 18:46, Jes Sorensen wrote:

 On 01/25/10 18:28, Alexander Graf wrote:
 That way we'd get 2 entries and the chance to enhance them later on.
 In fact, it might even make sense to pass the whole table in such a
 form. That way qemu generates all of the e820 tables and we can
 declare whatever we want. Just add a type field in the table.
 
 I am fine with having QEMU build the e820 tables completely if there is
 a consensus to take that path.
 
 I agree. We better get this right :-). I don't want to maintain 5
 versions of an 380 fw_cfg interface.
 
 Looking at the internals, some of the e820 entries are based on compile
 time constants for the BIOS, so it will be hard to pass those from
 QEMU, but we could do it in a way so we pass a number of additional
 e820 entries. Ie. address, length, and type.

Yes, sounds good. Should be fairly extensible then. What about memory holes? Do 
we need to take care of them?

Alex


Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Anthony Liguori

On 01/25/2010 02:04 PM, Alexander Graf wrote:

On 25.01.2010, at 18:46, Jes Sorensen wrote:

   

On 01/25/10 18:28, Alexander Graf wrote:
 

That way we'd get 2 entries and the chance to enhance them later on.
In fact, it might even make sense to pass the whole table in such a
form. That way qemu generates all of the e820 tables and we can
declare whatever we want. Just add a type field in the table.
   

I am fine with having QEMU build the e820 tables completely if there is
a consensus to take that path.
 

I agree. We better get this right :-). I don't want to maintain 5
versions of an 380 fw_cfg interface.
   

Looking at the internals, some of the e820 entries are based on compile
time constants for the BIOS, so it will be hard to pass those from
QEMU, but we could do it in a way so we pass a number of additional
e820 entries. Ie. address, length, and type.
 

Yes, sounds good. Should be fairly extensible then. What about memory holes? Do 
we need to take care of them?
   


It would be nice for QEMU to be able to add additional e820 regions that 
don't necessarily fit the standard layout model.


For instance, I've thought a number of times about using a large 
reserved region as a shared memory mechanism.


But we certainly need to allow the BIOS to define the regions it needs 
to know about.


Regards,

Anthony Liguori


Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Jes Sorensen

On 01/25/10 21:14, Anthony Liguori wrote:

On 01/25/2010 02:04 PM, Alexander Graf wrote:

Yes, sounds good. Should be fairly extensible then. What about memory
holes? Do we need to take care of them?


It would be nice for QEMU to be able to add additional e820 regions that
don't necessarily fit the standard layout model.

For instance, I've thought a number of times about using a large
reserved region as a shared memory mechanism.

But we certainly need to allow the BIOS to define the regions it needs
to know about.


I think it should be easy to accommodate using the scheme I am
suggesting. It would require some basic testing for conflicts in the
BIOS, but otherwise it should pretty much allow you to specify any
region you want as a reserved block.

Only problem is that we don't really have a way to pass back info
saying 'you messed up trying to pinch an area that the BIOS wants
for itself'.
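
The "basic testing for conflicts" could be as small as an interval
overlap check. A minimal Python sketch of the idea follows; the region
lists and the function names are hypothetical, and the real check would
of course live in the BIOS in C.

```python
def overlaps(start_a, size_a, start_b, size_b):
    """True if [start_a, start_a + size_a) intersects [start_b, start_b + size_b)."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def find_conflicts(requested, bios_regions):
    """Return the requested (start, size) regions that hit a BIOS region."""
    return [req for req in requested
            if any(overlaps(req[0], req[1], b[0], b[1]) for b in bios_regions)]
```

A region that merely touches the end of another one is not reported as a
conflict, since both ranges are treated as half-open.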

I'll take a look at it.

Cheers,
Jes


Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Alexander Graf

On 25.01.2010, at 22:05, Jes Sorensen wrote:

 On 01/25/10 21:14, Anthony Liguori wrote:
 On 01/25/2010 02:04 PM, Alexander Graf wrote:
 Yes, sounds good. Should be fairly extensible then. What about memory
 holes? Do we need to take care of them?
 
 It would be nice for QEMU to be able to add additional e820 regions that
 don't necessarily fit the standard layout model.
 
 For instance, I've thought a number of times about using a large
 reserved region as a shared memory mechanism.
 
 But we certainly need to allow the BIOS to define the regions it needs
 to know about.
 
 I think it should be easy to accommodate using the scheme I am
 suggesting. It would require some basic testing for conflicts in the
 BIOS, but otherwise it should pretty much allow you to specify any
 region you want as a reserved block.
 
 Only problem is that we don't really have a way to pass back info
 saying 'you messed up trying to pinch an area that the BIOS wants
 for itself'.

Eh - the BIOS shouldn't even try to use regions that are declared as reserved 
using this interface.
I guess we're mostly talking about DMI and ACPI tables. They can be anywhere in 
RAM.

Alex


Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Jes Sorensen

On 01/25/10 22:08, Alexander Graf wrote:


On 25.01.2010, at 22:05, Jes Sorensen wrote:

Only problem is that we don't really have a way to pass back info
saying 'you messed up trying to pinch an area that the BIOS wants
for itself'.


Eh - the BIOS shouldn't even try to use regions that are declared as reserved 
using this interface.
I guess we're mostly talking about DMI and ACPI tables. They can be anywhere in 
RAM.


What I had in mind with the above was the situation where a user tries
to reserve a region that is hardcoded into the BIOS, such as the address
of the BIOS text/data etc.

I don't think it would be a real problem anyway, if some user wants to
play with it, they have to take the risk of shooting themself in the
foot :)

Cheers,
Jes



Re: [PATCH] RFC: alias rework

2010-01-25 Thread Marcelo Tosatti
On Mon, Jan 25, 2010 at 10:40:32PM +0200, Izik Eidus wrote:
 On Mon, 25 Jan 2010 18:20:39 -0200
 Marcelo Tosatti mtosa...@redhat.com wrote:
 
  With current code, if a memslot is deleted, access through any aliases
  that use it will fail (BTW it looks this is not properly handled, but
  thats a separate problem).
 
 
 Yea I had some still open concerns about this code (this why I sent it on RFC)
 
  
  So AFAICS there is no requirement for an alias to continue operable 
  if its parent memslot is deleted.
 
 
 With this patch alias will stop to opearte when the parent is deleted
 just like the behivor with the current code...
 
 base_gfn will be set to 0 and npages will be set to 0 as well
 (the true values wil be hide in real_base_gfn...), so gfn_to_memslot
 and gfn_to_page will fail

But you adjust the alias (and keep it valid) if dirty logging is
enabled?

  
  Or is this a feature you need?
 
 
 I dont need it (I asked Avi to do something), So he said he want to nuke the 
 aliasing
 from kvm and keep supporting the old userspace`s

With feature i meant keeping the alias around when parent slot is
deleted.

 Do you have any other way to achive this?

No.

 Btw I do realize it might be better not to push this patch and just keep the 
 old
 way of treating aliasing as we have now, I really don`t mind.
 
  
  Motivation is that nukeing aliases is simpler than adjusting them.
  
 
 Agree.



Re: [PATCH] KVM: mark segments accessed on HW task switch

2010-01-25 Thread Marcelo Tosatti
On Mon, Jan 25, 2010 at 12:01:04PM +0200, Gleb Natapov wrote:
 On HW task switch newly loaded segments should be marked as accessed.
 
 Reported-by: Lorenzo Martignoni martig...@gmail.com
 Signed-off-by: Gleb Natapov g...@redhat.com

Applied, thanks.



Re: [PATCH] RFC: alias rework

2010-01-25 Thread Izik Eidus
On Mon, 25 Jan 2010 18:49:25 -0200
Marcelo Tosatti mtosa...@redhat.com wrote:

 On Mon, Jan 25, 2010 at 10:40:32PM +0200, Izik Eidus wrote:
  On Mon, 25 Jan 2010 18:20:39 -0200
  Marcelo Tosatti mtosa...@redhat.com wrote:
  
   With current code, if a memslot is deleted, access through any aliases
   that use it will fail (BTW it looks this is not properly handled, but
   thats a separate problem).
  
  
  Yea I had some still open concerns about this code (this why I sent it on 
  RFC)
  
   
   So AFAICS there is no requirement for an alias to continue operable 
   if its parent memslot is deleted.
  
  
  With this patch alias will stop to opearte when the parent is deleted
  just like the behivor with the current code...
  
  base_gfn will be set to 0 and npages will be set to 0 as well
  (the true values wil be hide in real_base_gfn...), so gfn_to_memslot
  and gfn_to_page will fail
 
 But you adjust the alias (and keep it valid) if dirty logging is
 enabled?

I am sorry, you probably got confused because the code is wrong.
The adjusting of the alias should happen in every case where:
 if (slot->rmap is valid, i.e. !NULL):
 this means we got a NEW parent slot that is mapped into the gfn
 that the alias is mapped to, and we want the userspace address
 of the alias slot to intersect with the new parent slot.

The latter adjustment of the dirty_bitmap should happen only in the case
where slot->dirty_bitmap is valid (!NULL):
 the alias slot needs to mark_page_dirty the bitmap of the new parent slot

I hope this makes things clearer.
(I think there is another small issue there, but I will send it when it is no
longer an RFC)

 
   
   Or is this a feature you need?
  
  
  I dont need it (I asked Avi to do something), So he said he want to nuke 
  the aliasing
  from kvm and keep supporting the old userspace`s
 
 With feature i meant keeping the alias around when parent slot is
 deleted.


The code doesn't try to do this; in fact:
} else if (!slot->rmap) {
        alias_memslot->base_gfn = 0;
        alias_memslot->npages = 0;
}
is there to invalidate the alias slot.

Sorry if I made too much of a mess :).

 
  Do you have any other way to achive this?
 
 No.
 
  Btw I do realize it might be better not to push this patch and just keep 
  the old
  way of treating aliasing as we have now, I really don`t mind.
  
   
   Motivation is that nukeing aliases is simpler than adjusting them.
   
  
  Agree.
 



Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Mark Cave-Ayland

Avi Kivity wrote:

Unfortunately msr tracing fails to record some important information.  I 
just posted a patch to fix this.  Can you rerun from kvm.git branch 
msr-trace?  That contains the fix.


Done.

http://pastebin.com/m209b1f13

Hopefully this trace should give a better indication of when the problem 
occurs since I also found the on_reboot option in libvirt - hence on 
this trace the VM shuts down immediately instead of rebooting like the 
earlier trace.



HTH,

Mark.

--
Mark Cave-Ayland - Senior Technical Architect
PostgreSQL - PostGIS
Sirius Corporation plc - control through freedom
http://www.siriusit.co.uk
t: +44 870 608 0063

Sirius Labs: http://www.siriusit.co.uk/labs


Re: [Qemu-devel] [PATCH] Add definitions for current cpu models..

2010-01-25 Thread Dor Laor

On 01/25/2010 04:21 PM, Anthony Liguori wrote:

On 01/25/2010 03:08 AM, Dor Laor wrote:

qemu-config.[ch], taking a new command line that parses the argument via
QemuOpts, then passing the parsed options to a target-specific function
that then builds the table of supported cpus.

It should just be a matter of adding qemu_cpudefs_opts to

Isn't the outcome of John's patches and these configs will be exactly
the same? Since these cpu models won't ever change, there is no reason
why not to hard code them. Adding configs or command lines is a good
idea but it is more friendlier to have basic support to the common cpus.
This is why qemu today offers: -cpu ?
x86 qemu64
x86 phenom
x86 core2duo
x86 kvm64
x86 qemu32
x86 coreduo
x86 486
x86 pentium
x86 pentium2
x86 pentium3
x86 athlon
x86 n270

So bottom line, my point is to have John's base + your configs. We
need to keep also the check verb and the migration support for sending
those.

btw: IMO we should deal with this complexity ourselves and save 99.9%
of the users the need to define such models, don't ask this from a
java programmer, he is running on a JVM :-)


I'm suggesting John's base should be implemented as a default config
that gets installed by default in QEMU. The point is that a smart user
(or a downstream) can modify this to suite their needs more appropriately.

Another way to look at this is that implementing a somewhat arbitrary
policy within QEMU's .c files is something we should try to avoid.
Implementing arbitrary policy in our default config file is a fine thing
to do. Default configs are suggested configurations that are modifiable
by a user. Something baked into QEMU is something that ought to work for


If we get the models right, users and mgmt stacks won't need to define
them. It seems like an almost impossible task for us, and mgmt stacks/users
won't do a better job; the opposite, I guess. The configs are great, I
have no argument against them; my case is that if we can pin down some
definitions, it's better for them to live in the code, like the above models.
It might even help to get the same cpus across the various vendors,
otherwise we might end up with IBM's core2duo, RH's core2duo, Suse's,..
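
To make the config side of this discussion concrete, here is a rough
Python sketch of reading one entry in the [cpu-def] layout proposed
earlier in the thread. The values are copied from the coreduo example
there; the parser itself, the dict it returns and the handling of the
feature string are assumptions for illustration, not anything QEMU does.

```python
import configparser

# Text in the proposed [cpu-def] layout, values taken from the coreduo
# example earlier in the thread (model_id and xlevel left out).
EXAMPLE = """
[cpu-def]
name=coreduo
level=10
family=6
model=14
stepping=8
features=+vme+mtrr+clflush+mca+sse3+monitor
"""


def parse_cpu_def(text):
    """Parse a single [cpu-def] section into a plain dict."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["cpu-def"]
    cpu = {"name": section["name"]}
    for key in ("level", "family", "model", "stepping"):
        cpu[key] = int(section[key], 0)
    # "+a+b+c" -> ["a", "b", "c"]
    cpu["features"] = [f for f in section["features"].split("+") if f]
    return cpu
```

Note that configparser rejects a file with two sections of the same
name, so a file carrying several cpu definitions would need a different
section naming or a different parser.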



everyone in all circumstances.

Regards,

Anthony Liguori






Re: PCIe device pass-through - No IOMMU, Failed to deassign device error

2010-01-25 Thread Chris Wright
* Brian Jackson (i...@theiggy.com) wrote:
 On Saturday 23 January 2010 05:20:49 Yigal Korman wrote:
  I'm trying to pass a second video card to a Windows 7 virtual machine
  with KVM, and I get the following error:
 
 KVM doesn't support assigning graphics cards to VMs yet. There are people 
 working on it afaik, but I don't know the progress.

Right, so even if you figure out the issue below, there's still the issue w/
the actual graphics device functioning properly in the guest.

  r...@ubuntu-desktop:~# kvm -cpu qemu64 -hda /dev/sdb -cdrom /dev/cdrom
  -boot order=dc -m 2000 -usb -name Win7x64 -enable-kvm -device
  pci-assign,host=80:00.0
  No IOMMU found.  Unable to assign device (null)
  Failed to deassign device (null) : Invalid argument
  Error initializing device pci-assign
  
  Now it look like I don't have VT-d, but I do, here is my cpuinfo:
  

VT-d is a chipset feature, not a CPU feature.

<snip>
  I've enabled vt-d in the BIOS, and added this parameter to the kernel:
  intel_iommu=on

Again, VT (or VT-x) isn't the same as VT-d.  So to be sure, you can
grep dmesg for DMAR and IOMMU to verify that the chipset actually has
VT-d support, that it's enabled, and that it's not broken (there are
quite a few broken BIOSes out there that cause the IOMMU to be unusable).
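
If you want to do the same check from a script, a minimal Python sketch
could look like the following. It just filters text that you feed it
(for example the saved output of dmesg); the function name is made up,
and what the matching lines actually say still has to be read by a human.

```python
import re

# Same idea as grepping dmesg for DMAR and IOMMU.
IOMMU_PATTERN = re.compile(r"DMAR|IOMMU")


def iommu_lines(dmesg_text):
    """Return the lines of a dmesg dump that mention DMAR or IOMMU."""
    return [line for line in dmesg_text.splitlines()
            if IOMMU_PATTERN.search(line)]
```

An empty result is not proof that the hardware lacks VT-d; it may only
mean the running kernel was not built or booted with IOMMU support.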

thanks,
-chris


Re: [PATCH] Seabios - read e820 reserve from qemu_cfg

2010-01-25 Thread Kevin O'Connor
On Mon, Jan 25, 2010 at 05:46:42PM +0100, Jes Sorensen wrote:
 Hi,
 
 Right now KVM/QEMU relies on hard coded values in Seabios for the
 reserved area for the TSS pages and the EPT page.
 
 I'd like to suggest we change this to pass the value from QEMU via
 qemu-cfg making it possible to move it around dynamically in the future.
 
 Attached is a patch to Seabios for this, which defaults to the current
 hard coded value if no value is provided by qemu-cfg. We can remove
 the backwards compatibility later.
 
 I'll post the QEMU patches for upstream QEMU and QEMU-KVM in a minute.
 
 Comments most welcome!

I like the idea, but I think it would be better to pass a list of e820
entries explicitly.  That is, pass an array of:

struct e820entry {
u64 start;
u64 size;
u32 type;
};

where 'type' uses the standard e820 definitions.  That way, SeaBIOS
can just walk through the list and add them to its e820 map.  BTW,
this is what SeaBIOS does when running under coreboot (coreboot passes
a memory map as part of the coreboot tables).  SeaBIOS is already
smart enough to not use any high memory addresses marked as reserved.
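
For illustration, the array-of-entries idea can be sketched in Python
with the struct module. The field order follows the struct above; the
little-endian, unpadded layout and the helper names are assumptions made
for this sketch, since the wire format is not settled here.

```python
import struct

# Standard e820 type codes for the two most common kinds of entry.
E820_RAM = 1
E820_RESERVED = 2

# u64 start, u64 size, u32 type; little-endian and unpadded
# (an assumption, see above).
E820_ENTRY = struct.Struct("<QQI")


def pack_e820_entries(entries):
    """Pack (start, size, type) tuples into one contiguous byte string."""
    return b"".join(E820_ENTRY.pack(start, size, etype)
                    for start, size, etype in entries)


def unpack_e820_entries(blob):
    """Inverse of pack_e820_entries(): return a list of (start, size, type)."""
    return [E820_ENTRY.unpack_from(blob, offset)
            for offset in range(0, len(blob), E820_ENTRY.size)]
```

With this layout each entry is 20 bytes, so the receiving side can walk
the list by stepping through the blob in fixed-size strides.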

-Kevin


Re: PCIe device pass-through - No IOMMU, Failed to deassign device error

2010-01-25 Thread Kenni Lund
2010/1/26 Chris Wright chr...@sous-sol.org:

 Again, VT (or VT-x) isn't the same as VT-d.  So to be sure, you can
 grep dmesg for DMAR and IOMMU to verify that the chipset actually has
 VT-d support, that it's enabled, and that it's not broken (there are
 quite a few broken BIOS out there that case the IOMMU to be unusable).

dmesg | egrep '(DMAR|IOMMU)'
This information should _really_ be added to the wiki at
http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM

Knowing this, it's quite easy for a user to determine if his system
has VT-d support, _before_ following the guide, compiling own kernel,
setting up qemu-kvm, unbinding and rebinding PCI devices, just to have
qemu-kvm 0.12.2 tell him that the system has no IOMMU (much better
than 0.12.1, agreed, but it's a bit late in the process to find out
:))

Can someone with write permissions to the wiki please add this?

Best Regards
Kenni Lund


Re: PCIe device pass-through - No IOMMU, Failed to deassign device error

2010-01-25 Thread Kenni Lund
2010/1/26 Kenni Lund ke...@kelu.dk:
 2010/1/26 Chris Wright chr...@sous-sol.org:

 Again, VT (or VT-x) isn't the same as VT-d.  So to be sure, you can
 grep dmesg for DMAR and IOMMU to verify that the chipset actually has
 VT-d support, that it's enabled, and that it's not broken (there are
 quite a few broken BIOS out there that case the IOMMU to be unusable).

 dmesg | egrep (DMAR|IOMMU)
 This information should _really_ be added to the wiki at
 http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM

 Knowing this, it's quite easy for a user to determine if his system
 has VT-d support, _before_ following the guide, compiling own kernel,
 setting up qemu-kvm, unbinding and rebinding PCI devices, just to have
 qemu-kvm 0.12.2 tell him that the system has no IOMMU (much better
 than 0.12.1, agreed, but it's a bit late in the process to find out
 :))

Doh, I didn't consider that the kernel compilation is probably needed
to get any output - nevertheless, I still think this should be added
to the wiki, even if that's the case. Perhaps with a short text describing
what you should look for.

Best Regards
Kenni Lund


Re: PCIe device pass-through - No IOMMU, Failed to deassign device error

2010-01-25 Thread Chris Wright
* Kenni Lund (ke...@kelu.dk) wrote:
 2010/1/26 Kenni Lund ke...@kelu.dk:
  2010/1/26 Chris Wright chr...@sous-sol.org:
 
  Again, VT (or VT-x) isn't the same as VT-d.  So to be sure, you can
  grep dmesg for DMAR and IOMMU to verify that the chipset actually has
  VT-d support, that it's enabled, and that it's not broken (there are
  quite a few broken BIOS out there that case the IOMMU to be unusable).
 
  dmesg | egrep (DMAR|IOMMU)
  This information should _really_ be added to the wiki at
  http://www.linux-kvm.org/page/How_to_assign_devices_with_VT-d_in_KVM
 
  Knowing this, it's quite easy for a user to determine if his system
  has VT-d support, _before_ following the guide, compiling own kernel,
  setting up qemu-kvm, unbinding and rebinding PCI devices, just to have
  qemu-kvm 0.12.2 tell him that the system has no IOMMU (much better
  than 0.12.1, agreed, but it's a bit late in the process to find out
  :))
 
 Doh, I didn't consider if the kernel compilation probably were needed
 to give any output - nevertheless, I still think this should be added
 to the wiki, even if it's the case. Perhaps a short text describing
 what you should look for.

Sure, I added a short snippet.

thanks,
-chris


Can KVM PassThrough specifically my PCI cards to fully-virt'd KVM Guests with my CPU? Yet?

2010-01-25 Thread Ben DJ
Hi,

I have a box with an AMD Phenom II X4 920 CPU

Reading http://www.linux-kvm.org/page/FAQ#What_do_I_need_to_use_KVM.3F,
I've verified with 'cat /proc/cpuinfo' that the CPU has the AMD-V
svm extension.

I'm specifically interested in whether or not this CPU's capabilities
will allow PCI PassThrough of hardware from the Host to the Guest.
I've read the KVM 'ToDo' & 'FAQ', and tbh am unclear if everything I
need is actually in KVM already.  It's just that I don't get all the
terminology yet.

I've read that VideoCards are still a no-go. NP for me, atm.

The cards I'm interested in are:

 -- a SiliconImage 3124 (sil24 module) based SATA card -- RAID
capable, but I'm only using it to attach drives, and doing the RAID
with Linux's 'md', and,
 -- A number of Gigabit NICs, including an Intel e1000 card.

I'm thoroughly confused as to whether or not I can PassThrough these
cards using this CPU, and/or if I need AMD-Vi/IOMMU in hardware.  I
can't figure out if I do or don't ... AMD's product sheets have me
baffled -- and I haven't figured out if /proc/cpuinfo etc 'shows' me
definitively.

So, let's just ask this:  *can* I do hardware PassThrough of these PCI
cards from KVM Host to a fully virtualized KVM guest?  and, if 'yes',
is there a specific/minimum kernel version I need?

Thanks!

BenDJ


[Autotest PATCH] KVM-test: Add a subtest 'qemu_img'

2010-01-25 Thread Yolkfull Chow
This is designed to test all subcommands of 'qemu-img'; however,
'commit' is not implemented so far.

* For the 'check' subcommand test, it will 'dd' to create a file of the
specified size and see whether checking it is supported. Then it converts the
file to the supported formats (qcow2 and raw so far) to see whether there is
any error after conversion.

* For the 'convert' subcommand test, it will convert to both 'qcow2' and 'raw'
from the format specified in the config file, and only check 'qcow2' after
conversion.

* For the 'snapshot' subcommand test, it will create two snapshots and list
them, and finally delete them if no errors are found.

* For the 'info' subcommand test, it simply gets the output for the specified
image file.

Signed-off-by: Yolkfull Chow yz...@redhat.com
---
 client/tests/kvm/tests/qemu_img.py |  155 
 client/tests/kvm/tests_base.cfg.sample |   36 
 2 files changed, 191 insertions(+), 0 deletions(-)
 create mode 100644 client/tests/kvm/tests/qemu_img.py

diff --git a/client/tests/kvm/tests/qemu_img.py b/client/tests/kvm/tests/qemu_img.py
new file mode 100644
index 000..1ae04f0
--- /dev/null
+++ b/client/tests/kvm/tests/qemu_img.py
@@ -0,0 +1,155 @@
+import os, logging, commands
+from autotest_lib.client.common_lib import error
+import kvm_vm
+
+
+def run_qemu_img(test, params, env):
+    """
+    `qemu-img' functions test:
+    1) Judge what subcommand is going to be tested
+    2) Run subcommand test
+
+    @param test: kvm test object
+    @param params: Dictionary with the test parameters
+    @param env: Dictionary with test environment.
+    """
+    cmd = params.get("qemu_img_binary")
+    subcommand = params.get("subcommand")
+    image_format = params.get("image_format")
+    image_name = kvm_vm.get_image_filename(params, test.bindir)
+
+    def check(img):
+        global cmd
+        cmd += " check %s" % img
+        logging.info("Checking image '%s'..." % img)
+        s, o = commands.getstatusoutput(cmd)
+        if not (s == 0 or "does not support checks" in o):
+            return (False, o)
+        return (True, "")
+
+    # Subcommand 'qemu-img check' test
+    # This tests will 'dd' to create a specified size file, and check it.
+    # Then convert it to supported image_format in each loop and check again.
+    def check_test():
+        size = params.get("dd_image_size")
+        test_image = params.get("dd_image_name")
+        create_image_cmd = params.get("create_image_cmd")
+        create_image_cmd = create_image_cmd % (test_image, size)
+        s, o = commands.getstatusoutput(create_image_cmd)
+        if s != 0:
+            raise error.TestError("Failed command: %s; Output is: %s" %
+                                  (create_image_cmd, o))
+        s, o = check(test_image)
+        if not s:
+            raise error.TestFail("Failed to check image '%s' with error: %s" %
+                                 (test_image, o))
+        for fmt in params.get("supported_image_formats").split():
+            output_image = test_image + ".%s" % fmt
+            convert(fmt, test_image, output_image)
+            s, o = check(output_image)
+            if not s:
+                raise error.TestFail("Check image '%s' got error: %s" %
+                                     (output_image, o))
+            commands.getoutput("rm -f %s" % output_image)
+        commands.getoutput("rm -f %s" % test_image)
+
+    #Subcommand 'qemu-img create' test
+    def create_test():
+        global cmd
+        cmd += " create"
+        if params.get("encrypted") == "yes":
+            cmd += " -e"
+        if params.get("base_image"):
+            cmd += " -F %s -b %s" % (params.get("base_image_format"),
+                                     params.get("base_image"))
+        format = params.get("image_format")
+        cmd += " -f %s" % format
+        image_name_test = os.path.join(test.bindir,
+                          params.get("image_name_test")) + '.' + format
+        cmd += " %s %s" % (image_name_test, params.get("image_size_test"))
+        s, o = commands.getstatusoutput(cmd)
+        if s != 0:
+            raise error.TestFail("Create image '%s' failed: %s" %
+                                 (image_name_test, o))
+        commands.getoutput("rm -f %s" % image_name_test)
+
+    def convert(output_format, image_name, output_filename,
+                format=None, compressed="no", encrypted="no"):
+        global cmd
+        cmd += " convert"
+        if compressed == "yes":
+            cmd += " -c"
+        if encrypted == "yes":
+            cmd += " -e"
+        if format:
+            cmd += " -f %s" % image_format
+        cmd += " -O %s" % params.get("dest_image_format")
+        cmd += " %s %s" % (image_name, output_filename)
+        s, o = commands.getstatusoutput(cmd)
+        if s != 0:
+            raise error.TestFail("Image converted failed; Command: %s;"
+                                 " Output is: %s" % (cmd, o))
+
+#Subcommand 'qemu-img convert' test
+def 

Re: Can KVM PassThrough specifically my PCI cards to fully-virt'd KVM Guests with my CPU? Yet?

2010-01-25 Thread Brian Jackson
On Monday 25 January 2010 21:11:12 Ben DJ wrote:
 Hi,
 
 I have a box with an AMD Phenom II X4 920 CPU
 
 Reading http://www.linux-kvm.org/page/FAQ#What_do_I_need_to_use_KVM.3F,
 I've verified with 'cat /proc/cpuinfo' that the CPU has the AMD-V
 svm extension.
 
 I'm specifically interested in whether or not this CPU's capabilities
 will allow PCI PassThrough of hardware from the Host to the Guest.
 I've read the KVM 'ToDo' & 'FAQ', and tbh am unclear if everything I
 need is actually in KVM already.  It's just that I don't get all the
 terminology yet.
 
 I've read that VideoCards are still a no-go. NP for me, atm.
 
 The cards I'm interested in are:
 
  -- a SiliconImage 3124 (sil24 module) based SATA card -- RAID
 capable, but I'm only using it to attach drives, and doing the RAID
 with Linux's 'md', and,
  -- A number of Gigabit NICs, including an Intel e1000 card.
 
 I'm thoroughly confused as to whether or not I can PassThrough these
 cards using this CPU, and/or if I need AMD-Vi/IOMMU in hardware.  I
 can't figure out if I do or don't ... AMD's product sheets have me
 baffled -- and I haven't figured out if /proc/cpuinfo etc 'shows' me
 definitively.

You do need iommu support in your system. Unfortunately there are very few AMD 
motherboards that have an iommu. Only 1 server level board I know of has one 
and is close to hitting the markets. So chances are you don't have one.
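
For anyone checking their own box, here is a minimal sketch (plain Python, not part of KVM; the function names are made up for illustration) of what the 'cat /proc/cpuinfo' check amounts to. Note that the svm or vmx flag only says the CPU can run hardware-virtualized guests. It says nothing about an IOMMU, which is what device assignment needs.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text.

    Only the first 'flags' line is used; all cores normally report the same set.
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_hw_virt(cpuinfo_text):
    """True if the CPU advertises AMD-V (svm) or Intel VT-x (vmx)."""
    return bool(cpu_flags(cpuinfo_text) & {"svm", "vmx"})


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or /proc not available
    print("hardware virtualization flag present:", has_hw_virt(text))
```

A 'True' here is necessary for KVM itself but does not answer the PassThrough question.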

 
 So, let's just ask this:  *can* I do hardware PassThrough of these PCI
 cards from KVM Host to a fully virtualized KVM guest?  and, if 'yes',
 is there a specific/minimum kernel version I need?
 
 Thanks!
 
 BenDJ


Re: WinXP virtual crashes on 0.12.1.2 but not 0.12.1.1

2010-01-25 Thread Avi Kivity

On 01/26/2010 12:25 AM, Mark Cave-Ayland wrote:

Avi Kivity wrote:

Unfortunately msr tracing fails to record some important 
information.  I just posted a patch to fix this.  Can you rerun from 
kvm.git branch msr-trace?  That contains the fix.


Done.

http://pastebin.com/m209b1f13

Hopefully this trace should give a better indication of when the 
problem occurs since I also found the on_reboot option in libvirt - 
hence on this trace the VM shuts down immediately instead of rebooting 
like the earlier trace.





Unfortunately, no such luck.  Apparently this is not msr/cpuid related - 
perhaps power management.  Can you enable the kvm_mmio and kvm_pio 
events?  Perhaps they will provide a clue.
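
In case it helps, a rough sketch of switching those two events on from a script. It assumes debugfs is mounted at /sys/kernel/debug and that the events live under the usual tracing/events/kvm/ directory; the helper names are hypothetical, and it has to run as root on the host.

```python
import os

# Assumption: debugfs is mounted in the conventional place.
TRACING_ROOT = "/sys/kernel/debug/tracing"


def event_enable_path(event, subsystem="kvm", root=TRACING_ROOT):
    """Path of the 'enable' control file for one trace event."""
    return os.path.join(root, "events", subsystem, event, "enable")


def enable_events(events, root=TRACING_ROOT):
    """Write '1' to each event's enable file and return the paths written."""
    written = []
    for event in events:
        path = event_enable_path(event, root=root)
        with open(path, "w") as f:
            f.write("1")
        written.append(path)
    return written


# Example (as root): enable_events(["kvm_mmio", "kvm_pio"])
```

The resulting trace can then be read from the 'trace' file in the same tracing directory.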


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: PCIe device pass-through - No IOMMU, Failed to deassign device error

2010-01-25 Thread Avi Kivity

On 01/26/2010 03:11 AM, Kenni Lund wrote:


Can someone with write permissions to the wiki please add this?
   


Everyone has write permissions, you just need an account.

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: Can KVM PassThrough specifically my PCI cards to fully-virt'd KVM Guests with my CPU? Yet?

2010-01-25 Thread Ben DJ
On Mon, Jan 25, 2010 at 9:44 PM, Brian Jackson i...@theiggy.com wrote:
 You do need iommu support in your system. Unfortunately there are very few AMD
 motherboards that have an iommu. Only 1 server level board I know of has one
 and is close to hitting the markets. So chances are you don't have one.

Well, rats.  Thx for a clear answer, though!

And, there's no iommu emulation in software for KVM that'd do it?

BenDJ


Re: [PATCH] QEMU - provide e820 reserve through qemu_cfg

2010-01-25 Thread Gleb Natapov
On Mon, Jan 25, 2010 at 06:13:35PM +0100, Jes Sorensen wrote:
 On 01/25/10 17:58, Alexander Graf wrote:
 Howdy. Congratulations to the new mail address - looks neat ;-).
 
 :-)
 
 Two comments:
 
 1) I don't see how passing a single region is any help. I'd rather like to 
 see a device-tree-like table structure.
 You'd get one variable for len of the table, one with the contents. So for a 
 universal reserved region specifier you'd get:
 
 u64 base    u64 len
 
 Then have len=2 and put data in the table:
 
 u64 base1    u64 len1    u64 base2    u64 len2
 
 That way we'd get 2 entries and the chance to enhance them later on. In 
 fact, it might even make sense to pass the whole table in such a form. That 
 way qemu generates all of the e820 tables and we can declare whatever we 
 want. Just add a type field in the table.
 
 I am fine with having QEMU build the e820 tables completely if there is
 a consensus to take that path.
 
QEMU can't build the e820 map completely. There are things it doesn't
know. Like how much memory ACPI tables take and where they are located.
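
To make the table idea concrete, here is a small sketch of a count-plus-entries blob with the type field Alexander suggests. The exact layout (little-endian u32 count, then packed u64 base, u64 len, u32 type per entry) and the function names are my assumptions for illustration, not what the patch does. The type values 1 (RAM) and 2 (reserved) are the standard e820 ones. The BIOS would still have to merge such entries with the regions only it knows about, such as the ACPI tables.

```python
import struct

# Assumed wire layout, little-endian with no padding.
COUNT = struct.Struct("<I")     # u32 number of entries
ENTRY = struct.Struct("<QQI")   # u64 base, u64 len, u32 type

E820_RAM = 1
E820_RESERVED = 2


def pack_table(entries):
    """Serialize a list of (base, length, type) tuples into one blob."""
    blob = COUNT.pack(len(entries))
    for base, length, kind in entries:
        blob += ENTRY.pack(base, length, kind)
    return blob


def unpack_table(blob):
    """Parse a blob produced by pack_table back into a list of tuples."""
    (count,) = COUNT.unpack_from(blob, 0)
    offset = COUNT.size
    entries = []
    for _ in range(count):
        entries.append(ENTRY.unpack_from(blob, offset))
        offset += ENTRY.size
    return entries
```

With a count up front, adding a second or third reserved region later does not change the interface, only the length of the blob.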

 2) Please inline patches. They showed up as attachments here, making them 
 really hard to comment on.
 
 Sorry Thunderbug doesn't do that well, but they should be attached as
 txt?
 
 Cheers,
 Jes
 

--
Gleb.


KVM call agenda for Jan 26

2010-01-25 Thread Chris Wright
Please send in any agenda items you are interested in covering.


Re: Can KVM PassThrough specifically my PCI cards to fully-virt'd KVM Guests with my CPU? Yet?

2010-01-25 Thread Brian Jackson
On Tuesday 26 January 2010 00:22:25 Ben DJ wrote:
 On Mon, Jan 25, 2010 at 9:44 PM, Brian Jackson i...@theiggy.com wrote:
  You do need iommu support in your system. Unfortunately there are very
  few AMD motherboards that have an iommu. Only 1 server level board I know
  of has one and is close to hitting the markets. So chances are you don't
  have one.
 
 Well, rats.  Thx for a clear answer, though!
 
 And, there's no iommu emulation in software for KVM that'd do it?

Nope. There was some when support was being developed, but it was never
merged, and I highly doubt the patches would apply anywhere near cleanly at
this point, given all the code churn qemu has had.


 
 BenDJ


RE: [PATCH] kvmppc/booke: Set ESR and DEAR when inject interrupt to guest

2010-01-25 Thread Liu Yu-B13201
 

 -Original Message-
 From: Alexander Graf [mailto:ag...@suse.de] 
 Sent: Friday, January 22, 2010 7:33 PM
 To: Liu Yu-B13201
 Cc: hol...@penguinppc.org; kvm-ppc@vger.kernel.org; 
 k...@vger.kernel.org
 Subject: Re: [PATCH] kvmppc/booke: Set ESR and DEAR when 
 inject interrupt to guest
 
 
 On 22.01.2010, at 12:27, Liu Yu-B13201 wrote:
 
  
  
  -Original Message-
  From: kvm-ppc-ow...@vger.kernel.org 
  [mailto:kvm-ppc-ow...@vger.kernel.org] On Behalf Of Alexander Graf
  Sent: Friday, January 22, 2010 7:13 PM
  To: Liu Yu-B13201
  Cc: hol...@penguinppc.org; kvm-ppc@vger.kernel.org; 
  k...@vger.kernel.org
  Subject: Re: [PATCH] kvmppc/booke: Set ESR and DEAR when 
  inject interrupt to guest
  
  
  On 22.01.2010, at 11:54, Liu Yu wrote:
  
  Old method prematurely sets ESR and DEAR.
  Move this part after we decide to inject interrupt,
  making it behave more like hardware.
  
  Signed-off-by: Liu Yu yu@freescale.com
  ---
  arch/powerpc/kvm/booke.c   |   24 ++--
  arch/powerpc/kvm/emulate.c |2 --
  2 files changed, 14 insertions(+), 12 deletions(-)
  
  @@ -286,15 +295,12 @@ int kvmppc_handle_exit(struct kvm_run 
  *run, struct kvm_vcpu *vcpu,
break;
  
case BOOKE_INTERRUPT_DATA_STORAGE:
  - vcpu->arch.dear = vcpu->arch.fault_dear;
  - vcpu->arch.esr = vcpu->arch.fault_esr;
kvmppc_booke_queue_irqprio(vcpu, 
  BOOKE_IRQPRIO_DATA_STORAGE);
  
  kvmppc_booke_queue_data_storage(vcpu, vcpu->arch.fault_esr, 
  vcpu->arch.fault_dear);
  
kvmppc_account_exit(vcpu, DSI_EXITS);
r = RESUME_GUEST;
break;
  
case BOOKE_INTERRUPT_INST_STORAGE:
  - vcpu->arch.esr = vcpu->arch.fault_esr;
kvmppc_booke_queue_irqprio(vcpu, 
  BOOKE_IRQPRIO_INST_STORAGE);
  
  kvmppc_booke_queue_inst_storage(vcpu, vcpu->arch.fault_esr);
  
  
  Not sure if this is redundant, as we already have fault_esr.
  Or should we ignore what the hardware reports and create a new ESR for the guest?
 
 On Book3S I take the SRR1 we get from the host as 
 inspiration of what to pass to the guest as SRR1. I think 
 we should definitely be able to inject a fault that we didn't 
 get in that exact form from the exit path.
 
 I'm also not sure if something could clobber fault_esr if 
 another interrupt takes precedence. Say a #MC.

Not as far as I know.
And if it can, the clobber could just as well happen before we copy it.
Hollis, what do you think we should do here?
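
To make sure we are talking about the same thing, here is a toy model of the ordering in question. It is plain Python, not kernel code, and the class and method names are made up. The fault values are stashed with the queued interrupt, and the guest-visible ESR/DEAR are written only when the interrupt is actually delivered.

```python
class ToyVcpu:
    """Toy stand-in for a vcpu: guest-visible ESR/DEAR plus a pending queue."""

    def __init__(self):
        self.esr = 0
        self.dear = 0
        self.pending = []  # queued (name, esr, dear) tuples, oldest first

    def queue_data_storage(self, esr, dear):
        # Remember the fault values, but do not touch guest state yet.
        self.pending.append(("data_storage", esr, dear))

    def deliver_pending(self):
        # Guest-visible registers change only at injection time.
        if not self.pending:
            return None
        name, esr, dear = self.pending.pop(0)
        self.esr = esr
        self.dear = dear
        return name


vcpu = ToyVcpu()
vcpu.queue_data_storage(esr=0x00800000, dear=0xC0001000)
# Still the old values here; they change only on deliver_pending().
```

Because the values travel with the queued interrupt in this model, a later fault cannot overwrite them before delivery.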


--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html