Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: The instruction at 0x28 is enabling paging, next insn fetch faults, so the paging structures must be incorrect. Questions: - what is the u64 at cr3? (call it pte4) - what is the u64 at (pte4 & ~0xfff)? (call it pte3) - what is the u64 at (pte3 & ~0xfff)? (pte2) - what is the
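
A minimal sketch of the walk Avi is requesting, where readq_gpa() is a hypothetical helper that reads a u64 at a guest-physical address (not code from the thread):

    /* Follow the 4-level long-mode walk described above; the low 12
     * bits of each entry are flag bits, so mask them off to get the
     * physical address of the next table. Entry 0 of each table is
     * what the questions above ask for. */
    u64 pte4 = readq_gpa(cr3 & ~0xfffULL);   /* PML4 entry 0 */
    u64 pte3 = readq_gpa(pte4 & ~0xfffULL);  /* PDPT entry 0 */
    u64 pte2 = readq_gpa(pte3 & ~0xfffULL);  /* PD entry 0 */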

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 09:30 AM, Pekka Enberg wrote: Avi Kivity wrote: The instruction at 0x28 is enabling paging, next insn fetch faults, so the paging structures must be incorrect. Questions: - what is the u64 at cr3? (call it pte4) - what is the u64 at (pte4 & ~0xfff)? (call it pte3) - what is the

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: Sorry for the delay. Here you go: Page Tables: pte4: 02403007 pte3: 02404007 pte2: 0183 These are all correct. The only thing I can think of is that MAXPHYADDR is a small value. And indeed, if I run it on an EPT-capable machine (which does

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 12:48 PM, Pekka Enberg wrote: So the guest is in long mode, happily trying to access pci config space. MAXPHYADDR comes from cpuid 0x80000008.eax[0:7]. Typical values are 36-40 (number of physical address bits supported by the processor). What value does your guest see? Ah,
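
For reference, a small user-space probe of that leaf (a sketch using GCC's cpuid.h, not code from the thread):

    #include <cpuid.h>
    #include <stdio.h>

    int main(void)
    {
            unsigned int eax, ebx, ecx, edx;

            /* Leaf 0x80000008: EAX[7:0] = physical address bits
             * (MAXPHYADDR). __get_cpuid returns 0 if the leaf is
             * not supported by the CPU. */
            if (__get_cpuid(0x80000008, &eax, &ebx, &ecx, &edx))
                    printf("MAXPHYADDR = %u\n", eax & 0xff);
            else
                    printf("leaf 0x80000008 not supported\n");
            return 0;
    }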

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: On 04/11/2010 12:48 PM, Pekka Enberg wrote: So the guest is in long mode, happily trying to access pci config space. MAXPHYADDR comes from cpuid 0x80000008.eax[0:7]. Typical values are 36-40 (number of physical address bits supported by the processor). What value does your

[PATCH 1/2] KVM: x86 emulator: Don't overwrite decode cache

2010-04-11 Thread Avi Kivity
Currently, if an instruction spans a page boundary, fetching the second half overwrites the first half. This prevents us from tracing the full instruction opcodes. Fix by appending the second half to the first. Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/emulate.c |
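
The idea behind the fix, sketched under assumed names (fetch_guest() is a hypothetical single-byte guest read; this is not the literal patch):

    /* Sketch of the append-instead-of-overwrite idea: newly fetched
     * bytes land at the current end of the cache, so the first half
     * of a page-crossing instruction is preserved for tracing. */
    struct fetch_cache {
            u8 data[15];            /* max x86 instruction length */
            unsigned long start;    /* eip of data[0] */
            unsigned long end;      /* eip one past last cached byte */
    };

    static int fetch_byte(struct fetch_cache *fc, unsigned long eip, u8 *dest)
    {
            if (eip == fc->end) {
                    unsigned cur = fc->end - fc->start;

                    if (fetch_guest(eip, fc->data + cur))
                            return -1;
                    fc->end++;      /* append, don't restart at 0 */
            }
            *dest = fc->data[eip - fc->start];
            return 0;
    }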

[PATCH v2 0/2] Trace emulated instructions

2010-04-11 Thread Avi Kivity
Add a trace of instruction emulation into ftrace. This can help analyze performance issues, or, in the case of failed emulation, identify the missing instructions. v2: - trace all emulation starts - add missing statistic increment on failure (long term we need to get rid of those

[PATCH 2/2] KVM: Trace emulated instructions

2010-04-11 Thread Avi Kivity
Log emulated instructions in ftrace, especially if they failed. Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/trace.h | 86 ++ arch/x86/kvm/x86.c |4 ++ 2 files changed, 90 insertions(+), 0 deletions(-) diff --git
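
A minimal sketch of such a tracepoint; the event name and fields here are assumptions, not the patch's actual definition:

    #include <linux/tracepoint.h>

    /* Hedged sketch of an instruction-emulation tracepoint; the real
     * patch's event likely records more (opcode bytes, mode, etc.). */
    TRACE_EVENT(kvm_emulate_insn_sketch,
            TP_PROTO(unsigned long rip, int failed),
            TP_ARGS(rip, failed),
            TP_STRUCT__entry(
                    __field(unsigned long, rip)
                    __field(int, failed)
            ),
            TP_fast_assign(
                    __entry->rip = rip;
                    __entry->failed = failed;
            ),
            TP_printk("rip 0x%lx failed %d", __entry->rip, __entry->failed)
    );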

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 01:02 PM, Pekka Enberg wrote: It should work without 0x80000008 set up - failure should happen only if it is set up incorrectly: int cpuid_maxphyaddr(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0); if (best)
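
Completing the quoted function from the snippet (a reconstruction; the 36-bit fallback is the conventional default when the leaf is absent):

    int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
    {
            struct kvm_cpuid_entry2 *best;

            best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
            if (best)
                    return best->eax & 0xff;  /* EAX[7:0] = MAXPHYADDR */
            return 36;                        /* legacy default */
    }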

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: On 04/11/2010 01:02 PM, Pekka Enberg wrote: It should work without 0x80000008 set up - failure should happen only if it is set up incorrectly: int cpuid_maxphyaddr(struct kvm_vcpu *vcpu) { struct kvm_cpuid_entry2 *best; best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 02:52 PM, Pekka Enberg wrote: Do you have a function 8, though? Looks like a bug in kvm may confuse the two. Yeah, the host has function 8. I'm more than happy to test patches to fix the problem. Coming up after a quick git blame to see if I can see how the bug was

Re: Problem with KVM guest switching to x86 long mode

2010-04-11 Thread Avi Kivity
On 04/11/2010 03:02 PM, Avi Kivity wrote: On 04/11/2010 02:52 PM, Pekka Enberg wrote: Do you have a function 8, though? Looks like a bug in kvm may confuse the two. Yeah, the host has function 8. I'm more than happy to test patches to fix the problem. Coming up after a quick git blame to

[PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/x86.c |4 1 files changed, 4
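
A sketch of the fix as described in the changelog: consult leaf 0x80000000 for the maximum supported extended leaf before trusting 0x80000008 (consistent with the description, not the verbatim diff):

    int cpuid_maxphyaddr(struct kvm_vcpu *vcpu)
    {
            struct kvm_cpuid_entry2 *best;

            /* Check that extended leaves reach 0x80000008 at all. */
            best = kvm_find_cpuid_entry(vcpu, 0x80000000, 0);
            if (!best || best->eax < 0x80000008)
                    goto not_found;
            best = kvm_find_cpuid_entry(vcpu, 0x80000008, 0);
            if (best)
                    return best->eax & 0xff;
    not_found:
            return 36;
    }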

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 03:33 PM, Avi Kivity wrote: MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi ^ += Reported-by: (looking forward to Tested-by: too)

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi Signed-off-by: Avi Kivity a...@redhat.com --- arch/x86/kvm/x86.c |4 1

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 04:32 PM, Pekka Enberg wrote: Avi Kivity wrote: MAXPHYADDR is derived from cpuid 0x80000008, but when that isn't present, we get some random value. Fix by checking first that cpuid 0x80000008 is supported. Pekka Enberg penb...@cs.helsinki.fi Signed-off-by: Avi Kivity

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an off-by-one bug in my userspace tool... So the CPU really does support 0x80000008 and I'm just an idiot. :-)

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Pekka Enberg
Pekka Enberg wrote: Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an off-by-one bug in my userspace tool... So the CPU really does support 0x80000008 and I'm just an idiot. :-)

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 04:45 PM, Pekka Enberg wrote: Pekka Enberg wrote: Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an off-by-one bug in my userspace tool... So the CPU really does

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: On 04/11/2010 04:45 PM, Pekka Enberg wrote: Pekka Enberg wrote: Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an off-by-one bug in my userspace tool... So the

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Avi Kivity
On 04/11/2010 04:53 PM, Pekka Enberg wrote: Avi Kivity wrote: On 04/11/2010 04:45 PM, Pekka Enberg wrote: Pekka Enberg wrote: Avi Kivity wrote: Hmm, doesn't seem to work here. I still get that triple fault in the guest. Can you add a printk to see what value is returned and why? Argh, it's an

Re: [PATCH] KVM: Fix MAXPHYADDR calculation when cpuid does not support it

2010-04-11 Thread Pekka Enberg
Avi Kivity wrote: OK, then it's a bug of my own doing and we don't need to do anything in the kernel. I think the patch is nevertheless correct, not sure why it worked so far. Yes, agreed. I'm guessing most 64-bit CPUs support 0x80000008 and qemu does the right thing so the bug doesn't

Re: Networkconfiguration with KVM

2010-04-11 Thread Dan Johansson
On Monday 05 April 2010 22.04:17 Held Bernhard wrote: Hi Dan! This should be done over the host-eth3 interface and I have set up the br-eth3 and qtap3 the same way as with the eth1/br-eth1/qtap1 with one difference - the br-eth3 interface is set up without an IP. When doing traffic from

Re: [PATCH] vhost: Make it more scalable by creating a vhost thread per device.

2010-04-11 Thread Michael S. Tsirkin
On Thu, Apr 08, 2010 at 05:05:42PM -0700, Sridhar Samudrala wrote: On Mon, 2010-04-05 at 10:35 -0700, Sridhar Samudrala wrote: On Sun, 2010-04-04 at 14:14 +0300, Michael S. Tsirkin wrote: On Fri, Apr 02, 2010 at 10:31:20AM -0700, Sridhar Samudrala wrote: Make vhost scalable by creating a

Re: [PATCH RFC 1/5] KVM: introduce a set_bit function for bitmaps in user space

2010-04-11 Thread Avi Kivity
On 04/09/2010 12:30 PM, Takuya Yoshikawa wrote: This work was initially suggested by Avi Kivity for moving the dirty bitmaps used by KVM to user space: This makes it possible to manipulate the bitmaps from qemu without copying from KVM. Note: We are now brushing up this code before sending to
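
A sketch of what such a user-space set_bit helper could look like (non-atomic, byte-granular; the name and exact semantics are assumptions, not taken from the RFC):

    /* Hedged sketch: set bit nr in a bitmap living in user memory.
     * Byte-granular accesses sidestep word-alignment concerns. */
    static int set_bit_user_sketch(int nr, void __user *addr)
    {
            u8 __user *p = (u8 __user *)addr + nr / 8;
            u8 val;

            if (get_user(val, p))
                    return -EFAULT;
            val |= 1U << (nr % 8);
            if (put_user(val, p))
                    return -EFAULT;
            return 0;
    }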

Re: [PATCH RFC 2/5] KVM: use a rapper function to calculate the sizes of dirty bitmaps

2010-04-11 Thread Avi Kivity
On 04/09/2010 12:32 PM, Takuya Yoshikawa wrote: We will use this later in other parts. s/rapper/wrapper/... +static inline int kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot) +{ + return ALIGN(memslot->npages, BITS_PER_LONG) / 8; +} + 'int' may overflow. struct
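
The overflow-safe variant Avi is hinting at, sketched: widen the return type so a very large memslot cannot truncate the byte count:

    /* npages is an unsigned long, so for huge slots the byte count can
     * exceed INT_MAX; return the full-width type instead. */
    static inline unsigned long kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot)
    {
            return ALIGN(memslot->npages, BITS_PER_LONG) / 8;
    }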

Re: [PATCH RFC 3/5] KVM: Use rapper functions to create and destroy dirty bitmaps

2010-04-11 Thread Avi Kivity
On 04/09/2010 12:34 PM, Takuya Yoshikawa wrote: For x86, we will change the allocation and free parts to do_mmap() and do_munmap(). This patch makes it cleaner. Should be done for all architectures. I don't want different ways of creating dirty bitmaps for different architectures. --

Re: [PATCH RFC 4/5] KVM: add new members to the memory slot for double buffering of bitmaps

2010-04-11 Thread Avi Kivity
On 04/09/2010 12:35 PM, Takuya Yoshikawa wrote: Currently, x86 vmalloc()s a dirty bitmap every time we switch to the next dirty bitmap. To avoid this, we use the double buffering technique: we also move the bitmaps to userspace, so that extra bitmaps will not use the precious kernel

Re: [PATCH RFC 5/5] KVM: This is the main part of the moving dirty bitmaps to user space

2010-04-11 Thread Avi Kivity
On 04/09/2010 12:38 PM, Takuya Yoshikawa wrote: By this patch, bitmap allocation is replaced with do_mmap() and bitmap manipulation is replaced with *_user() functions. Note that this does not change the APIs between kernel and user space. To get more advantage from this hack, we need to add a

[PATCH] [qemu-kvm/stable] fix CPUID vendor override

2010-04-11 Thread Andre Przywara
The meaning of vendor_override is actually the opposite of how it is currently used :-( This fix reverts the workaround 4dad7ff3 and replaces it with the correct version. Fix it to allow KVM to export the non-native CPUID vendor if explicitly requested by the user. Signed-off-by: Andre Przywara

[PATCH] [qemu-kvm] fix CPUID vendor override

2010-04-11 Thread Andre Przywara
The meaning of vendor_override is actually the opposite of how it is currently used :-( This fix reverts the workaround 4dad7ff3 and replaces it with the correct version. Fix it to allow KVM to export the non-native CPUID vendor if explicitly requested by the user. Signed-off-by: Andre Przywara

[PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Andre Przywara
On SVM we set the instruction length of skipped instructions to hard-coded, well-known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or AthlonII) have an explicit NEXTRIP field in the VMCB

Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf
On 11.04.2010, at 23:07, Andre Przywara wrote: On SVM we set the instruction length of skipped instructions to hard-coded, well known values, which could be wrong when (bogus, but valid) prefixes (REX, segment override) are used. Newer AMD processors (Fam10h 45nm and better, aka. PhenomII or

Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf
On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if (svm->vmcb->control.next_rip != 0) In
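
The fallback logic under discussion, sketched (next_rip == 0 is how the hardware signals that the field was not filled in; not the literal patch):

    /* Prefer the hardware-provided NEXTRIP when present, otherwise
     * advance by the software-decoded instruction length. */
    static u64 svm_next_rip_sketch(struct vcpu_svm *svm, u64 rip, int insn_len)
    {
            if (svm->vmcb->control.next_rip != 0)
                    return svm->vmcb->control.next_rip;
            return rip + insn_len;
    }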

Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Andre Przywara
Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu *vcpu, int add) { if

Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf
On 11.04.2010, at 23:51, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64 svm_next_rip(struct kvm_vcpu

Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Andre Przywara
Alexander Graf wrote: On 11.04.2010, at 23:51, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb if the CPU supports it */ static u64

Re: [PATCH] svm: implement NEXTRIPsave SVM feature

2010-04-11 Thread Alexander Graf
On 12.04.2010, at 00:13, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:51, Andre Przywara wrote: Alexander Graf wrote: On 11.04.2010, at 23:40, Alexander Graf wrote: /* Either adds offset n to the instruction counter or takes the next instruction pointer from the vmcb

Re: [Qemu-devel] [GSoC 2010] Pass-through filesystem support.

2010-04-11 Thread Jamie Lokier
Javier Guerra Giraldez wrote: On Sat, Apr 10, 2010 at 7:42 AM, Mohammed Gamal m.gamal...@gmail.com wrote: On Sat, Apr 10, 2010 at 2:12 PM, Jamie Lokier ja...@shareable.org wrote: To throw a spanner in, the most widely supported filesystem across operating systems is probably NFS, version 2

Re: [PATCH RFC 1/5] KVM: introduce a set_bit function for bitmaps in user space

2010-04-11 Thread Takuya Yoshikawa
(2010/04/12 2:08), Avi Kivity wrote: On 04/09/2010 12:30 PM, Takuya Yoshikawa wrote: This work is initially suggested by Avi Kivity for moving the dirty bitmaps used by KVM to user space: This makes it possible to manipulate the bitmaps from qemu without copying from KVM. Note: We are now

[PATCH] KVM: remove unused code in kvm_coalesced_mmio_init()

2010-04-11 Thread wzt . wzt
ret is already set to -ENOMEM earlier, so the second ret = -ENOMEM; can be removed. Signed-off-by: Zhitong Wang zhitong.wan...@alibaba-inc.com --- virt/kvm/coalesced_mmio.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/virt/kvm/coalesced_mmio.c

Re: [PATCH RFC 2/5] KVM: use a rapper function to calculate the sizes of dirty bitmaps

2010-04-11 Thread Takuya Yoshikawa
(2010/04/12 2:12), Avi Kivity wrote: On 04/09/2010 12:32 PM, Takuya Yoshikawa wrote: We will use this later in other parts. s/rapper/wrapper/... Oh, my poor English, sorry. +static inline int kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot) +{ + return ALIGN(memslot->npages,

[ kvm-Bugs-2976863 ] 32PAE Windows guest blue screen when booting with apci on

2010-04-11 Thread SourceForge.net
Bugs item #2976863, was opened at 2010-03-26 14:44 Message generated for change (Comment added) made by haoxudong You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2976863&group_id=180599 Please note that this message will contain a full copy of the comment

[PATCH] KVM: Enhance the coalesced_mmio_write() parameter to avoid stack buffer overflow

2010-04-11 Thread wzt . wzt
coalesced_mmio_write() does not check the len value; if len is negative, memcpy(ring->coalesced_mmio[ring->last].data, val, len); will cause a stack buffer overflow. Signed-off-by: Zhitong Wang zhitong.wan...@alibaba-inc.com --- virt/kvm/coalesced_mmio.c |4 1 files changed, 4 insertions(+),
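
A sketch of the missing validation (the actual patch may differ): bound len by the entry's fixed-size data buffer before the memcpy, since a negative len wraps to a huge size_t:

    static int coalesced_mmio_write_sketch(struct kvm_coalesced_mmio_ring *ring,
                                           int len, const void *val)
    {
            struct kvm_coalesced_mmio *ent = &ring->coalesced_mmio[ring->last];

            /* Reject out-of-range lengths before copying into the
             * fixed-size data buffer. */
            if (len < 0 || len > sizeof(ent->data))
                    return -EINVAL;
            memcpy(ent->data, val, len);
            return 0;
    }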

Re: [PATCH RFC 3/5] KVM: Use rapper functions to create and destroy dirty bitmaps

2010-04-11 Thread Takuya Yoshikawa
(2010/04/12 2:13), Avi Kivity wrote: On 04/09/2010 12:34 PM, Takuya Yoshikawa wrote: For x86, we will change the allocation and free parts to do_mmap() and do_munmap(). This patch makes it cleaner. Should be done for all architectures. I don't want different ways of creating dirty bitmaps

RE: VM performance issue in KVM guests.

2010-04-11 Thread Zhang, Xiantao
Avi Kivity wrote: (copying lkml and some scheduler folk) On 04/10/2010 11:16 AM, Zhang, Xiantao wrote: Hi, all We are working on the scalability work for KVM guests, and found one big issue in the Linux scheduler that may impact guest performance and scalability a lot for some

Re: [PATCH RFC 4/5] KVM: add new members to the memory slot for double buffering of bitmaps

2010-04-11 Thread Takuya Yoshikawa
(2010/04/12 2:15), Avi Kivity wrote: On 04/09/2010 12:35 PM, Takuya Yoshikawa wrote: Currently, x86 vmalloc()s a dirty bitmap every time we switch to the next dirty bitmap. To avoid this, we use the double buffering technique: we also move the bitmaps to userspace, so that extra bitmaps

Re: [PATCH RFC 5/5] KVM: This is the main part of the moving dirty bitmaps to user space

2010-04-11 Thread Takuya Yoshikawa
(2010/04/12 2:21), Avi Kivity wrote: On 04/09/2010 12:38 PM, Takuya Yoshikawa wrote: By this patch, bitmap allocation is replaced with do_mmap() and bitmap manipulation is replaced with *_user() functions. Note that this does not change the APIs between kernel and user space. To get more

latest git - main thread spinning

2010-04-11 Thread David S. Ahern
With the latest qemu-kvm.git (fresh pull today, 11-April-2010) the main qemu thread is spinning. It looks like the recent sync with qemu.git is the culprit -- specifically, d6f4ade214a9f74dca9495b83a24ff9c113e4f9a from Paolo on March 10 changed the semantics of main_loop_wait from a timeout value