Re: [Qemu-devel] [PATCH 1/2] RESEND: Add kvm_set_ioeventfd_mmio_long definition for non-KVM systems
Thanks, applied.

On Sat, Aug 14, 2010 at 11:47 PM, Cam Macdonell c...@cs.ualberta.ca wrote:
> Signed-off-by: Cam Macdonell c...@cs.ualberta.ca
> ---
>  kvm-stub.c |    5 +
>  1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/kvm-stub.c b/kvm-stub.c
> index 3378bd3..d45f9fa 100644
> --- a/kvm-stub.c
> +++ b/kvm-stub.c
> @@ -136,3 +136,8 @@ int kvm_set_ioeventfd_pio_word(int fd, uint16_t addr, uint16_t val, bool assign)
>  {
>      return -ENOSYS;
>  }
> +
> +int kvm_set_ioeventfd_mmio_long(int fd, uint32_t adr, uint32_t val, bool assign)
> +{
> +    return -ENOSYS;
> +}
> --
> 1.6.2.5

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
2010/8/15 Gleb Natapov g...@redhat.com: On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote: On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote: On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote: If emulation fails due to the instruction being unemulated. Return immediately instead of restarting the instruction and infinitely trying to execute it. This is already handled correctly as far as I can see. Sometimes instruction should be retried and reexecute_instruction() checks for that case. If instruction emulation fails in big real mode re-executing instruction will be useless though, so what should be done is to make reexecute_instruction() return false if vcpu is in big real mode and cpu relies on emulation to handle it. We don't have a separate mode for big real mode. The emulation modes we have are real and vm86 That doesn't makes the patch right. So we will have to figure something out. True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)? 
Signed-off-by: Mohammed Gamal m.gamal...@gmail.com --- arch/x86/kvm/x86.c | 6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 416aa0e..a31db44 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu, } ++vcpu-stat.insn_emulation; + if (r == X86EMUL_UNHANDLEABLE) + return handle_emulation_failure(vcpu); + if (r) { if (reexecute_instruction(vcpu, cr2)) return EMULATE_DONE; @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu, restart: r = x86_emulate_insn(vcpu-arch.emulate_ctxt); + if (r == X86EMUL_UNHANDLEABLE) + return handle_emulation_failure(vcpu); + if (r) { /* emulation failed */ if (reexecute_instruction(vcpu, cr2)) return EMULATE_DONE; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [qemu-kvm] build fail on i386 RHEL5u4
On 08/11/2010 04:49 AM, Hao, Xudong wrote:
> Hi,
>
> Recently I built qemu-kvm on 32-bit RHEL5u4/RHEL5u5, and the build fails in function vhost_dev_sync_region. A RHEL5u1 system builds fine. Has anyone met a similar issue?
>
> qemu-kvm commit: 59d71ddb432db04b57ee2658ce50a3e35d7db97e
>
> build error:
> ...
>   CC    x86_64-softmmu/i8254.o
>   CC    x86_64-softmmu/i8254-kvm.o
>   CC    x86_64-softmmu/device-assignment.o
>   LINK  x86_64-softmmu/qemu-system-x86_64
> vhost.o: In function `vhost_dev_sync_region':
> /home/source/qemu-kvm/hw/vhost.c:47: undefined reference to `__sync_fetch_and_and_4'
> collect2: ld returned 1 exit status
> make[1]: *** [qemu-system-x86_64] Error 1
> make: *** [subdir-x86_64-softmmu] Error 2

Appears to be a gcc bug. I opened https://bugzilla.redhat.com/show_bug.cgi?id=624279 to track this.

Meanwhile, installing the gcc44 package and building with it (./configure --cc=gcc44) appears to work.

--
error compiling committee.c: too many arguments to function
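For context, `__sync_fetch_and_and_4` is the out-of-line form of GCC's `__sync_fetch_and_and` builtin for 4-byte operands, so the link error means the compiler emitted a library call rather than inlining the atomic operation. The sketch below only illustrates what the builtin does; the helper name `fetch_and_clear_bits` is hypothetical and this is not the code in hw/vhost.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the actual hw/vhost.c code): atomically clear
 * the bits given in `mask` from `*word` and return the value that `*word`
 * held before the update. For a 4-byte operand GCC either inlines this
 * or emits a call to __sync_fetch_and_and_4, the symbol the linker could
 * not resolve in the report above.
 */
static uint32_t fetch_and_clear_bits(uint32_t *word, uint32_t mask)
{
    return __sync_fetch_and_and(word, ~mask);
}
```

A caller would typically use the returned value to learn which bits were set before they were cleared.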
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote: On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote: If emulation fails due to the instruction being unemulated. Return immediately instead of restarting the instruction and infinitely trying to execute it. This is already handled correctly as far as I can see. Sometimes instruction should be retried and reexecute_instruction() checks for that case. If instruction emulation fails in big real mode re-executing instruction will be useless though, so what should be done is to make reexecute_instruction() return false if vcpu is in big real mode and cpu relies on emulation to handle it. We don't have a separate mode for big real mode. The emulation modes we have are real and vm86 Signed-off-by: Mohammed Gamal m.gamal...@gmail.com --- arch/x86/kvm/x86.c | 6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 416aa0e..a31db44 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu, } ++vcpu-stat.insn_emulation; + if (r == X86EMUL_UNHANDLEABLE) + return handle_emulation_failure(vcpu); + if (r) { if (reexecute_instruction(vcpu, cr2)) return EMULATE_DONE; @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu, restart: r = x86_emulate_insn(vcpu-arch.emulate_ctxt); + if (r == X86EMUL_UNHANDLEABLE) + return handle_emulation_failure(vcpu); + if (r) { /* emulation failed */ if (reexecute_instruction(vcpu, cr2)) return EMULATE_DONE; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH 0/3] Real mode interrupt injection
On 08/12/2010 04:07 AM, Mohammed Gamal wrote:
>> I was playing around with the non-atomic-injection branch. I decided to use e_i_g_s=1, and it's worth noting that I never experienced these faults with the switch enabled.
>
> Hate to spoil it. I did experience the faults again with e_i_g_s=1, although much less frequently. What is really strange is that I could get a Linux guest to boot up completely, both with e_i_g_s=1 and without it, with the real mode interrupt patch enabled. It looks to me like the problem mainly happens when the BIOS transfers control to the boot loader. Other guests usually fail. Would you like me to attach a trace?

It will be much too big; upload it somewhere or send it to me privately. But use the code with the interrupt injection setup fixed (see my comment to patch 2).

--
error compiling committee.c: too many arguments to function
Re: [RFC PATCH v2 0/4] Real mode interrupt injection
On Sun, Aug 15, 2010 at 3:43 PM, Avi Kivity a...@redhat.com wrote: On 08/14/2010 03:09 AM, Mohammed Gamal wrote: This patch introduces real mode interrupt injection for VMX. It currently invokes the x86 emulator to emulate interrupts instead of manually setting VMX controls. Needless to say, this is not meant for merging in its current state. The emulator still needs some more work to get this completely operational. Mohammed Gamal (3): x86 emulator: Expose emulate_int_real() x86: Add inject_realmode_interrupt() wrapper VMX: Emulated real mode interrupt injection arch/x86/include/asm/kvm_emulate.h | 3 ++- arch/x86/kvm/vmx.c | 11 +-- arch/x86/kvm/x86.c | 14 ++ arch/x86/kvm/x86.h | 1 + 4 files changed, 18 insertions(+), 11 deletions(-) --- Changes since v1: - Save emulation context eip value early in emulate_int_real() - Properly initialize emulation context in inject_realmode_interrupt() - Implement error checks on using inject_realmode_interrupt() Do those changes help your tests? To an extent. At least now the BIOS mostly runs smoothly since eip values are updated correctly. However, it looks like guests go into nowhere once things are handed over to the boot loader. So there is still many things we need to fix. I'll post a trace shortly. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On 08/15/2010 03:43 PM, Mohammed Gamal wrote: 2010/8/15 Gleb Natapovg...@redhat.com: On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote: On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapovg...@redhat.com wrote: On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote: If emulation fails due to the instruction being unemulated. Return immediately instead of restarting the instruction and infinitely trying to execute it. This is already handled correctly as far as I can see. Sometimes instruction should be retried and reexecute_instruction() checks for that case. If instruction emulation fails in big real mode re-executing instruction will be useless though, so what should be done is to make reexecute_instruction() return false if vcpu is in big real mode and cpu relies on emulation to handle it. We don't have a separate mode for big real mode. The emulation modes we have are real and vm86 That doesn't makes the patch right. So we will have to figure something out. True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)? We can do it conditionally for CPL=0. That includes real mode (and excludes vm86). However, there's a race involved (see a895e576cfd96). I don't see how we can call handle_emulation_failure() without opening the race again. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
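To make the CPL=0 condition above concrete, here is a minimal sketch of the predicate under discussion. It is an illustration only: the enum and both function names are hypothetical, protected-mode CPL is approximated by the low two bits of the CS selector, and the race referenced in a895e576cfd96 is not modeled at all.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mode tags, loosely mirroring the modes named in the thread. */
enum emul_mode { MODE_REAL, MODE_VM86, MODE_PROT };

/*
 * Real mode has no privilege checks and behaves like CPL 0;
 * vm86 code always runs at CPL 3.
 */
static int effective_cpl(enum emul_mode mode, uint16_t cs_selector)
{
    switch (mode) {
    case MODE_REAL:
        return 0;
    case MODE_VM86:
        return 3;
    default:
        return cs_selector & 3; /* assumption: CPL equals CS.RPL */
    }
}

/* True for real mode and ring-0 protected mode, false for vm86. */
static bool is_cpl0(enum emul_mode mode, uint16_t cs_selector)
{
    return effective_cpl(mode, cs_selector) == 0;
}
```

This shows why a CPL test covers real mode while excluding vm86; whether failing early is safe is a separate question that the sketch does not answer.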
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On 08/15/2010 03:49 PM, Gleb Natapov wrote:
>> True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)?
>
> If we flush all shadow pages when moving from paged mode to non-paged mode, checking for X86EMUL_MODE_REAL sounds like enough to me, but Avi knows better. Or we can add an is_big_real_mode() callback to x86_ops and implement it in vmx accordingly.

Neither is possible. We can have one cpu in big real mode and others in paged mode, so even in real mode we cannot rule out a spurious page fault due to shadow write protection.

--
error compiling committee.c: too many arguments to function
Re: [RFC PATCH v2 2/4] x86: Add inject_realmode_interrupt() wrapper
On 08/14/2010 03:19 AM, Mohammed Gamal wrote:
> This adds a wrapper function inject_realmode_interrupt() around the emulator function emulate_int_real() to allow real mode interrupt injection.
>
> +EXPORT_SYMBOL_GPL(inject_realmode_interrupt);
> +

Global symbols should start with kvm_.

--
error compiling committee.c: too many arguments to function
Re: Freebsd image from Qemu hangs on booting in KVM
On 08/05/2010 11:51 PM, Anjali Kulkarni wrote: Thanks Avi, I am trying to reproduce this on another setup, and do not see the issue. My understanding is to run KVM + Qemu, I only need to install KVM modules, and Qemu does not need to be modified. Is that correct? I see [r...@ipg-virt01 anjali]# lsmod | grep kvm kvm_intel 87016 0 kvm 211496 1 kvm_intel [r...@ipg-virt01 anjali]# modprobe -l 'kvm*' /lib/modules/2.6.18-164.el5/kernel/extra/kvm.ko /lib/modules/2.6.18-164.el5/kernel/extra/kvm-amd.ko /lib/modules/2.6.18-164.el5/kernel/extra/kvm-intel.ko And then I run Qemu, as I did, before installing KVM. It should just use KVM? Is there any way I can check? 'info kvm' from the monitor Btw, when it hangs, I cannot even press any key, so not sure how I can get those commands you suggest below.. alt-ctrl-2 -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/2] KVM: x86 emulator: put register operand write back to a function
On 08/12/2010 04:38 PM, Wei Yongjun wrote:
> Introduce function write_register_operand() to write back the register operand.
>
> +static void write_register_operand(struct operand *op, unsigned long val,
> +				   unsigned int bytes)
> +{
> +	/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
> +	switch (bytes) {
> +	case 1:
> +		*(u8 *)op->addr.reg = (u8)val;
> +		break;
> +	case 2:
> +		*(u16 *)op->addr.reg = (u16)val;
> +		break;
> +	case 4:
> +		*op->addr.reg = (u32)val;
> +		break;	/* 64b: zero-extend */
> +	case 8:
> +		*op->addr.reg = val;
> +		break;
> +	}
> +}

It's cleaner to take val and bytes from struct operand, and do the assignment from the callers, no?

--
error compiling committee.c: too many arguments to function
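The suggestion above could look roughly like the following standalone sketch. The `struct operand` here is a simplified, hypothetical stand-in (the field names `bytes`, `val` and `reg` are assumptions, not the emulator's real layout), and the narrow writes assume a little-endian host, as on x86.

```c
#include <stdint.h>
#include <string.h>

/* Simplified, hypothetical stand-in for the emulator's struct operand. */
struct operand {
    unsigned int bytes; /* operand size in bytes: 1, 2, 4 or 8 */
    uint64_t val;       /* value to write back, set by the caller */
    uint64_t *reg;      /* destination register slot */
};

/* Write op->val back to the register, using the size stored in op->bytes. */
static void write_register_operand(struct operand *op)
{
    switch (op->bytes) {
    case 1: {
        uint8_t v = (uint8_t)op->val;
        memcpy(op->reg, &v, sizeof(v));   /* low byte only, rest preserved */
        break;
    }
    case 2: {
        uint16_t v = (uint16_t)op->val;
        memcpy(op->reg, &v, sizeof(v));   /* low word only, rest preserved */
        break;
    }
    case 4:
        *op->reg = (uint32_t)op->val;     /* 64-bit mode: zero-extend */
        break;
    case 8:
        *op->reg = op->val;
        break;
    }
}
```

With this shape, a caller assigns `op->val` (and `op->bytes` where it differs) and then calls `write_register_operand(op)`, so the helper needs no extra parameters.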
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote: On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote: On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote: If emulation fails due to the instruction being unemulated. Return immediately instead of restarting the instruction and infinitely trying to execute it. This is already handled correctly as far as I can see. Sometimes instruction should be retried and reexecute_instruction() checks for that case. If instruction emulation fails in big real mode re-executing instruction will be useless though, so what should be done is to make reexecute_instruction() return false if vcpu is in big real mode and cpu relies on emulation to handle it. We don't have a separate mode for big real mode. The emulation modes we have are real and vm86 That doesn't makes the patch right. So we will have to figure something out. Signed-off-by: Mohammed Gamal m.gamal...@gmail.com --- arch/x86/kvm/x86.c | 6 ++ 1 files changed, 6 insertions(+), 0 deletions(-) diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 416aa0e..a31db44 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu, } ++vcpu-stat.insn_emulation; + if (r == X86EMUL_UNHANDLEABLE) + return handle_emulation_failure(vcpu); + if (r) { if (reexecute_instruction(vcpu, cr2)) return EMULATE_DONE; @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu, restart: r = x86_emulate_insn(vcpu-arch.emulate_ctxt); + if (r == X86EMUL_UNHANDLEABLE) + return handle_emulation_failure(vcpu); + if (r) { /* emulation failed */ if (reexecute_instruction(vcpu, cr2)) return EMULATE_DONE; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb. -- Gleb. 
Re: system_powerdown not working for qemu-kvm 0.12.4?
On 08/15/2010 02:32 AM, Teck Choon Giam wrote:
>> Can you try to bisect between qemu-kvm-0.12.3 and 0.12.4 to see which commit introduced the regression?
>
> Actually I am not so sure about how to do the bisecting, as the steps below always produce a different configure for me. Any pointers?
>
> # cd /usr/src
> # git clone git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
> # cd qemu-kvm
> # ./configure --help|grep cpu-emulation
>   --disable-cpu-emulation  disables use of qemu cpu emulation code
> # git bisect reset master
> We are not bisecting.
> # git bisect good qemu-kvm-0.12.1.2
> You need to start by "git bisect start"
> Do you want me to do it for you [Y/n]? y
> # git bisect bad qemu-kvm-0.12.2
> Bisecting: 14 revisions left to test after this (roughly 4 steps)
> [66dbb62824845e91808171a675998706ce359c71] Handle TFTP ERROR from client
> # ./configure --help|grep cpu-emulation
>
> This shows nothing when bisecting... the configure script is different :(

That's fine - you'll be running upstream qemu instead of qemu-kvm. Just remember to add -enable-kvm to the command line.

Use ./configure --target-list=x86_64-softmmu to cut down on compile time.

I'm betting 73b48d914f9 is the cause, but let's see the full bisect.

--
error compiling committee.c: too many arguments to function
Re: [RFC PATCH v2 2/4] x86: Add inject_realmode_interrupt() wrapper
On Sat, Aug 14, 2010 at 03:19:39AM +0300, Mohammed Gamal wrote:
> This adds a wrapper function inject_realmode_interrupt() around the
> emulator function emulate_int_real() to allow real mode interrupt injection.
>
> Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
> ---
>  arch/x86/kvm/x86.c |   33 +
>  arch/x86/kvm/x86.h |    1 +
>  2 files changed, 34 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1722d37..d3ba1c3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3936,6 +3936,39 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
>  		kvm_queue_exception(vcpu, ctxt->exception);
>  }
>
> +int inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq)
> +{
> +	struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
> +	int cs_db, cs_l, ret;
> +	cache_all_regs(vcpu);
> +
> +	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
> +
> +	vcpu->arch.emulate_ctxt.vcpu = vcpu;
> +	vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
> +	vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
> +	vcpu->arch.emulate_ctxt.mode =
> +		(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
> +		(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
> +		? X86EMUL_MODE_VM86 : cs_l
> +		? X86EMUL_MODE_PROT64 : cs_db
> +		? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
> +	memset(c, 0, sizeof(struct decode_cache));
> +	memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
> +

We have this code in 2 places already: kvm_task_switch() and emulate_instruction(). This will be the third one. It should be moved to a separate function.

> +	ret = emulate_int_real(&vcpu->arch.emulate_ctxt, &emulate_ops, irq);
> +
> +	if (ret != X86EMUL_CONTINUE)
> +		return EMULATE_FAIL;
> +
> +	memcpy(vcpu->arch.regs, c->regs, sizeof c->regs);
> +	kvm_rip_write(vcpu, vcpu->arch.emulate_ctxt.eip);
> +	kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
> +
> +	return EMULATE_DONE;
> +}
> +EXPORT_SYMBOL_GPL(inject_realmode_interrupt);
> +
>  static int handle_emulation_failure(struct kvm_vcpu *vcpu)
>  {
>  	++vcpu->stat.insn_emulation_fail;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index b7a4047..c6e8a4d 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -67,5 +67,6 @@ static inline int is_paging(struct kvm_vcpu *vcpu)
>  void kvm_before_handle_nmi(struct kvm_vcpu *vcpu);
>  void kvm_after_handle_nmi(struct kvm_vcpu *vcpu);
> +int inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq);
>
>  #endif
> --
> 1.7.0.4

--
			Gleb.
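As one possible reading of the request to move the duplicated setup into a separate function, the mode selection from the patch can be isolated as a small pure helper. This is a sketch under assumptions: `pick_emul_mode` and its boolean parameters are hypothetical, the enum reuses the mode names quoted in the patch with arbitrary values, and the rest of the context setup (register caching, decode cache reset) is left out.

```c
#include <stdbool.h>

/* Mode names follow the quoted patch; the numeric values are arbitrary here. */
enum x86emul_mode {
    X86EMUL_MODE_REAL,
    X86EMUL_MODE_VM86,
    X86EMUL_MODE_PROT16,
    X86EMUL_MODE_PROT32,
    X86EMUL_MODE_PROT64,
};

/*
 * Hypothetical helper: applies the same decision order as the ternary
 * chain in the patch, so every call site would pick the mode identically.
 */
static enum x86emul_mode pick_emul_mode(bool protmode, bool eflags_vm,
                                        bool cs_l, bool cs_db)
{
    if (!protmode)
        return X86EMUL_MODE_REAL;
    if (eflags_vm)
        return X86EMUL_MODE_VM86;
    if (cs_l)
        return X86EMUL_MODE_PROT64;
    return cs_db ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
}
```

Keeping the decision in one place means the three callers mentioned in the review cannot drift apart in how they classify the vcpu state.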
Re: [PATCH] kvm: fix poison overwritten caused by using wrong xstate size
On 08/13/2010 10:19 AM, Xiaotian Feng wrote:
> fpu.state is allocated from task_xstate_cachep, whose object size is xstate_size. xstate_size is set from the cpuid instruction and is often smaller than sizeof(struct xsave_struct). kvm uses sizeof(struct xsave_struct) to fill fpu.state.xsave in and out; since only xstate_size bytes were allocated for fpu.state, the kernel writes past the allocation and triggers poison/redzone/padding overwritten warnings.

Thanks, applied.

--
error compiling committee.c: too many arguments to function
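The class of bug described above, copying by the size of the full structure into a buffer allocated with a smaller runtime size, can be illustrated with a minimal standalone sketch. The function `copy_state_bounded` is hypothetical and is not the applied patch; it only shows bounding the copy by the allocated size.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy at most `dst_alloc_size` bytes, where
 * `dst_alloc_size` plays the role of xstate_size (what was really
 * allocated) and `src_size` plays the role of sizeof(struct xsave_struct).
 * Returns the number of bytes actually copied.
 */
static size_t copy_state_bounded(void *dst, size_t dst_alloc_size,
                                 const void *src, size_t src_size)
{
    size_t n = src_size < dst_alloc_size ? src_size : dst_alloc_size;

    memcpy(dst, src, n);
    return n;
}
```

Copying `src_size` bytes unconditionally would write past the end of the destination whenever the allocation is the smaller of the two, which is the overwrite the slab debugging warnings reported.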
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On Sun, Aug 15, 2010 at 06:58:06PM +0300, Avi Kivity wrote: On 08/15/2010 03:49 PM, Gleb Natapov wrote: True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)? If we flush all shadow pages when moving from paged mode to non paged checking for X86EMUL_MODE_REAL sounds enough to me, but Avi knows better. Or we can add is_big_real_mode() callback to x86_ops and implement it in vmx accordingly. Neither are possible. We can have one cpu in big real mode and others in paged mode, so even in real mode we cannot rule out a spurious page fault due to shadow write protection. Correct, just checking X86EMUL_MODE_REAL is not enough due to smp, but why checking for big real mode will not work? If instruction can't be emulated while vcpu is in big real mode returning to vcpu is not an option, so kvm will fail anyway. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v2 4/4] x86 emulator: Eagerly commit emulation ctxt eip in emulate_int_real()
On 08/14/2010 03:19 AM, Mohammed Gamal wrote:
> emulate_int_real() is to be used outside the emulator. Hence, we shouldn't wait for writeback to write the eip value stored in the decode cache. Save it in the emulation context eagerly instead.
>
> Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
> ---
>  arch/x86/kvm/emulate.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index 32498e3..ae45b04 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -1245,7 +1245,7 @@ int emulate_int_real(struct x86_emulate_ctxt *ctxt,
>  	if (rc != X86EMUL_CONTINUE)
>  		return rc;
>
> -	c->eip = eip;
> +	ctxt->eip = eip;
>
>  	return rc;
>  }

Doesn't seem right. It should work like the rest of the emulator. Instead, the wrapper code in x86.c should do this.

--
error compiling committee.c: too many arguments to function
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On Sun, Aug 15, 2010 at 03:43:15PM +0300, Mohammed Gamal wrote: 2010/8/15 Gleb Natapov g...@redhat.com: On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote: On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote: On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote: If emulation fails due to the instruction being unemulated. Return immediately instead of restarting the instruction and infinitely trying to execute it. This is already handled correctly as far as I can see. Sometimes instruction should be retried and reexecute_instruction() checks for that case. If instruction emulation fails in big real mode re-executing instruction will be useless though, so what should be done is to make reexecute_instruction() return false if vcpu is in big real mode and cpu relies on emulation to handle it. We don't have a separate mode for big real mode. The emulation modes we have are real and vm86 That doesn't makes the patch right. So we will have to figure something out. True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)? If we flush all shadow pages when moving from paged mode to non paged checking for X86EMUL_MODE_REAL sounds enough to me, but Avi knows better. Or we can add is_big_real_mode() callback to x86_ops and implement it in vmx accordingly. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC PATCH v2 0/4] Real mode interrupt injection
On 08/14/2010 03:09 AM, Mohammed Gamal wrote: This patch introduces real mode interrupt injection for VMX. It currently invokes the x86 emulator to emulate interrupts instead of manually setting VMX controls. Needless to say, this is not meant for merging in its current state. The emulator still needs some more work to get this completely operational. Mohammed Gamal (3): x86 emulator: Expose emulate_int_real() x86: Add inject_realmode_interrupt() wrapper VMX: Emulated real mode interrupt injection arch/x86/include/asm/kvm_emulate.h |3 ++- arch/x86/kvm/vmx.c | 11 +-- arch/x86/kvm/x86.c | 14 ++ arch/x86/kvm/x86.h |1 + 4 files changed, 18 insertions(+), 11 deletions(-) --- Changes since v1: - Save emulation context eip value early in emulate_int_real() - Properly initialize emulation context in inject_realmode_interrupt() - Implement error checks on using inject_realmode_interrupt() Do those changes help your tests? -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] kvm: make mmu_shrink() fit shrinker's requirement
On 08/13/2010 11:10 PM, Dave Hansen wrote: On Thu, 2010-08-05 at 12:28 +0300, Avi Kivity wrote: On 08/04/2010 10:13 AM, Lai Jiangshan wrote: mmu_shrink() should attempt to free @nr_to_scan entries. This conflicts with Dave's patchset. Dave, what's going on with those patches? They're starting to smell. These seem to fix the original problem reporter's issue. They were run with 64 guests on a 32GB machine. No stability problems popped up in this testing, or since I last sent the patches to you. The results from both the test with only the first four patches and with the entire set of nine looked pretty identical. That tells me that we should only push the first four for now: abstract kvm x86 mmu-n_free_mmu_pages rename x86 kvm-arch.n_alloc_mmu_pages replace x86 kvm n_free_mmu_pages with n_used_mmu_pages create aggregate kvm_total_used_mmu_pages value Well, patches 3 and 4 have unaddressed review comments. Please fix them up. If you don't have the time, let me know and I'll do it instead. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
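The requirement quoted at the top of this thread, that the shrinker attempt to free @nr_to_scan entries, can be sketched with a toy model. Everything here is hypothetical: `toy_cache` and `toy_shrink` are illustrative names, decrementing a counter stands in for freeing a shadow page, and treating `nr_to_scan == 0` as a pure query is an assumption about the shrinker contract of that era.

```c
/* Hypothetical cache that only tracks how many entries are in use. */
struct toy_cache {
    int nr_used;
};

/*
 * Try to free up to nr_to_scan entries and return how many remain.
 * With nr_to_scan == 0 nothing is freed and the current count is reported.
 */
static int toy_shrink(struct toy_cache *cache, int nr_to_scan)
{
    while (nr_to_scan > 0 && cache->nr_used > 0) {
        cache->nr_used--;   /* stand-in for freeing one entry */
        nr_to_scan--;
    }
    return cache->nr_used;
}
```

The point of the loop is that the amount of work scales with the request, rather than freeing a single entry per call regardless of `nr_to_scan`.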
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:
> If emulation fails due to the instruction being unemulated, return immediately instead of restarting the instruction and infinitely trying to execute it.

This is already handled correctly as far as I can see. Sometimes an instruction should be retried, and reexecute_instruction() checks for that case. If instruction emulation fails in big real mode, re-executing the instruction will be useless though, so what should be done is to make reexecute_instruction() return false if the vcpu is in big real mode and the cpu relies on emulation to handle it.

> Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
> ---
>  arch/x86/kvm/x86.c |    6 ++
>  1 files changed, 6 insertions(+), 0 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 416aa0e..a31db44 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
>  	}
>
>  	++vcpu->stat.insn_emulation;
> +	if (r == X86EMUL_UNHANDLEABLE)
> +		return handle_emulation_failure(vcpu);
> +
>  	if (r) {
>  		if (reexecute_instruction(vcpu, cr2))
>  			return EMULATE_DONE;
> @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
>  restart:
>  	r = x86_emulate_insn(&vcpu->arch.emulate_ctxt);
>
> +	if (r == X86EMUL_UNHANDLEABLE)
> +		return handle_emulation_failure(vcpu);
> +
>  	if (r) { /* emulation failed */
>  		if (reexecute_instruction(vcpu, cr2))
>  			return EMULATE_DONE;
> --
> 1.7.0.4

--
			Gleb.
Re: [PATCH] kvm: destroy workqueue on kvm_create_pit() failures
On 08/13/2010 11:23 AM, Xiaotian Feng wrote:
> The kernel needs to destroy the workqueue if kvm_create_pit() fails; otherwise the workqueue is leaked after the pit is freed.

Applied, thanks.

--
error compiling committee.c: too many arguments to function
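The leak described above is an instance of a common error-path pattern: a resource acquired early must be released on every later failure exit. The following is a minimal userspace sketch of that pattern, not the kvm_create_pit() code; all names (`toy_wq`, `toy_pit`, `toy_pit_create`, the `fail_late` flag) are hypothetical.

```c
#include <stdlib.h>

struct toy_wq { int unused; };
struct toy_pit { struct toy_wq *wq; };

static int live_wqs; /* counts workqueues that exist right now */

static struct toy_wq *toy_wq_create(void)
{
    struct toy_wq *wq = calloc(1, sizeof(*wq));

    if (wq)
        live_wqs++;
    return wq;
}

static void toy_wq_destroy(struct toy_wq *wq)
{
    free(wq);
    live_wqs--;
}

/* fail_late simulates a failure that happens after the workqueue exists. */
static struct toy_pit *toy_pit_create(int fail_late)
{
    struct toy_pit *pit = calloc(1, sizeof(*pit));

    if (!pit)
        return NULL;

    pit->wq = toy_wq_create();
    if (!pit->wq)
        goto fail_free;

    if (fail_late)
        goto fail_wq;

    return pit;

fail_wq:
    toy_wq_destroy(pit->wq); /* without this, the workqueue would leak */
fail_free:
    free(pit);
    return NULL;
}
```

Unwinding in reverse order of acquisition keeps each failure label responsible for exactly the resources that exist at that point.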
Re: [PATCH] Fix bug for vcpu hotplug
On 08/06/2010 06:36 AM, Liu, Jinsong wrote:
> Recently seabios implemented vcpu hotplug infrastructure. During testing, we found that qemu-kvm has a bug that results in the guest OS shutting down on vcpu hot-add. This patch fixes the bug by marking bus->allow_hotplug as 1 after qdev_hotplug init is done.

Please copy qemu-devel on qemu patches.

> @@ -117,6 +117,9 @@ DeviceState *qdev_create(BusState *bus, const char *name)
>          hw_error("Unknown device '%s' for bus '%s'\n", name, bus->info->name);
>      }
>
> +    if (qdev_hotplug)
> +        bus->allow_hotplug = 1;
> +
>      return qdev_create_from_info(bus, info);
>  }

Doesn't seem right - this will set allow_hotplug on all busses. It needs to be set only on the system bus (hw/sysbus.c).

--
error compiling committee.c: too many arguments to function
Re: [RFC PATCH 0/3] Real mode interrupt injection
On Sun, Aug 15, 2010 at 3:23 PM, Avi Kivity a...@redhat.com wrote:
> On 08/12/2010 04:07 AM, Mohammed Gamal wrote:
>>> I was playing around with the non-atomic-injection branch. I decided to use e_i_g_s=1, and it's worth noting that I never experienced these faults with the switch enabled.
>>
>> Hate to spoil it. I did experience the faults again with e_i_g_s=1, although much less frequently. What is really strange is that I could get a Linux guest to boot up completely, both with e_i_g_s=1 and without it, with the real mode interrupt patch enabled. It looks to me like the problem mainly happens when the BIOS transfers control to the boot loader. Other guests usually fail. Would you like me to attach a trace?
>
> It will be much too big; upload it somewhere or send it to me privately. But use the code with the interrupt injection setup fixed (see my comment to patch 2).
>
> --
> error compiling committee.c: too many arguments to function

Please take a look at my latest patch series. We can take the discussion there.
Re: system_powerdown not working for qemu-kvm 0.12.4?
That's fine - you'll be running upstream qemu instead of qemu-kvm. Just remember to add -enable-kvm to the command line. Use ./configure --target-list=x86_64-softmmu to cut down on compile time.

Yes, I am doing so about the --target-list but missed out the -enable-kvm command option prior to starting each Guest OS. Here is my script:

---8---
#!/bin/sh
kerver=`uname -r`
KERNELDIR=--kerneldir=/usr/src/linux-`uname -r`
make clean
./configure --prefix=/usr/local/kvm ${KERNELDIR} \
--target-list=x86_64-softmmu
make
make INSTALL_MOD_STRIP=1 install
mkinitrd -v -f --builtin=ehci-hcd --builtin=uhci-hcd --builtin=ohci-hcd --builtin=usb-storage /boot/initrd-${kerver}.img ${kerver}
depmod -v `uname -r`
---8---

Instead of rebooting... I do the following then start the FreeBSD Guest OS to test:

1. Stop all Guest OSes.
2. rmmod -v kvm-intel
3. rmmod -v kvm
4. modprobe -v kvm
5. modprobe -v kvm-intel

If I add -enable-kvm to the command line to start the guest OS I got:

No SMP KVM support, use '-smp 1'
failed to initialize KVM

Well... ... changed to use '-smp 1' as shown and the FreeBSD 8.1 Guest OS started without issues. However for CentOS 5.5 Guest OS, since I am using virtio... ... running in qemu instead of qemu-kvm will be an issue? I have a problem starting the Guest OS and it always goes to the CentOS installer since I have the following:

---SNAP---
-boot cd \
-cdrom /home/kvm/images/CentOS-5.5-x86_64-bin-1of8.iso \
-drive file=/dev/VM/centos5,if=virtio,boot=on \
---SNAP---

So I will not test on other Guest OS but just focus on FreeBSD instead.

I'm betting 73b48d914f9 is the cause, but let's see the full bisect.

I don't know whether I have done the bisecting right or not, but for each good one I just use git bisect good until the one that is failing, then I use git bisect bad.
The following are for git bisect good qemu-kvm-0.12.1.2 and git bisect bad qemu-kvm-0.12.2:

# git bisect log
git bisect start
# good: [a6ec654a7a863afff41b491a02ffd696c862cb41] Merge branch 'stable-0.12-upstream' into stable-0.12
git bisect good a6ec654a7a863afff41b491a02ffd696c862cb41
# bad: [c01cfac552861ca4d82e359791a2d79da7f80cb5] device assignment: default requires IOMMU
git bisect bad c01cfac552861ca4d82e359791a2d79da7f80cb5
# good: [66dbb62824845e91808171a675998706ce359c71] Handle TFTP ERROR from client
git bisect good 66dbb62824845e91808171a675998706ce359c71
# good: [04babf6c6f8ccf69f1219db5fea233d679702e90] roms: rework rom loading via fw
git bisect good 04babf6c6f8ccf69f1219db5fea233d679702e90
# good: [3999bf32440c1ea2ceb85eef008cc56a069af13f] Qemu's internal TFTP server breaks lock-step-iness of TFTP
git bisect good 3999bf32440c1ea2ceb85eef008cc56a069af13f
# good: [e389e937a7b94186449e0590bdc8f04ecbb1ab0b] Update version and changelog for release
git bisect good e389e937a7b94186449e0590bdc8f04ecbb1ab0b
# bad: [b874ce1db7d8654850c8a6606b95ffb1c7d22ce2] Merge remote branch 'upstream/stable-0.12' into stable-0.12
git bisect bad b874ce1db7d8654850c8a6606b95ffb1c7d22ce2

The following are git bisect log between git bisect good qemu-kvm-0.12.1.2 and git bisect bad qemu-kvm-0.12.3 instead:

git bisect start
# good: [a6ec654a7a863afff41b491a02ffd696c862cb41] Merge branch 'stable-0.12-upstream' into stable-0.12
git bisect good a6ec654a7a863afff41b491a02ffd696c862cb41
# bad: [69a5ecafa27daeb943dc2ee65b1470844f23f934] Merge branch 'stable-0.12-merge' into stable-0.12
git bisect bad 69a5ecafa27daeb943dc2ee65b1470844f23f934
# good: [d0d888bc6d1a106609b9af42ecb552c6c34a85c5] qcow2: Return 0/-errno in qcow2_alloc_cluster_offset
git bisect good d0d888bc6d1a106609b9af42ecb552c6c34a85c5
# good: [7ebc79037c5f426bfb08cc506670bf7dd3912430] virtio-net: fix network stall under load
git bisect good 7ebc79037c5f426bfb08cc506670bf7dd3912430
# good: [6173d56bdcb53389c54e803873e6bf8f87836a4f] Merge remote branch 'qemu-kvm/uq/stable-0.12' into stable-0.12
git bisect good 6173d56bdcb53389c54e803873e6bf8f87836a4f
# bad: [59691c0cb129c9aa955be22573b43b26534f9db4] KVM: Request setting of nmi_pending and sipi_vector
git bisect bad 59691c0cb129c9aa955be22573b43b26534f9db4
# bad: [dec2eb9d724b21581500aea911dd13f7bfbea59e] Fix kvm_load_mpstate for vcpu hot add
git bisect bad dec2eb9d724b21581500aea911dd13f7bfbea59e

Let me know if I have done anything wrong about the bisecting... ...

Thanks.

Kindest regards,
Giam Teck Choon
Re: [PATCH] test: add test for bsf/bsr instruction
On 08/09/2010 01:01 PM, Wei Yongjun wrote: This patch add test for bsf/bsr instruction. Applied, thanks. -- error compiling committee.c: too many arguments to function
Re: hot plug memory in guest
On 08/10/2010 05:53 PM, Gu, Zhongshu wrote: Hi all: I want to dynamically register memory into the linux guest during runtime. I will compile linux kernel with sparse memory model support. Does kvm support that kind of function? I am not sure how linux detect physical memory and how does memslot mapped to physical memory? Through mc146818? Memory hotplug is not yet supported. Have a look at the balloon driver for similar functionality. -- error compiling committee.c: too many arguments to function
Re: [PATCH] kvm: fix poison overwritten caused by using wrong xstate size
On 08/14/2010 12:03 AM, H. Peter Anvin wrote: Avi, do you want to take this one or should I? I will, thanks. -- error compiling committee.c: too many arguments to function
Re: [PATCH 2/2] x86: Bail out on unemulated instructions
On 08/15/2010 07:11 PM, Gleb Natapov wrote:

Neither are possible. We can have one cpu in big real mode and others in paged mode, so even in real mode we cannot rule out a spurious page fault due to shadow write protection.

Correct, just checking X86EMUL_MODE_REAL is not enough due to smp, but why would checking for big real mode not work? If an instruction can't be emulated while the vcpu is in big real mode, returning to the vcpu is not an option, so kvm will fail anyway.

Right. I guess we can have an emulation_reason variable which explains why we are emulating (unvirtualizable state, mmu page fault, mmio page fault, unvirtualizable instruction) and decide accordingly. But it's a lot of work.

-- error compiling committee.c: too many arguments to function
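To make the emulation_reason idea above concrete, here is a minimal C sketch. Everything in it is hypothetical: the enum, its member names and retry_may_help() do not exist in KVM, and the policy (retry only after a shadow MMU page fault) is just one reading of this thread, not an agreed design.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: the thread only proposes an emulation_reason
 * variable and lists these four cases. */
enum emulation_reason {
    EMUL_REASON_UNVIRT_STATE, /* unvirtualizable state, e.g. big real mode */
    EMUL_REASON_MMU_FAULT,    /* shadow MMU page fault */
    EMUL_REASON_MMIO_FAULT,   /* MMIO page fault */
    EMUL_REASON_UNVIRT_INSN,  /* unvirtualizable instruction */
};

/*
 * Assumed policy, for illustration only: re-entering the guest can help
 * only when the fault came from shadow write protection, because that
 * condition may be resolved before the instruction runs again. In the
 * other cases the same exit would be taken again, so a failed emulation
 * should be reported instead of retried.
 */
static bool retry_may_help(enum emulation_reason reason)
{
    switch (reason) {
    case EMUL_REASON_MMU_FAULT:
        return true;
    case EMUL_REASON_UNVIRT_STATE:
    case EMUL_REASON_MMIO_FAULT:
    case EMUL_REASON_UNVIRT_INSN:
    default:
        return false;
    }
}
```

A caller in the emulation failure path would then consult the recorded reason rather than the current CPU mode, which sidesteps the SMP objection raised against checking X86EMUL_MODE_REAL.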
Re: system_powerdown not working for qemu-kvm 0.12.4?
On 08/15/2010 07:15 PM, Teck Choon Giam wrote: Let me know if I have done anything wrong about the bisecting... ... All looks fine, but what are the results? git should say something like 'x is first bad commit' which is the interesting part. -- error compiling committee.c: too many arguments to function
MegaSAS 8708EM2 qemu-kvm.git tree updated to v0.12.5
Greetings Hannes, hch and Co,

The latest code from upstream qemu-kvm.git v0.12.5 has been merged into the megasas HBA emulation friendly qemu-kvm.git/master and scsi-bsg branches at:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=summary

The merge commitdiffs for master and scsi-bsg can be found here:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=commitdiff;h=331578e7e362b33c965d469ea4577956dd431bbc
http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=commitdiff;h=2eebcfd04adda1fee641a776c9a85dda95c43b43

The megasas HBA emulation has been given a quick test with scsi-generic and scsi-bsg backstores into TCM_Loop FILEIO LUNs with x86_64 v2.6.26 guests on a x86_64 v2.6.35 host. So far things appear to be functioning as expected with the megasas SGL passthrough logic and v0.12.5 upstream qemu-kvm code.

There was also some new upstream code in hw/scsi-disk.c which does not seem to be required with the SGL passthrough logic, and that I ended up dropping for the v0.12.5 merge. The code that was dropped during the merge starts with scsi_command_complete() at:

http://git.kernel.org/?p=virt/kvm/qemu-kvm.git;a=blob;f=hw/scsi-disk.c;hb=HEAD#l101

and includes everything down to scsi_write_data(). Using scsi-disk was also given a quick test and appears to be functioning as expected using passthrough SGL logic and userspace QEMU SCSI CDB emulation.

Note there does appear to be some breakage with the SGL passthrough and recent upstream changes with hw/lsi53c895a.c to use a local dma_buf pointer and to get rid of LSIState->select_dev. While I had verified that the SGL passthrough code was working with lsi53c895a on v0.12.4 with scsi-generic+scsi-bsg backstores, this is now segfaulting for me after the latest upstream merge. I will need to have another look at this, but if someone who has more knowledge of hw/lsi53c895a.c could help out, it would be much appreciated. ;)

Comments are welcome!

Best,

--nab
Re: [Qemu-devel] [PATCH 2/2] RESEND: Disable build of ivshmem on non-KVM systems
Thanks, applied.

On Sat, Aug 14, 2010 at 11:47 PM, Cam Macdonell c...@cs.ualberta.ca wrote:

Signed-off-by: Cam Macdonell c...@cs.ualberta.ca
---
 Makefile.target |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/Makefile.target b/Makefile.target
index b791492..c8281e9 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -191,7 +191,7 @@ obj-y += rtl8139.o
 obj-y += e1000.o
 
 # Inter-VM PCI shared memory
-obj-y += ivshmem.o
+obj-$(CONFIG_KVM) += ivshmem.o
 
 # Hardware support
 obj-i386-y += vga.o
-- 
1.6.2.5
[PATCH 0/7] AMD IOMMU emulation patches v3
Hi,

Please have a look at these and merge if you wish. I hope I've addressed the issues people have raised.

Some changes from the previous RFC:
- included and updated the other two device patches
- moved map registration and invalidation management into PCI code
- AMD IOMMU emulation is always enabled (no more configure options)
- cleaned up code, I now use typedefs as suggested
- event logging cleanups

BTW, the change to pci_regs.h is properly aligned but the original file contains tabs.

Cheers,
Eduard

Eduard - Gabriel Munteanu (7):
  pci: add range_covers_range()
  pci: memory access API and IOMMU support
  AMD IOMMU emulation
  ide: use the PCI memory access interface
  rtl8139: use the PCI memory access interface
  eepro100: use the PCI memory access interface
  ac97: use the PCI memory access interface

 Makefile.target   |    2 +
 dma-helpers.c     |   46 -
 dma.h             |   21 ++-
 hw/ac97.c         |    6 +-
 hw/amd_iommu.c    |  688 +
 hw/eepro100.c     |   78 ---
 hw/ide/core.c     |   15 +-
 hw/ide/internal.h |   39 +++
 hw/ide/pci.c      |    7 +
 hw/pc.c           |    2 +
 hw/pci.c          |  197 +++-
 hw/pci.h          |   84 +++
 hw/pci_ids.h      |    2 +
 hw/pci_regs.h     |    1 +
 hw/rtl8139.c      |   99 +
 qemu-common.h     |    1 +
 16 files changed, 1191 insertions(+), 97 deletions(-)
 create mode 100644 hw/amd_iommu.c
[PATCH 1/7] pci: add range_covers_range()
This helper function allows map invalidation code to determine which maps must be invalidated.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 hw/pci.h |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/hw/pci.h b/hw/pci.h
index 4bd8a1a..5a6cdb5 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -419,6 +419,16 @@ static inline int range_covers_byte(uint64_t offset, uint64_t len,
     return offset <= byte && byte <= range_get_last(offset, len);
 }
 
+/* Check whether a given range completely covers another. */
+static inline int range_covers_range(uint64_t first_big, uint64_t len_big,
+                                     uint64_t first_small, uint64_t len_small)
+{
+    uint64_t last_big = range_get_last(first_big, len_big);
+    uint64_t last_small = range_get_last(first_small, len_small);
+
+    return first_big <= first_small && last_small <= last_big;
+}
+
 /* Check whether 2 given ranges overlap.
  * Undefined if ranges that wrap around 0. */
 static inline int ranges_overlap(uint64_t first1, uint64_t len1,
-- 
1.7.1
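For readers who want to try the containment check outside the QEMU tree, here is a standalone sketch of the helper from this patch. The body of range_get_last() is an assumption (offset + len - 1, valid for len > 0 and ranges that do not wrap); the patch only calls it and does not show its definition.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definition: last byte covered by the range.
 * Requires len > 0 and no wrap-around past UINT64_MAX. */
static inline uint64_t range_get_last(uint64_t offset, uint64_t len)
{
    return offset + len - 1;
}

/* Check whether the "big" range completely covers the "small" one. */
static inline int range_covers_range(uint64_t first_big, uint64_t len_big,
                                     uint64_t first_small, uint64_t len_small)
{
    uint64_t last_big = range_get_last(first_big, len_big);
    uint64_t last_small = range_get_last(first_small, len_small);

    return first_big <= first_small && last_small <= last_big;
}
```

With an invalidated range of 0x1000 bytes at 0x1000, a map at 0x1800 of length 0x100 is covered, while a map at 0x1f00 of length 0x200 is not, because its last byte (0x20ff) lies beyond 0x1fff. Note that a partially overlapping map is therefore left alone by this check.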
[PATCH 2/7] pci: memory access API and IOMMU support
PCI devices should access memory through pci_memory_*() instead of cpu_physical_memory_*(). This also provides support for translation and access checking in case an IOMMU is emulated. Memory maps are treated as remote IOTLBs (that is, translation caches belonging to the IOMMU-aware device itself). Clients (devices) must provide callbacks for map invalidation in case these maps are persistent beyond the current I/O context, e.g. AIO DMA transfers. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro --- hw/pci.c | 197 - hw/pci.h | 74 + qemu-common.h |1 + 3 files changed, 271 insertions(+), 1 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index 6871728..8668e06 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -58,6 +58,18 @@ struct PCIBus { Keep a count of the number of devices with raised IRQs. */ int nirq; int *irq_count; + +PCIDevice *iommu; +PCITranslateFunc*translate; +}; + +struct PCIMemoryMap { +pcibus_taddr; +pcibus_tlen; +target_phys_addr_t paddr; +PCIInvalidateMapFunc*invalidate; +void*invalidate_opaque; +QLIST_ENTRY(PCIMemoryMap) list; }; static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); @@ -166,6 +178,19 @@ static void pci_device_reset(PCIDevice *dev) pci_update_mappings(dev); } +static int pci_no_translate(PCIDevice *iommu, +PCIDevice *dev, +pcibus_t addr, +target_phys_addr_t *paddr, +target_phys_addr_t *len, +unsigned perms) +{ +*paddr = addr; +*len = -1; + +return 0; +} + static void pci_bus_reset(void *opaque) { PCIBus *bus = opaque; @@ -227,7 +252,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent, const char *name, int devfn_min) { qbus_create_inplace(bus-qbus, pci_bus_info, parent, name); -bus-devfn_min = devfn_min; + +bus-devfn_min = devfn_min; +bus-iommu = NULL; +bus-translate = pci_no_translate; /* host bridge */ QLIST_INIT(bus-child); @@ -2029,6 +2057,173 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) } } +void pci_register_iommu(PCIDevice *iommu, +PCITranslateFunc 
*translate) +{ +iommu-bus-iommu = iommu; +iommu-bus-translate = translate; +} + +void pci_memory_rw(PCIDevice *dev, + pcibus_t addr, + uint8_t *buf, + pcibus_t len, + int is_write) +{ +int err; +unsigned perms; +PCIDevice *iommu = dev-bus-iommu; +target_phys_addr_t paddr, plen; + +perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ; + +while (len) { +err = dev-bus-translate(iommu, dev, addr, paddr, plen, perms); +if (err) +return; + +/* The translation might be valid for larger regions. */ +if (plen len) +plen = len; + +cpu_physical_memory_rw(paddr, buf, plen, is_write); + +len -= plen; +addr += plen; +buf += plen; +} +} + +static void pci_memory_register_map(PCIDevice *dev, +pcibus_t addr, +pcibus_t len, +target_phys_addr_t paddr, +PCIInvalidateMapFunc *invalidate, +void *invalidate_opaque) +{ +PCIMemoryMap *map; + +map = qemu_malloc(sizeof(PCIMemoryMap)); +map-addr = addr; +map-len= len; +map-paddr = paddr; +map-invalidate = invalidate; +map-invalidate_opaque = invalidate_opaque; + +QLIST_INSERT_HEAD(dev-memory_maps, map, list); +} + +static void pci_memory_unregister_map(PCIDevice *dev, + target_phys_addr_t paddr, + target_phys_addr_t len) +{ +PCIMemoryMap *map; + +QLIST_FOREACH(map, dev-memory_maps, list) { +if (map-paddr == paddr map-len == len) { +QLIST_REMOVE(map, list); +free(map); +} +} +} + +void pci_memory_invalidate_range(PCIDevice *dev, + pcibus_t addr, + pcibus_t len) +{ +PCIMemoryMap *map; + +QLIST_FOREACH(map, dev-memory_maps, list) { +if (range_covers_range(addr, len, map-addr, map-len)) { +map-invalidate(map-invalidate_opaque); +QLIST_REMOVE(map, list); +free(map); +} +} +} + +void *pci_memory_map(PCIDevice *dev, +
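The chunked translate-then-copy loop in pci_memory_rw() can be tried in isolation with the sketch below. It is not the patch's code: guest_ram, demo_translate() and the page-reversing mapping are hypothetical stand-ins, and unlike the patch (which returns void and silently stops on a translation error) this version returns an error code so the failure is observable.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u
#define DEMO_PAGES     4u

/* Hypothetical stand-in for guest physical memory. */
static uint8_t guest_ram[DEMO_PAGES * DEMO_PAGE_SIZE];

/* Translation callback: returns 0 on success and fills in the physical
 * address plus the number of bytes for which the translation is valid. */
typedef int translate_fn(uint64_t addr, uint64_t *paddr, uint64_t *plen);

/* Hypothetical IOMMU: bus page n maps to physical page (3 - n), and each
 * translation is valid only up to the end of the current page. */
static int demo_translate(uint64_t addr, uint64_t *paddr, uint64_t *plen)
{
    uint64_t page = addr / DEMO_PAGE_SIZE;
    uint64_t off = addr % DEMO_PAGE_SIZE;

    if (page >= DEMO_PAGES)
        return -1; /* no mapping: treat as an access fault */

    *paddr = (DEMO_PAGES - 1 - page) * DEMO_PAGE_SIZE + off;
    *plen = DEMO_PAGE_SIZE - off;
    return 0;
}

/* Copy len bytes, re-translating whenever the current mapping runs out. */
static int memory_rw(translate_fn *translate, uint64_t addr,
                     uint8_t *buf, uint64_t len, int is_write)
{
    while (len) {
        uint64_t paddr, plen;

        if (translate(addr, &paddr, &plen))
            return -1;

        /* The translation might be valid for a larger region. */
        if (plen > len)
            plen = len;

        if (is_write)
            memcpy(&guest_ram[paddr], buf, plen);
        else
            memcpy(buf, &guest_ram[paddr], plen);

        len -= plen;
        addr += plen;
        buf += plen;
    }
    return 0;
}
```

An 8-byte write at bus address 4092 straddles a page boundary, so it is split into two 4-byte copies that land in different, non-adjacent physical pages. That is the case the per-iteration translation exists to handle.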
[PATCH 4/7] ide: use the PCI memory access interface
Emulated PCI IDE controllers now use the memory access interface. This also allows an emulated IOMMU to translate and check accesses. Map invalidation results in cancelling DMA transfers. Since the guest OS can't properly recover the DMA results in case the mapping is changed, this is a fairly good approximation. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro --- dma-helpers.c | 46 +- dma.h | 21 - hw/ide/core.c | 15 --- hw/ide/internal.h | 39 +++ hw/ide/pci.c |7 +++ 5 files changed, 115 insertions(+), 13 deletions(-) diff --git a/dma-helpers.c b/dma-helpers.c index d4fc077..9c3a21a 100644 --- a/dma-helpers.c +++ b/dma-helpers.c @@ -10,12 +10,36 @@ #include dma.h #include block_int.h -void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint) +static void *qemu_sglist_default_map(void *opaque, + QEMUSGInvalMapFunc *inval_cb, + void *inval_opaque, + target_phys_addr_t addr, + target_phys_addr_t *len, + int is_write) +{ +return cpu_physical_memory_map(addr, len, is_write); +} + +static void qemu_sglist_default_unmap(void *opaque, + void *buffer, + target_phys_addr_t len, + int is_write, + target_phys_addr_t access_len) +{ +cpu_physical_memory_unmap(buffer, len, is_write, access_len); +} + +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, + QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap, void *opaque) { qsg-sg = qemu_malloc(alloc_hint * sizeof(ScatterGatherEntry)); qsg-nsg = 0; qsg-nalloc = alloc_hint; qsg-size = 0; + +qsg-map = map ? map : qemu_sglist_default_map; +qsg-unmap = unmap ? 
unmap : qemu_sglist_default_unmap; +qsg-opaque = opaque; } void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base, @@ -73,12 +97,23 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs) int i; for (i = 0; i dbs-iov.niov; ++i) { -cpu_physical_memory_unmap(dbs-iov.iov[i].iov_base, - dbs-iov.iov[i].iov_len, !dbs-is_write, - dbs-iov.iov[i].iov_len); +dbs-sg-unmap(dbs-sg-opaque, + dbs-iov.iov[i].iov_base, + dbs-iov.iov[i].iov_len, !dbs-is_write, + dbs-iov.iov[i].iov_len); } } +static void dma_bdrv_cancel(void *opaque) +{ +DMAAIOCB *dbs = opaque; + +bdrv_aio_cancel(dbs-acb); +dma_bdrv_unmap(dbs); +qemu_iovec_destroy(dbs-iov); +qemu_aio_release(dbs); +} + static void dma_bdrv_cb(void *opaque, int ret) { DMAAIOCB *dbs = (DMAAIOCB *)opaque; @@ -100,7 +135,8 @@ static void dma_bdrv_cb(void *opaque, int ret) while (dbs-sg_cur_index dbs-sg-nsg) { cur_addr = dbs-sg-sg[dbs-sg_cur_index].base + dbs-sg_cur_byte; cur_len = dbs-sg-sg[dbs-sg_cur_index].len - dbs-sg_cur_byte; -mem = cpu_physical_memory_map(cur_addr, cur_len, !dbs-is_write); +mem = dbs-sg-map(dbs-sg-opaque, dma_bdrv_cancel, dbs, + cur_addr, cur_len, !dbs-is_write); if (!mem) break; qemu_iovec_add(dbs-iov, mem, cur_len); diff --git a/dma.h b/dma.h index f3bb275..d48f35c 100644 --- a/dma.h +++ b/dma.h @@ -15,6 +15,19 @@ #include hw/hw.h #include block.h +typedef void QEMUSGInvalMapFunc(void *opaque); +typedef void *QEMUSGMapFunc(void *opaque, +QEMUSGInvalMapFunc *inval_cb, +void *inval_opaque, +target_phys_addr_t addr, +target_phys_addr_t *len, +int is_write); +typedef void QEMUSGUnmapFunc(void *opaque, + void *buffer, + target_phys_addr_t len, + int is_write, + target_phys_addr_t access_len); + typedef struct { target_phys_addr_t base; target_phys_addr_t len; @@ -25,9 +38,15 @@ typedef struct { int nsg; int nalloc; target_phys_addr_t size; + +QEMUSGMapFunc *map; +QEMUSGUnmapFunc *unmap; +void *opaque; } QEMUSGList; -void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint); +void qemu_sglist_init(QEMUSGList *qsg, int 
alloc_hint, + QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap, + void *opaque); void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base, target_phys_addr_t len); void qemu_sglist_destroy(QEMUSGList *qsg); diff --git a/hw/ide/core.c b/hw/ide/core.c index
[PATCH 3/7] AMD IOMMU emulation
This introduces emulation for the AMD IOMMU, described in AMD I/O Virtualization Technology (IOMMU) Specification. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro --- Makefile.target |2 + hw/amd_iommu.c | 688 +++ hw/pc.c |2 + hw/pci_ids.h|2 + hw/pci_regs.h |1 + 5 files changed, 695 insertions(+), 0 deletions(-) create mode 100644 hw/amd_iommu.c diff --git a/Makefile.target b/Makefile.target index 70a9c1b..6b80a37 100644 --- a/Makefile.target +++ b/Makefile.target @@ -219,6 +219,8 @@ obj-i386-y += pcspk.o i8254.o obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o obj-i386-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += device-assignment.o +obj-i386-y += amd_iommu.o + # Hardware support obj-ia64-y += ide.o pckbd.o vga.o $(SOUND_HW) dma.o $(AUDIODRV) obj-ia64-y += fdc.o mc146818rtc.o serial.o i8259.o ipf.o diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c new file mode 100644 index 000..2e20888 --- /dev/null +++ b/hw/amd_iommu.c @@ -0,0 +1,688 @@ +/* + * AMD IOMMU emulation + * + * Copyright (c) 2010 Eduard - Gabriel Munteanu + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the Software), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. + */ + +#include pc.h +#include hw.h +#include pci.h +#include qlist.h + +/* Capability registers */ +#define CAPAB_HEADER0x00 +#define CAPAB_REV_TYPE0x02 +#define CAPAB_FLAGS 0x03 +#define CAPAB_BAR_LOW 0x04 +#define CAPAB_BAR_HIGH 0x08 +#define CAPAB_RANGE 0x0C +#define CAPAB_MISC 0x10 + +#define CAPAB_SIZE 0x14 + +/* Capability header data */ +#define CAPAB_FLAG_IOTLBSUP (1 0) +#define CAPAB_FLAG_HTTUNNEL (1 1) +#define CAPAB_FLAG_NPCACHE (1 2) +#define CAPAB_INIT_REV (1 3) +#define CAPAB_INIT_TYPE 3 +#define CAPAB_INIT_REV_TYPE (CAPAB_REV | CAPAB_TYPE) +#define CAPAB_INIT_FLAGS(CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL) +#define CAPAB_INIT_MISC (64 15) | (48 8) +#define CAPAB_BAR_MASK ~((1UL 14) - 1) + +/* MMIO registers */ +#define MMIO_DEVICE_TABLE 0x +#define MMIO_COMMAND_BASE 0x0008 +#define MMIO_EVENT_BASE 0x0010 +#define MMIO_CONTROL0x0018 +#define MMIO_EXCL_BASE 0x0020 +#define MMIO_EXCL_LIMIT 0x0028 +#define MMIO_COMMAND_HEAD 0x2000 +#define MMIO_COMMAND_TAIL 0x2008 +#define MMIO_EVENT_HEAD 0x2010 +#define MMIO_EVENT_TAIL 0x2018 +#define MMIO_STATUS 0x2020 + +#define MMIO_SIZE 0x4000 + +#define MMIO_DEVTAB_SIZE_MASK ((1ULL 12) - 1) +#define MMIO_DEVTAB_BASE_MASK (((1ULL 52) - 1) ~MMIO_DEVTAB_SIZE_MASK) +#define MMIO_DEVTAB_ENTRY_SIZE 32 +#define MMIO_DEVTAB_SIZE_UNIT 4096 + +#define MMIO_CMDBUF_SIZE_BYTE (MMIO_COMMAND_BASE + 7) +#define MMIO_CMDBUF_SIZE_MASK 0x0F +#define MMIO_CMDBUF_BASE_MASK MMIO_DEVTAB_BASE_MASK +#define MMIO_CMDBUF_DEFAULT_SIZE8 +#define MMIO_CMDBUF_HEAD_MASK (((1ULL 19) - 1) ~0x0F) +#define MMIO_CMDBUF_TAIL_MASK MMIO_EVTLOG_HEAD_MASK + +#define MMIO_EVTLOG_SIZE_BYTE (MMIO_EVENT_BASE + 7) +#define MMIO_EVTLOG_SIZE_MASK MMIO_CMDBUF_SIZE_MASK +#define 
MMIO_EVTLOG_BASE_MASK MMIO_CMDBUF_BASE_MASK +#define MMIO_EVTLOG_DEFAULT_SIZEMMIO_CMDBUF_DEFAULT_SIZE +#define MMIO_EVTLOG_HEAD_MASK (((1ULL 19) - 1) ~0x0F) +#define MMIO_EVTLOG_TAIL_MASK MMIO_EVTLOG_HEAD_MASK + +#define MMIO_EXCL_BASE_MASK MMIO_DEVTAB_BASE_MASK +#define MMIO_EXCL_ENABLED_MASK (1ULL 0) +#define MMIO_EXCL_ALLOW_MASK(1ULL 1) +#define MMIO_EXCL_LIMIT_MASKMMIO_DEVTAB_BASE_MASK +#define MMIO_EXCL_LIMIT_LOW 0xFFF + +#define MMIO_CONTROL_IOMMUEN(1ULL 0) +#define MMIO_CONTROL_HTTUNEN(1ULL 1) +#define MMIO_CONTROL_EVENTLOGEN (1ULL 2) +#define MMIO_CONTROL_EVENTINTEN (1ULL 3) +#define MMIO_CONTROL_COMWAITINTEN (1ULL 4) +#define
[PATCH 6/7] eepro100: use the PCI memory access interface
This allows the device to work properly with an emulated IOMMU. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro --- hw/eepro100.c | 78 ++--- 1 files changed, 41 insertions(+), 37 deletions(-) diff --git a/hw/eepro100.c b/hw/eepro100.c index 97afa2c..6e23271 100644 --- a/hw/eepro100.c +++ b/hw/eepro100.c @@ -306,10 +306,10 @@ static const uint16_t eepro100_mdi_mask[] = { }; /* XXX: optimize */ -static void stl_le_phys(target_phys_addr_t addr, uint32_t val) +static void stl_le_phys(EEPRO100State * s, pcibus_t addr, uint32_t val) { val = cpu_to_le32(val); -cpu_physical_memory_write(addr, (const uint8_t *)val, sizeof(val)); +pci_memory_write(s-dev, addr, (const uint8_t *)val, sizeof(val)); } #define POLYNOMIAL 0x04c11db6 @@ -692,12 +692,12 @@ static void dump_statistics(EEPRO100State * s) * values which really matter. * Number of data should check configuration!!! */ -cpu_physical_memory_write(s-statsaddr, - (uint8_t *) s-statistics, s-stats_size); -stl_le_phys(s-statsaddr + 0, s-statistics.tx_good_frames); -stl_le_phys(s-statsaddr + 36, s-statistics.rx_good_frames); -stl_le_phys(s-statsaddr + 48, s-statistics.rx_resource_errors); -stl_le_phys(s-statsaddr + 60, s-statistics.rx_short_frame_errors); +pci_memory_write(s-dev, s-statsaddr, + (uint8_t *) s-statistics, s-stats_size); +stl_le_phys(s, s-statsaddr + 0, s-statistics.tx_good_frames); +stl_le_phys(s, s-statsaddr + 36, s-statistics.rx_good_frames); +stl_le_phys(s, s-statsaddr + 48, s-statistics.rx_resource_errors); +stl_le_phys(s, s-statsaddr + 60, s-statistics.rx_short_frame_errors); #if 0 stw_le_phys(s-statsaddr + 76, s-statistics.xmt_tco_frames); stw_le_phys(s-statsaddr + 78, s-statistics.rcv_tco_frames); @@ -707,7 +707,8 @@ static void dump_statistics(EEPRO100State * s) static void read_cb(EEPRO100State *s) { -cpu_physical_memory_read(s-cb_address, (uint8_t *) s-tx, sizeof(s-tx)); +pci_memory_read(s-dev, +s-cb_address, (uint8_t *) s-tx, sizeof(s-tx)); s-tx.status = le16_to_cpu(s-tx.status); 
s-tx.command = le16_to_cpu(s-tx.command); s-tx.link = le32_to_cpu(s-tx.link); @@ -737,18 +738,18 @@ static void tx_command(EEPRO100State *s) } assert(tcb_bytes = sizeof(buf)); while (size tcb_bytes) { -uint32_t tx_buffer_address = ldl_phys(tbd_address); -uint16_t tx_buffer_size = lduw_phys(tbd_address + 4); +uint32_t tx_buffer_address = pci_ldl(s-dev, tbd_address); +uint16_t tx_buffer_size = pci_lduw(s-dev, tbd_address + 4); #if 0 -uint16_t tx_buffer_el = lduw_phys(tbd_address + 6); +uint16_t tx_buffer_el = pci_lduw(s-dev, tbd_address + 6); #endif tbd_address += 8; TRACE(RXTX, logout (TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n, tx_buffer_address, tx_buffer_size)); tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size); -cpu_physical_memory_read(tx_buffer_address, buf[size], - tx_buffer_size); +pci_memory_read(s-dev, +tx_buffer_address, buf[size], tx_buffer_size); size += tx_buffer_size; } if (tbd_array == 0x) { @@ -759,16 +760,16 @@ static void tx_command(EEPRO100State *s) if (s-has_extended_tcb_support !(s-configuration[6] BIT(4))) { /* Extended Flexible TCB. 
*/ for (; tbd_count 2; tbd_count++) { -uint32_t tx_buffer_address = ldl_phys(tbd_address); -uint16_t tx_buffer_size = lduw_phys(tbd_address + 4); -uint16_t tx_buffer_el = lduw_phys(tbd_address + 6); +uint32_t tx_buffer_address = pci_ldl(s-dev, tbd_address); +uint16_t tx_buffer_size = pci_lduw(s-dev, tbd_address + 4); +uint16_t tx_buffer_el = pci_lduw(s-dev, tbd_address + 6); tbd_address += 8; TRACE(RXTX, logout (TBD (extended flexible mode): buffer address 0x%08x, size 0x%04x\n, tx_buffer_address, tx_buffer_size)); tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size); -cpu_physical_memory_read(tx_buffer_address, buf[size], - tx_buffer_size); +pci_memory_read(s-dev, +tx_buffer_address, buf[size], tx_buffer_size); size += tx_buffer_size; if (tx_buffer_el 1) { break; @@ -777,16 +778,16 @@ static void tx_command(EEPRO100State *s) } tbd_address = tbd_array; for (; tbd_count s-tx.tbd_count; tbd_count++) { -uint32_t tx_buffer_address = ldl_phys(tbd_address); -uint16_t tx_buffer_size =
[PATCH 7/7] ac97: use the PCI memory access interface
This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 hw/ac97.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ac97.c b/hw/ac97.c
index 4319bc8..9ee4894 100644
--- a/hw/ac97.c
+++ b/hw/ac97.c
@@ -223,7 +223,7 @@ static void fetch_bd (AC97LinkState *s, AC97BusMasterRegs *r)
 {
     uint8_t b[8];
 
-    cpu_physical_memory_read (r->bdbar + r->civ * 8, b, 8);
+    pci_memory_read (&s->dev, r->bdbar + r->civ * 8, b, 8);
     r->bd_valid = 1;
     r->bd.addr = le32_to_cpu (*(uint32_t *) &b[0]) & ~3;
     r->bd.ctl_len = le32_to_cpu (*(uint32_t *) &b[4]);
@@ -972,7 +972,7 @@ static int write_audio (AC97LinkState *s, AC97BusMasterRegs *r,
     while (temp) {
         int copied;
         to_copy = audio_MIN (temp, sizeof (tmpbuf));
-        cpu_physical_memory_read (addr, tmpbuf, to_copy);
+        pci_memory_read (&s->dev, addr, tmpbuf, to_copy);
         copied = AUD_write (s->voice_po, tmpbuf, to_copy);
         dolog ("write_audio max=%x to_copy=%x copied=%x\n",
                max, to_copy, copied);
@@ -1056,7 +1056,7 @@ static int read_audio (AC97LinkState *s, AC97BusMasterRegs *r,
             *stop = 1;
             break;
         }
-        cpu_physical_memory_write (addr, tmpbuf, acquired);
+        pci_memory_write (&s->dev, addr, tmpbuf, acquired);
         temp -= acquired;
         addr += acquired;
         nread += acquired;
-- 
1.7.1
[PATCH 1/2] Split region allocation code from pci_bios_init_device()
pci_bios_alloc() can be used to allocate space in the PCI region for other purposes. This is needed by the AMD IOMMU support code.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 src/pciinit.c |   17 +++++++++++++----
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/pciinit.c b/src/pciinit.c
index 0556ee2..bfc669f 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -75,6 +75,16 @@ static void pci_bios_init_bridges(u16 bdf)
     }
 }
 
+static inline u32 pci_bios_alloc(u32 *region, u32 size)
+{
+    u32 ret;
+
+    ret = ALIGN(*region, size);
+    *region = ret + size;
+
+    return ret;
+}
+
 static void pci_bios_init_device(u16 bdf)
 {
     int class;
@@ -146,14 +156,13 @@ static void pci_bios_init_device(u16 bdf)
                 pci_config_writel(bdf, ofs, old);
 
                 if (val != 0) {
-                    u32 size = (~(val & mask)) + 1;
+                    u32 base, size = (~(val & mask)) + 1;
                     if (val & PCI_BASE_ADDRESS_SPACE_IO)
                         paddr = &pci_bios_io_addr;
                     else
                         paddr = &pci_bios_mem_addr;
-                    *paddr = ALIGN(*paddr, size);
-                    pci_set_io_region_addr(bdf, i, *paddr);
-                    *paddr += size;
+                    base = pci_bios_alloc(paddr, size);
+                    pci_set_io_region_addr(bdf, i, base);
                 }
             }
             break;
-- 
1.7.1
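The allocator split out by this patch is a simple aligned bump allocator, and its behaviour can be checked standalone with the sketch below. The u32 typedef and the ALIGN() definition are assumptions that mirror the usual power-of-two round-up idiom; the patch uses SeaBIOS's own definitions, which are not shown in this mail.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

/* Assumed definition: round x up to a multiple of a, where a is a power
 * of two (true for PCI BAR sizes). */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Carve a naturally aligned block of `size` bytes out of the region and
 * advance the region cursor past it. */
static inline u32 pci_bios_alloc(u32 *region, u32 size)
{
    u32 ret;

    ret = ALIGN(*region, size);
    *region = ret + size;

    return ret;
}
```

Starting from a cursor of 0xf0000100, a request for 0x1000 bytes returns 0xf0001000 (the padding up to the alignment boundary is skipped) and leaves the cursor at 0xf0002000, so a following 0x100-byte request returns 0xf0002000. There is no bounds checking, which matches the patch as posted.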
[PATCH 2/2] AMD IOMMU support
This initializes the AMD IOMMU and creates ACPI tables for it. Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro --- Makefile |2 +- src/acpi.c | 79 src/iommu.c| 64 + src/iommu.h| 12 src/pci.h |4 +++ src/pci_ids.h |1 + src/pci_regs.h |1 + src/pciinit.c | 11 8 files changed, 173 insertions(+), 1 deletions(-) create mode 100644 src/iommu.c create mode 100644 src/iommu.h diff --git a/Makefile b/Makefile index fe0c1ce..98f253d 100644 --- a/Makefile +++ b/Makefile @@ -14,7 +14,7 @@ OUT=out/ SRCBOTH=misc.c pmm.c stacks.c output.c util.c block.c floppy.c ata.c mouse.c \ kbd.c pci.c serial.c clock.c pic.c cdrom.c ps2port.c smp.c resume.c \ pnpbios.c pirtable.c vgahooks.c ramdisk.c pcibios.c blockcmd.c \ -usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c +usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c iommu.c SRC16=$(SRCBOTH) system.c disk.c apm.c font.c SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \ acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \ diff --git a/src/acpi.c b/src/acpi.c index 0559443..7ea9c55 100644 --- a/src/acpi.c +++ b/src/acpi.c @@ -6,6 +6,7 @@ // This file may be distributed under the terms of the GNU LGPLv3 license. #include acpi.h // struct rsdp_descriptor +#include iommu.h #include util.h // memcpy #include pci.h // pci_find_device #include biosvar.h // GET_EBDA @@ -268,6 +269,36 @@ struct srat_memory_affinity u32reserved3[2]; } PACKED; +/* + * IVRS (I/O Virtualization Reporting Structure) table. + * + * Describes the AMD IOMMU, as per: + * AMD I/O Virtualization Technology (IOMMU) Specification, rev 1.26 + */ + +struct ivrs_ivhd +{ +u8type; +u8flags; +u16 length; +u16 devid; +u16 capab_off; +u32 iommu_base_low; +u32 iommu_base_high; +u16 pci_seg_group; +u16 iommu_info; +u32 reserved; +u8entry[0]; +} PACKED; + +struct ivrs_table +{ +ACPI_TABLE_HEADER_DEF/* ACPI common table header. 
*/ +u32iv_info; +u32reserved[2]; +struct ivrs_ivhd ivhd; +} PACKED; + #include acpi-dsdt.hex static inline u16 cpu_to_le16(u16 x) @@ -599,6 +630,53 @@ build_srat(void) return srat; } +#define IVRS_SIGNATURE 0x53525649 // IVRS +#define IVRS_MAX_DEVS 32 +static void * +build_ivrs(void) +{ +int iommu_bdf, bdf, max, i; +struct ivrs_table *ivrs; +struct ivrs_ivhd *ivhd; + +iommu_bdf = pci_find_class(PCI_CLASS_SYSTEM_IOMMU); +if (iommu_bdf 0) +return NULL; + +ivrs = malloc_high(sizeof(struct ivrs_table) + 4 * IVRS_MAX_DEVS); +ivrs-iv_info = iommu_get_misc() ~0x000F; + +ivhd = ivrs-ivhd; +ivhd-type = 0x10; +ivhd-flags = 0; +ivhd-length= sizeof(struct ivrs_ivhd); +ivhd-devid = iommu_get_bdf(); +ivhd-capab_off = iommu_get_cap_offset(); +ivhd-iommu_base_low= iommu_get_base(); +ivhd-iommu_base_high = 0; +ivhd-pci_seg_group = 0; +ivhd-iommu_info= 0; +ivhd-reserved = 0; + +i = 0; +foreachpci(bdf, max) { +if (bdf == ivhd-devid) +continue; +ivhd-entry[4 * i + 0] = 2; +ivhd-entry[4 * i + 1] = bdf 0xFF; +ivhd-entry[4 * i + 2] = (bdf 8) 0xFF; +ivhd-entry[4 * i + 3] = ~(1 3); +ivhd-length += 4; +if (++i = IVRS_MAX_DEVS) +break; +} + +build_header((void *) ivrs, IVRS_SIGNATURE, + sizeof(struct ivrs_table) + 4 * i, 1); + +return ivrs; +} + struct rsdp_descriptor *RsdpAddr; #define MAX_ACPI_TABLES 20 @@ -639,6 +717,7 @@ acpi_bios_init(void) ACPI_INIT_TABLE(build_madt()); ACPI_INIT_TABLE(build_hpet()); ACPI_INIT_TABLE(build_srat()); +ACPI_INIT_TABLE(build_ivrs()); u16 i, external_tables = qemu_cfg_acpi_additional_tables(); diff --git a/src/iommu.c b/src/iommu.c new file mode 100644 index 000..97af24a --- /dev/null +++ b/src/iommu.c @@ -0,0 +1,64 @@ +// AMD IOMMU initialization code. +// +// Copyright (C) 2010 Eduard - Gabriel Munteanu eduard.munte...@linux360.ro +// +// This file may be distributed under the terms of the GNU LGPLv3 license. 
+ +#include iommu.h +#include pci.h +#include types.h + +#define IOMMU_CAP_BAR_LOW 0x04 +#define IOMMU_CAP_BAR_HIGH 0x08 +#define IOMMU_CAP_RANGE 0x0C +#define IOMMU_CAP_MISC 0x10 + +static int iommu_bdf = -1; +static u8 iommu_cap_offset; +static u32 iommu_base; + +void iommu_init(int bdf, u32 base) +{ +u8 ptr, cap, type; + +/* Only one IOMMU is supported. */ +if (iommu_bdf = 0) +return; + +foreachcap(bdf, ptr, cap) { +type = pci_config_readb(bdf, cap); +if (type ==
Re: [Qemu-devel] [PATCH 7/7] ac97: use the PCI memory access interface
On Sun, 15 Aug 2010, Eduard - Gabriel Munteanu wrote:

> This allows the device to work properly with an emulated IOMMU.

Fine with me.

[..snip..]

-- 
mailto:av1...@comtv.ru
Relationship between libkvm and qemu-kvm.c
Hello, everyone,

I am a little confused by the qemu-kvm project: I found some similar code in both libkvm and qemu-kvm.c. Is libkvm really used by qemu? What is the relationship between them?

Best regards,
-- 
Hao Shen
[RFC PATCH v3 0/4] Real mode interrupt injection
This patch introduces real mode interrupt injection for VMX. It currently invokes the x86 emulator to emulate interrupts instead of manually setting VMX controls.

Needless to say, this is not meant for merging in its current state. The emulator still needs some more work to get this completely operational.

Mohammed Gamal (4):
  x86 emulator: Expose emulate_int_real()
  x86: Separate emulation context initialization in a separate function
  x86: Add kvm_inject_realmode_interrupt() wrapper
  VMX: Emulated real mode interrupt injection

 arch/x86/include/asm/kvm_emulate.h |    3 +-
 arch/x86/kvm/vmx.c                 |   65 +++
 arch/x86/kvm/x86.c                 |   75 ++--
 arch/x86/kvm/x86.h                 |    1 +
 4 files changed, 55 insertions(+), 89 deletions(-)

---
Changes since v2:
- Refactored emulation context initialization code
- Commit eip value from the decode cache to the emulation context in x86.c
  rather than the emulator
- Add kvm_* prefix to inject_realmode_interrupt() global symbol for
  consistency
[RFC PATCH v3 1/4] x86 emulator: Expose emulate_int_real()
Signed-off-by: Mohammed Gamal <m.gamal...@gmail.com>
---
 arch/x86/include/asm/kvm_emulate.h |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h b/arch/x86/include/asm/kvm_emulate.h
index f22e5da..6a7cce0 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -255,5 +255,6 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt);
 int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
 			 u16 tss_selector, int reason,
 			 bool has_error_code, u32 error_code);
-
+int emulate_int_real(struct x86_emulate_ctxt *ctxt,
+		     struct x86_emulate_ops *ops, int irq);
 #endif /* _ASM_X86_KVM_X86_EMULATE_H */
-- 
1.7.0.4
[RFC PATCH v3 2/4] x86: Separate emulation context initialization in a separate function
The code for initializing the emulation context is duplicated at two
locations (emulate_instruction() and kvm_task_switch()). Separate it in
a separate function and call it from there.

Signed-off-by: Mohammed Gamal <m.gamal...@gmail.com>
---
 arch/x86/kvm/x86.c |   54 +++++++++++++++++++++++++-----------------------------
 1 files changed, 25 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1722d37..f24e594 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3936,6 +3936,28 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
 		kvm_queue_exception(vcpu, ctxt->exception);
 }
 
+static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
+{
+	struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
+	int cs_db, cs_l;
+
+	cache_all_regs(vcpu);
+
+	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+
+	vcpu->arch.emulate_ctxt.vcpu = vcpu;
+	vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
+	vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
+	vcpu->arch.emulate_ctxt.mode =
+		(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
+		(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
+		? X86EMUL_MODE_VM86 : cs_l
+		? X86EMUL_MODE_PROT64 : cs_db
+		? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
+	memset(c, 0, sizeof(struct decode_cache));
+	memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
+}
+
 static int handle_emulation_failure(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.insn_emulation_fail;
@@ -3992,20 +4014,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
 	cache_all_regs(vcpu);
 
 	if (!(emulation_type & EMULTYPE_NO_DECODE)) {
-		int cs_db, cs_l;
-		kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
-
-		vcpu->arch.emulate_ctxt.vcpu = vcpu;
-		vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
-		vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
-		vcpu->arch.emulate_ctxt.mode =
-			(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
-			(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
-			? X86EMUL_MODE_VM86 : cs_l
-			? X86EMUL_MODE_PROT64 : cs_db
-			? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
-		memset(c, 0, sizeof(struct decode_cache));
-		memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
+		init_emulate_ctxt(vcpu);
 		vcpu->arch.emulate_ctxt.interruptibility = 0;
 		vcpu->arch.emulate_ctxt.exception = -1;
 		vcpu->arch.emulate_ctxt.perm_ok = false;
@@ -5064,22 +5073,9 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason,
 		    bool has_error_code, u32 error_code)
 {
 	struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
-	int cs_db, cs_l, ret;
-	cache_all_regs(vcpu);
-
-	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+	int ret;
 
-	vcpu->arch.emulate_ctxt.vcpu = vcpu;
-	vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
-	vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
-	vcpu->arch.emulate_ctxt.mode =
-		(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
-		(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
-		? X86EMUL_MODE_VM86 : cs_l
-		? X86EMUL_MODE_PROT64 : cs_db
-		? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
-	memset(c, 0, sizeof(struct decode_cache));
-	memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
+	init_emulate_ctxt(vcpu);
 
 	ret = emulator_task_switch(&vcpu->arch.emulate_ctxt,
 				   tss_selector, reason, has_error_code,
-- 
1.7.0.4
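The chained conditional that picks the emulation mode is compact but easy to misread. The sketch below spells out the same precedence as a plain function. The enum and the function name pick_mode are made up for this illustration and are not kernel identifiers; the four inputs correspond to is_protmode(), the EFLAGS.VM bit, and the CS.L and CS.D/B bits used in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative mode names; the kernel uses X86EMUL_MODE_* constants. */
enum emul_mode { MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32, MODE_PROT64 };

/* Same decision order as the nested ?: expression in init_emulate_ctxt(). */
static enum emul_mode pick_mode(bool protmode, bool eflags_vm,
                                bool cs_l, bool cs_db)
{
    if (!protmode)
        return MODE_REAL;     /* CR0.PE clear wins over everything else */
    if (eflags_vm)
        return MODE_VM86;     /* virtual-8086 task inside protected mode */
    if (cs_l)
        return MODE_PROT64;   /* long-mode code segment */
    if (cs_db)
        return MODE_PROT32;   /* 32-bit default operand size */
    return MODE_PROT16;
}

/* A few spot checks that document the precedence. */
static void pick_mode_examples(void)
{
    assert(pick_mode(false, true, true, true) == MODE_REAL);
    assert(pick_mode(true, true, true, true) == MODE_VM86);
    assert(pick_mode(true, false, true, false) == MODE_PROT64);
    assert(pick_mode(true, false, false, true) == MODE_PROT32);
    assert(pick_mode(true, false, false, false) == MODE_PROT16);
}
```

Note that there is no separate outcome for "big real mode" here, which matches the remark elsewhere in this thread that the emulator only distinguishes real and vm86 among the 16-bit modes.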
[RFC PATCH v3 3/4] x86: Add kvm_inject_realmode_interrupt() wrapper
This adds a wrapper function kvm_inject_realmode_interrupt() around the
emulator function emulate_int_real() to allow real mode interrupt injection.

Signed-off-by: Mohammed Gamal <m.gamal...@gmail.com>
---
 arch/x86/kvm/x86.c |   21 +++++++++++++++++++++
 arch/x86/kvm/x86.h |    1 +
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f24e594..59b708c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3958,6 +3958,27 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
 	memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
 }
 
+int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq)
+{
+	struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
+	int ret;
+
+	init_emulate_ctxt(vcpu);
+
+	ret = emulate_int_real(&vcpu->arch.emulate_ctxt, &emulate_ops, irq);
+
+	if (ret != X86EMUL_CONTINUE)
+		return EMULATE_FAIL;
+
+	vcpu->arch.emulate_ctxt.eip = c->eip;
+	memcpy(vcpu->arch.regs, c->regs, sizeof c->regs);
+	kvm_rip_write(vcpu, vcpu->arch.emulate_ctxt.eip);
+	kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
+
+	return EMULATE_DONE;
+}
+EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
+
 static int handle_emulation_failure(struct kvm_vcpu *vcpu)
 {
 	++vcpu->stat.insn_emulation_fail;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index b7a4047..8b83da5 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -67,5 +67,6 @@ static inline int is_paging(struct kvm_vcpu *vcpu)
 
 void kvm_before_handle_nmi(struct kvm_vcpu *vcpu);
 void kvm_after_handle_nmi(struct kvm_vcpu *vcpu);
+int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq);
 
 #endif
-- 
1.7.0.4
[RFC PATCH v3 4/4] VMX: Emulated real mode interrupt injection
Signed-off-by: Mohammed Gamal m.gamal...@gmail.com --- arch/x86/kvm/vmx.c | 65 --- 1 files changed, 6 insertions(+), 59 deletions(-) diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index 652d317..0f9e3e4 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -155,11 +155,6 @@ struct vcpu_vmx { u32 limit; u32 ar; } tr, es, ds, fs, gs; - struct { - bool pending; - u8 vector; - unsigned rip; - } irq; } rmode; int vpid; bool emulation_required; @@ -1048,16 +1043,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr, } if (vmx-rmode.vm86_active) { - vmx-rmode.irq.pending = true; - vmx-rmode.irq.vector = nr; - vmx-rmode.irq.rip = kvm_rip_read(vcpu); - if (kvm_exception_is_soft(nr)) - vmx-rmode.irq.rip += - vmx-vcpu.arch.event_exit_inst_len; - intr_info |= INTR_TYPE_SOFT_INTR; - vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info); - vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1); - kvm_rip_write(vcpu, vmx-rmode.irq.rip - 1); + if (kvm_inject_realmode_interrupt(vcpu, nr) != EMULATE_DONE) + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); return; } @@ -2838,16 +2825,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu) ++vcpu-stat.irq_injections; if (vmx-rmode.vm86_active) { - vmx-rmode.irq.pending = true; - vmx-rmode.irq.vector = irq; - vmx-rmode.irq.rip = kvm_rip_read(vcpu); - if (vcpu-arch.interrupt.soft) - vmx-rmode.irq.rip += - vmx-vcpu.arch.event_exit_inst_len; - vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, -irq | INTR_TYPE_SOFT_INTR | INTR_INFO_VALID_MASK); - vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1); - kvm_rip_write(vcpu, vmx-rmode.irq.rip - 1); + if (kvm_inject_realmode_interrupt(vcpu, irq) != EMULATE_DONE) + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); return; } intr = irq | INTR_INFO_VALID_MASK; @@ -2879,14 +2858,8 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu) ++vcpu-stat.nmi_injections; if (vmx-rmode.vm86_active) { - vmx-rmode.irq.pending = true; - vmx-rmode.irq.vector = NMI_VECTOR; - vmx-rmode.irq.rip = kvm_rip_read(vcpu); - 
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, -NMI_VECTOR | INTR_TYPE_SOFT_INTR | -INTR_INFO_VALID_MASK); - vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1); - kvm_rip_write(vcpu, vmx-rmode.irq.rip - 1); + if (kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR) != EMULATE_DONE) + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu); return; } vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, @@ -3848,29 +3821,6 @@ static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx) ktime_to_ns(ktime_sub(ktime_get(), vmx-entry_time)); } -/* - * Failure to inject an interrupt should give us the information - * in IDT_VECTORING_INFO_FIELD. However, if the failure occurs - * when fetching the interrupt redirection bitmap in the real-mode - * tss, this doesn't happen. So we do it ourselves. - */ -static void fixup_rmode_irq(struct vcpu_vmx *vmx, u32 *idt_vectoring_info) -{ - vmx-rmode.irq.pending = 0; - if (kvm_rip_read(vmx-vcpu) + 1 != vmx-rmode.irq.rip) - return; - kvm_rip_write(vmx-vcpu, vmx-rmode.irq.rip); - if (*idt_vectoring_info VECTORING_INFO_VALID_MASK) { - *idt_vectoring_info = ~VECTORING_INFO_TYPE_MASK; - *idt_vectoring_info |= INTR_TYPE_EXT_INTR; - return; - } - *idt_vectoring_info = - VECTORING_INFO_VALID_MASK - | INTR_TYPE_EXT_INTR - | vmx-rmode.irq.vector; -} - static void __vmx_complete_interrupts(struct vcpu_vmx *vmx, u32 idt_vectoring_info, int instr_len_field, @@ -3880,9 +3830,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx, int type; bool idtv_info_valid; - if (vmx-rmode.irq.pending) - fixup_rmode_irq(vmx, idt_vectoring_info); - idtv_info_valid = idt_vectoring_info VECTORING_INFO_VALID_MASK; vmx-vcpu.arch.nmi_injected = false; -- 1.7.0.4 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to
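For context on what "emulating" the injection means: architecturally, delivering interrupt n in real mode pushes FLAGS, CS and IP on the stack, clears IF and TF, and loads the new CS:IP from the 4-byte vector at n * 4 (assuming the interrupt table sits at linear address 0). The toy model below sketches just those steps. It is not the emulator's code; struct rm_cpu, guest_mem and inject_real_mode_int are hypothetical names for this example, and faults, stack limit checks and the AC flag are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_TF 0x0100u
#define FLAG_IF 0x0200u

/* Minimal real-mode register state for this illustration. */
struct rm_cpu {
    uint16_t ip, cs, sp, ss, flags;
};

static uint8_t guest_mem[1u << 20];   /* 1 MiB of toy guest memory */

/* Push a 16-bit value at SS:SP, wrapping linear addresses at 1 MiB. */
static void push16(struct rm_cpu *c, uint16_t v)
{
    uint32_t lin;

    c->sp = (uint16_t)(c->sp - 2);
    lin = (((uint32_t)c->ss << 4) + c->sp) & 0xFFFFFu;
    guest_mem[lin] = (uint8_t)(v & 0xFF);
    guest_mem[(lin + 1) & 0xFFFFFu] = (uint8_t)(v >> 8);
}

/* Deliver interrupt `vector` the way a real-mode INT would. */
static void inject_real_mode_int(struct rm_cpu *c, uint8_t vector)
{
    uint32_t ivt = (uint32_t)vector * 4;   /* IVT entry: offset, then segment */

    assert(c != NULL);
    push16(c, c->flags);
    c->flags = (uint16_t)(c->flags & ~(FLAG_IF | FLAG_TF));
    push16(c, c->cs);
    push16(c, c->ip);
    c->ip = (uint16_t)(guest_mem[ivt] | (guest_mem[ivt + 1] << 8));
    c->cs = (uint16_t)(guest_mem[ivt + 2] | (guest_mem[ivt + 3] << 8));
}
```

Because every step is an ordinary register or memory update, a failure partway through can be reported to the caller, which is the property the series uses when it requests a triple fault on anything other than EMULATE_DONE.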
Re: [RFC PATCH v3 0/4] Real mode interrupt injection
On Mon, Aug 16, 2010 at 12:46 AM, Mohammed Gamal <m.gamal...@gmail.com> wrote:
> This patch introduces real mode interrupt injection for VMX.
> It currently invokes the x86 emulator to emulate interrupts
> instead of manually setting VMX controls.
>
> Needless to say, this is not meant for merging in its current state.
> The emulator still needs some more work to get this completely operational.
>
> Mohammed Gamal (4):
>   x86 emulator: Expose emulate_int_real()
>   x86: Separate emulation context initialization in a separate function
>   x86: Add kvm_inject_realmode_interrupt() wrapper
>   VMX: Emulated real mode interrupt injection
>
>  arch/x86/include/asm/kvm_emulate.h |    3 +-
>  arch/x86/kvm/vmx.c                 |   65 +++
>  arch/x86/kvm/x86.c                 |   75 ++--
>  arch/x86/kvm/x86.h                 |    1 +
>  4 files changed, 55 insertions(+), 89 deletions(-)
>
> ---
> Changes since v2:
> - Refactored emulation context initialization code
> - Commit eip value from the decode cache to the emulation context in x86.c
>   rather than the emulator
> - Add kvm_* prefix to inject_realmode_interrupt() global symbol for
>   consistency

Here is a full trace of a MINIX guest since bootup. Looks like we get stuck somewhere in the BIOS.

https://docs.google.com/leaf?id=0B9UodZT1IuENMzJhNWQxM2YtYzE3YS00YWY4LTk2YTgtZWY3ODNhMWUxMDkxsort=namelayout=listnum=50
Re: KSM with Debian GNU/Linux
Hi, all!

On Thursday, 12 August 2010 22:05:34 -0300, Daniel Bareiro wrote:

> Keeping the kernel I had compiled and installing the qemu-kvm package
> in Backports, now KSM is working:
>
> # cat /sys/kernel/mm/ksm/pages_sharing
> 181406

Looking at the values obtained while running 15 virtual machines totaling 10.7 GB on a 4 GB VMHost, I get the following, which is a very interesting memory saving:

# for ii in /sys/kernel/mm/ksm/* ; do echo -n "$ii: " ; cat $ii ; done
/sys/kernel/mm/ksm/full_scans: 4114
/sys/kernel/mm/ksm/max_kernel_pages: 253500
/sys/kernel/mm/ksm/pages_shared: 67064
/sys/kernel/mm/ksm/pages_sharing: 510990
/sys/kernel/mm/ksm/pages_to_scan: 100
/sys/kernel/mm/ksm/pages_unshared: 448079
/sys/kernel/mm/ksm/pages_volatile: 13595
/sys/kernel/mm/ksm/run: 1
/sys/kernel/mm/ksm/sleep_millisecs: 20

# free
             total       used       free     shared    buffers     cached
Mem:       4056468    2578728    1477740          0       3736      62156
-/+ buffers/cache:    2512836    1543632
Swap:       497848      25972     471876

Any recommendations on tuning KSM?

I am not very clear about the difference between pages_shared and pages_sharing. Could somebody clarify it?

Thanks for your reply.

Regards,
Daniel
-- 
Fingerprint: BFB3 08D6 B4D1 31B2 72B9 29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Lenny - Linux user #188.598
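On the pages_shared versus pages_sharing question, the kernel's KSM documentation describes pages_shared as the number of shared pages in use and pages_sharing as the number of additional sites mapping them, that is, roughly how many pages are being saved. The sketch below turns the counters into a byte figure under that reading. The function names are invented for this example, and the 4 KiB page size is an assumption for an x86 host, not something read from sysfs.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate memory saved by KSM: every "sharing" site is a page that
 * no longer needs a private copy. page_size is supplied by the caller.
 */
static uint64_t ksm_saved_bytes(uint64_t pages_sharing, uint64_t page_size)
{
    assert(page_size != 0);
    return pages_sharing * page_size;
}

/*
 * Average number of extra mappings per shared page, scaled by 100 to
 * stay in integer arithmetic. Returns 0 when nothing is shared yet.
 */
static uint64_t ksm_sharing_ratio_x100(uint64_t pages_sharing,
                                       uint64_t pages_shared)
{
    return pages_shared ? pages_sharing * 100 / pages_shared : 0;
}
```

With the counters posted above and a 4096-byte page, ksm_saved_bytes(510990, 4096) is 2093015040 bytes, about 1996 MiB, and ksm_sharing_ratio_x100(510990, 67064) is 761, i.e. each shared page stands in for roughly 7.6 other copies.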
RE: [qemu-kvm] build fail on i386 RHEL5u4
Avi Kivity wrote:
> On 08/11/2010 04:49 AM, Hao, Xudong wrote:
>> Hi,
>> Recently I built qemu-kvm on 32-bit RHEL5u4/RHEL5u5, and the build
>> fails in function vhost_dev_sync_region. A RHEL5u1 system builds
>> fine. Did anyone meet a similar issue?
>>
>> qemu-kvm commit: 59d71ddb432db04b57ee2658ce50a3e35d7db97e
>>
>> build error:
>> ...
>>   CC    x86_64-softmmu/i8254.o
>>   CC    x86_64-softmmu/i8254-kvm.o
>>   CC    x86_64-softmmu/device-assignment.o
>>   LINK  x86_64-softmmu/qemu-system-x86_64
>> vhost.o: In function `vhost_dev_sync_region':
>> /home/source/qemu-kvm/hw/vhost.c:47: undefined reference to `__sync_fetch_and_and_4'
>> collect2: ld returned 1 exit status
>> make[1]: *** [qemu-system-x86_64] Error 1
>> make: *** [subdir-x86_64-softmmu] Error 2
>
> Appears to be a gcc bug. I opened
> https://bugzilla.redhat.com/show_bug.cgi?id=624279 to track this.
>
> Meanwhile, installing the gcc44 package and building with it
> (./configure --cc=gcc44) appears to work.

Avi,

gcc44 works for me. I saw that Jakub closed the bug, noting that only i486 and later support this, but RHEL5 uses -march=i386. So is there an ongoing fix for qemu-kvm?

Thanks,
Xudong
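Some background on the missing symbol: __sync_fetch_and_and is a GCC atomic builtin. As far as I understand it, when the selected target has no suitable atomic instruction (plain i386 lacks cmpxchg, which arrived with the i486), the compiler emits a call to an out-of-line helper such as __sync_fetch_and_and_4, and the link fails if no library provides it. The sketch below shows the fetch-and-clear pattern a dirty-log scan typically relies on. It assumes a GCC-compatible compiler, and drain_dirty_bits is a hypothetical helper, not the code in hw/vhost.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Atomically take ownership of all dirty bits in *word (leaving it zero)
 * and record the index of each set bit in bits[]. Returns the number of
 * indices written. bits[] must have room for 32 entries.
 */
static int drain_dirty_bits(uint32_t *word, unsigned bits[32])
{
    uint32_t log;
    int n = 0;

    assert(word != NULL && bits != NULL);

    /* Fetch the old value and AND the word with 0 in one atomic step. */
    log = __sync_fetch_and_and(word, 0);

    while (log) {
        bits[n++] = (unsigned)__builtin_ctz(log);  /* lowest set bit */
        log &= log - 1;                            /* clear that bit */
    }
    return n;
}
```

A writer that sets a bit concurrently either lands before the atomic operation, and is reported in this pass, or after it, and is reported in the next one; no update is lost. Building with a target of i486 or later, or with a newer compiler as suggested above, lets the builtin expand inline.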
Re: [PATCH v6 3/3] KVM: MMU: prefetch ptes when intercepted guest #PF
Hi Marcelo,

Thanks for your review and sorry for the delayed reply.

Marcelo Tosatti wrote:

>> +static struct kvm_memory_slot *
>> +pte_prefetch_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn, bool no_dirty_log)
>> +{
>> +	struct kvm_memory_slot *slot;
>> +
>> +	slot = gfn_to_memslot(vcpu->kvm, gfn);
>> +	if (!slot || slot->flags & KVM_MEMSLOT_INVALID ||
>> +	      (no_dirty_log && slot->dirty_bitmap))
>> +		slot = NULL;
>
> Why is this no_dirty_log optimization worthwhile?

We disable prefetch for writable pages because 'pte prefetch' would hurt the slot's dirty page tracking: it sets the dirty_bitmap bit even though the corresponding page is not really accessed.

>> +
>> +	return slot;
>> +}
>> +
>> +static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
>> +				     bool no_dirty_log)
>> +{
>> +	struct kvm_memory_slot *slot;
>> +	unsigned long hva;
>> +
>> +	slot = pte_prefetch_gfn_to_memslot(vcpu, gfn, no_dirty_log);
>> +	if (!slot) {
>> +		get_page(bad_page);
>> +		return page_to_pfn(bad_page);
>> +	}
>> +
>> +	hva = gfn_to_hva_memslot(slot, gfn);
>> +
>> +	return hva_to_pfn_atomic(vcpu->kvm, hva);
>> +}
>> +
>> +static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
>> +				    struct kvm_mmu_page *sp,
>> +				    u64 *start, u64 *end)
>> +{
>> +	struct page *pages[PTE_PREFETCH_NUM];
>> +	struct kvm_memory_slot *slot;
>> +	unsigned hva, access = sp->role.access;
>> +	int i, ret, npages = end - start;
>> +	gfn_t gfn;
>> +
>> +	gfn = kvm_mmu_page_get_gfn(sp, start - sp->spt);
>> +	slot = pte_prefetch_gfn_to_memslot(vcpu, gfn, access & ACC_WRITE_MASK);
>> +	if (!slot || slot->npages - (gfn - slot->base_gfn) != npages)
>> +		return -1;
>> +
>> +	hva = gfn_to_hva_memslot(slot, gfn);
>> +	ret = __get_user_pages_fast(hva, npages, 1, pages);
>> +	if (ret <= 0)
>> +		return -1;
>
> Better do one at a time with hva_to_pfn_atomic. Or, if you measure that
> its worthwhile, do on a separate patch (using a helper as discussed
> previously).

Since 'prefetch' should be disabled for writable pages, I did not put these operations into a common function defined in kvm_main.c. Maybe it would be better to do them in a wrapper function named pte_prefetch_gfn_to_pages()?

>> @@ -302,14 +303,87 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
>>  static bool FNAME(gpte_changed)(struct kvm_vcpu *vcpu,
>>  				struct guest_walker *gw, int level)
>>  {
>> -	int r;
>>  	pt_element_t curr_pte;
>> -
>> -	r = kvm_read_guest_atomic(vcpu->kvm, gw->pte_gpa[level - 1],
>> +	gpa_t base_gpa, pte_gpa = gw->pte_gpa[level - 1];
>> +	u64 mask;
>> +	int r, index;
>> +
>> +	if (level == PT_PAGE_TABLE_LEVEL) {
>> +		mask = PTE_PREFETCH_NUM * sizeof(pt_element_t) - 1;
>> +		base_gpa = pte_gpa & ~mask;
>> +		index = (pte_gpa - base_gpa) / sizeof(pt_element_t);
>> +
>> +		r = kvm_read_guest_atomic(vcpu->kvm, base_gpa,
>> +				gw->prefetch_ptes, sizeof(gw->prefetch_ptes));
>> +		curr_pte = gw->prefetch_ptes[index];
>
> This can slowdown a single non-prefetchable pte fault. Maybe its
> irrelevant, but please have kvm_read_guest_atomic in the first patch and
> then later optimize, its easier to review and bisectable.

OK, I'll separate it.
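The address arithmetic in the gpte_changed() hunk can be checked in isolation: the guest-physical address of one PTE is rounded down to the start of an aligned window of PTE_PREFETCH_NUM entries, and the PTE's position inside that window becomes the array index. A small sketch follows, assuming a window of 8 entries and 8-byte PTEs. Both values, and the names PREFETCH_NUM, pte_t and prefetch_window, are assumptions for this example rather than values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PREFETCH_NUM 8u      /* assumed window size, in PTEs */
typedef uint64_t pte_t;      /* assumed 8-byte guest PTE */

struct window {
    uint64_t base_gpa;       /* start of the aligned group of PTEs */
    unsigned index;          /* position of the faulting PTE in the group */
};

/* Split a PTE address into (aligned window base, index within window). */
static struct window prefetch_window(uint64_t pte_gpa)
{
    const uint64_t mask = PREFETCH_NUM * sizeof(pte_t) - 1;
    struct window w;

    assert(pte_gpa % sizeof(pte_t) == 0);   /* PTEs are naturally aligned */

    w.base_gpa = pte_gpa & ~mask;
    w.index = (unsigned)((pte_gpa - w.base_gpa) / sizeof(pte_t));
    return w;
}
```

With these assumptions the window spans 64 bytes, so a PTE at 0x1238 belongs to the window starting at 0x1200 and is entry 7 of it, while a PTE at 0x1240 starts a new window as entry 0. Reading the whole window costs one larger guest read per check, which is the slowdown for a single non-prefetchable fault raised in the review.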
Re: [PATCH 0/7] AMD IOMMU emulation patches v3
On 08/15/2010 02:27 PM, Eduard - Gabriel Munteanu wrote:
> Hi,
>
> Please have a look at these and merge if you wish. I hope I've addressed
> the issues people have raised.

It's looking pretty good so far. I'm very happy with the modifications to the PCI layer.

It looks like given the helpers that you've added, converting the PCI devices is more or less programmatic. IOW, it just requires an appropriate sed.

I'd rather see an all-at-once conversion of the PCI devices than just convert over a couple functions. In fact, we can go a step further after that and start poisoning symbols to prevent the wrong interfaces from being used.

Regards,

Anthony Liguori

> Some changes from the previous RFC:
> - included and updated the other two device patches
> - moved map registration and invalidation management into PCI code
> - AMD IOMMU emulation is always enabled (no more configure options)
> - cleaned up code, I now use typedefs as suggested
> - event logging cleanups
>
> BTW, the change to pci_regs.h is properly aligned but the original file
> contains tabs.
>
>     Cheers,
>     Eduard
>
> Eduard - Gabriel Munteanu (7):
>   pci: add range_covers_range()
>   pci: memory access API and IOMMU support
>   AMD IOMMU emulation
>   ide: use the PCI memory access interface
>   rtl8139: use the PCI memory access interface
>   eepro100: use the PCI memory access interface
>   ac97: use the PCI memory access interface
>
>  Makefile.target   |    2 +
>  dma-helpers.c     |   46 -
>  dma.h             |   21 ++-
>  hw/ac97.c         |    6 +-
>  hw/amd_iommu.c    |  688 +
>  hw/eepro100.c     |   78 ---
>  hw/ide/core.c     |   15 +-
>  hw/ide/internal.h |   39 +++
>  hw/ide/pci.c      |    7 +
>  hw/pc.c           |    2 +
>  hw/pci.c          |  197 +++-
>  hw/pci.h          |   84 +++
>  hw/pci_ids.h      |    2 +
>  hw/pci_regs.h     |    1 +
>  hw/rtl8139.c      |   99 +
>  qemu-common.h     |    1 +
>  16 files changed, 1191 insertions(+), 97 deletions(-)
>  create mode 100644 hw/amd_iommu.c
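One way to read the "poisoning symbols" suggestion is the GCC poison pragma, which turns any later mention of an identifier into a compile error. The sketch below is only an illustration of that mechanism under that assumption. raw_phys_read and dev_memory_read are hypothetical stand-ins for a raw accessor and an IOMMU-aware wrapper; they are not QEMU functions, and the wrapper here does no translation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static unsigned char guest_ram[64];   /* toy guest memory */

/* Hypothetical raw accessor that bypasses any IOMMU translation. */
static void raw_phys_read(size_t addr, void *buf, size_t len)
{
    assert(addr <= sizeof(guest_ram) && len <= sizeof(guest_ram) - addr);
    memcpy(buf, guest_ram + addr, len);
}

/*
 * Hypothetical device-facing wrapper. A real one would translate addr
 * through the IOMMU before touching memory; this one passes it through.
 */
static void dev_memory_read(size_t addr, void *buf, size_t len)
{
    raw_phys_read(addr, buf, len);
}

/* From this point on, any use of raw_phys_read fails to compile. */
#pragma GCC poison raw_phys_read
```

Device code placed after the pragma can still call dev_memory_read(), but an unconverted call to the raw accessor is rejected at build time, which is how an all-at-once conversion could be kept from regressing.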
Re: [PATCH 1/2] KVM: x86 emulator: put register operand write back to a function
> On 08/12/2010 04:38 PM, Wei Yongjun wrote:
>> Introduce function write_register_operand() to write back the
>> register operand.
>>
>> +static void write_register_operand(struct operand *op, unsigned long val,
>> +				   unsigned int bytes)
>> +{
>> +	/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
>> +	switch (bytes) {
>> +	case 1:
>> +		*(u8 *)op->addr.reg = (u8)val;
>> +		break;
>> +	case 2:
>> +		*(u16 *)op->addr.reg = (u16)val;
>> +		break;
>> +	case 4:
>> +		*op->addr.reg = (u32)val;
>> +		break;	/* 64b: zero-extend */
>> +	case 8:
>> +		*op->addr.reg = val;
>> +		break;
>> +	}
>> +}
>
> It's cleaner to take val and bytes from struct operand, and do the
> assignment from the callers, no?

Taking val and bytes from struct operand may cause another issue: when we write back the source register, we need to do the assignment in the caller and then restore val before the source value is applied to the destination. For example, xadd:

	c->src.val = c->dst.val;
	write_register_operand(&c->src);
	c->src.val = c->src.orig_val;
	goto add;
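The size-dependent behaviour under discussion (1- and 2-byte writes preserve the upper bits of the register, a 4-byte write zero-extends, an 8-byte write replaces everything) can be shown on a single 64-bit value. The sketch below uses masking instead of the pointer casts in the patch so that it is independent of host byte order and aliasing rules. write_reg is a hypothetical name, and high-byte registers such as AH are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Write the low `bytes` bytes of val into *reg with x86-64 semantics. */
static void write_reg(uint64_t *reg, uint64_t val, unsigned bytes)
{
    switch (bytes) {
    case 1:
        *reg = (*reg & ~(uint64_t)0xFF) | (val & 0xFF);
        break;
    case 2:
        *reg = (*reg & ~(uint64_t)0xFFFF) | (val & 0xFFFF);
        break;
    case 4:
        *reg = (uint32_t)val;   /* upper 32 bits become zero */
        break;
    case 8:
        *reg = val;
        break;
    default:
        assert(!"unsupported operand size");
    }
}
```

Starting from 0x1122334455667788, a 1-byte write of 0xAB gives 0x11223344556677AB and a 2-byte write of 0xCDEF gives 0x112233445566CDEF, while a 4-byte write of 0xDEADBEEF gives 0x00000000DEADBEEF. That last case is the one the patch comment calls out as correct.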
Re: [PATCH] kvm: fix poison overwritten caused by using wrong xstate size
Feel free to add my ack.

Avi Kivity <a...@redhat.com> wrote:

> On 08/14/2010 12:03 AM, H. Peter Anvin wrote:
>> Avi, do you want to take this one or should I?
>
> I will, thanks.
>
> -- 
> error compiling committee.c: too many arguments to function

-- 
Sent from my mobile phone. Please pardon any lack of formatting.