Re: [Qemu-devel] [PATCH 1/2] RESEND: Add kvm_set_ioeventfd_mmio_long definition for non-KVM systems

2010-08-15 Thread Blue Swirl
Thanks, applied.

On Sat, Aug 14, 2010 at 11:47 PM, Cam Macdonell c...@cs.ualberta.ca wrote:
 Signed-off-by: Cam Macdonell c...@cs.ualberta.ca
 ---
  kvm-stub.c |    5 +
  1 files changed, 5 insertions(+), 0 deletions(-)

 diff --git a/kvm-stub.c b/kvm-stub.c
 index 3378bd3..d45f9fa 100644
 --- a/kvm-stub.c
 +++ b/kvm-stub.c
 @@ -136,3 +136,8 @@ int kvm_set_ioeventfd_pio_word(int fd, uint16_t addr, 
 uint16_t val, bool assign)
  {
     return -ENOSYS;
  }
 +
 +int kvm_set_ioeventfd_mmio_long(int fd, uint32_t adr, uint32_t val, bool 
 assign)
 +{
 +    return -ENOSYS;
 +}
 --
 1.6.2.5



--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Mohammed Gamal
2010/8/15 Gleb Natapov g...@redhat.com:
 On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote:
 On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote:
  On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:
  If emulation fails due to the instruction being unemulated. Return 
  immediately
  instead of restarting the instruction and infinitely trying to execute it.
 
  This is already handled correctly as far as I can see. Sometimes
  instruction should be retried and reexecute_instruction() checks
  for that case. If instruction emulation fails in big real mode
  re-executing instruction will be useless though, so what should be done
  is to make reexecute_instruction() return false if vcpu is in big real
  mode and cpu relies on emulation to handle it.
 We don't have a separate mode for big real mode. The emulation modes
 we have are real and vm86

 That doesn't make the patch right. So we will have to figure something
 out.
True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)?

 
  Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
  ---
   arch/x86/kvm/x86.c |    6 ++
   1 files changed, 6 insertions(+), 0 deletions(-)
 
  diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
  index 416aa0e..a31db44 100644
  --- a/arch/x86/kvm/x86.c
  +++ b/arch/x86/kvm/x86.c
  @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
                }
 
                ++vcpu->stat.insn_emulation;
  +             if (r == X86EMUL_UNHANDLEABLE)
  +                     return handle_emulation_failure(vcpu);
  +
                if (r)  {
                        if (reexecute_instruction(vcpu, cr2))
                                return EMULATE_DONE;
  @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
   restart:
        r = x86_emulate_insn(&vcpu->arch.emulate_ctxt);
 
  +     if (r == X86EMUL_UNHANDLEABLE)
  +             return handle_emulation_failure(vcpu);
  +
        if (r) { /* emulation failed */
                if (reexecute_instruction(vcpu, cr2))
                        return EMULATE_DONE;
  --
  1.7.0.4
 
 
  --
                         Gleb.
 

 --
                        Gleb.



Re: [qemu-kvm] build fail on i386 RHEL5u4

2010-08-15 Thread Avi Kivity

 On 08/11/2010 04:49 AM, Hao, Xudong wrote:

Hi,
Recently I built qemu-kvm on 32-bit RHEL5u4/RHEL5u5, and the build fails in function
vhost_dev_sync_region. Building on RHEL5u1 works fine.
Has anyone else hit a similar issue?

qemu-kvm commit: 59d71ddb432db04b57ee2658ce50a3e35d7db97e

build error:
...
   CC    x86_64-softmmu/i8254.o
   CC    x86_64-softmmu/i8254-kvm.o
   CC    x86_64-softmmu/device-assignment.o
   LINK  x86_64-softmmu/qemu-system-x86_64
vhost.o: In function `vhost_dev_sync_region':
/home/source/qemu-kvm/hw/vhost.c:47: undefined reference to 
`__sync_fetch_and_and_4'
collect2: ld returned 1 exit status
make[1]: *** [qemu-system-x86_64] Error 1
make: *** [subdir-x86_64-softmmu] Error 2



Appears to be a gcc bug.  I opened 
https://bugzilla.redhat.com/show_bug.cgi?id=624279 to track this.


Meanwhile, installing the gcc44 package and building with it 
(./configure --cc=gcc44) appears to work.
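
For context, here is a minimal sketch of the kind of call that produces this symbol. __sync_fetch_and_and is a GCC builtin; per the GCC manual, when the target cannot implement the operation inline, the compiler emits a call to an external function named after the builtin with a size suffix, which for a 4-byte operand is __sync_fetch_and_and_4. Whether that is exactly what happens with the RHEL5 compiler is not established in this thread. The function name fetch_and_clear and its parameters are illustrative and are not taken from hw/vhost.c.

```c
#include <stdint.h>

/*
 * Atomically clear the bits in `mask` from `*word` and return the
 * previous value. With a 4-byte operand, a target that has no inline
 * implementation makes GCC call __sync_fetch_and_and_4 instead.
 */
static uint32_t fetch_and_clear(uint32_t *word, uint32_t mask)
{
    return __sync_fetch_and_and(word, ~mask);
}
```

If the link step cannot find that external function, the result is an undefined reference like the one in the build log above.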


--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Mohammed Gamal
On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote:
 On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:
 If emulation fails due to the instruction being unemulated. Return 
 immediately
 instead of restarting the instruction and infinitely trying to execute it.

 This is already handled correctly as far as I can see. Sometimes
 instruction should be retried and reexecute_instruction() checks
 for that case. If instruction emulation fails in big real mode
 re-executing instruction will be useless though, so what should be done
 is to make reexecute_instruction() return false if vcpu is in big real
 mode and cpu relies on emulation to handle it.
We don't have a separate mode for big real mode. The emulation modes
we have are real and vm86


 Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
 ---
  arch/x86/kvm/x86.c |    6 ++
  1 files changed, 6 insertions(+), 0 deletions(-)

 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 416aa0e..a31db44 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
               }

               ++vcpu->stat.insn_emulation;
 +             if (r == X86EMUL_UNHANDLEABLE)
 +                     return handle_emulation_failure(vcpu);
 +
               if (r)  {
                       if (reexecute_instruction(vcpu, cr2))
                               return EMULATE_DONE;
 @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
  restart:
       r = x86_emulate_insn(&vcpu->arch.emulate_ctxt);

 +     if (r == X86EMUL_UNHANDLEABLE)
 +             return handle_emulation_failure(vcpu);
 +
       if (r) { /* emulation failed */
               if (reexecute_instruction(vcpu, cr2))
                       return EMULATE_DONE;
 --
 1.7.0.4


 --
                        Gleb.



Re: [RFC PATCH 0/3] Real mode interrupt injection

2010-08-15 Thread Avi Kivity

 On 08/12/2010 04:07 AM, Mohammed Gamal wrote:


I was playing around with the non-atomic-injection branch. I decided
to use e_i_g_s=1, and it's worth noting that I never experienced these
faults with the switch enabled.

Hate to spoil it. I did experience the faults again with e_i_g_s=1,
although much less frequently.

What is really strange is that I could get a Linux guest to
boot up completely, both with e_i_g_s=1 and without it, with the real
mode interrupt patch enabled. It looks to me like the problem mainly
happens when the BIOS transfers control to the boot loader. Other
guests usually fail.

Would you like me to attach a trace?



It will be much too big; upload it somewhere or send it to me privately.

But, use the code with the interrupt injection setup fixed (see my 
comment to patch 2).



--
error compiling committee.c: too many arguments to function



Re: [RFC PATCH v2 0/4] Real mode interrupt injection

2010-08-15 Thread Mohammed Gamal
On Sun, Aug 15, 2010 at 3:43 PM, Avi Kivity a...@redhat.com wrote:
  On 08/14/2010 03:09 AM, Mohammed Gamal wrote:

 This patch introduces real mode interrupt injection for VMX.
 It currently invokes the x86 emulator to emulate interrupts
 instead of manually setting VMX controls.

 Needless to say, this is not meant for merging in its current state.
 The emulator still needs some more work to get this completely
 operational.

 Mohammed Gamal (3):
   x86 emulator: Expose emulate_int_real()
   x86: Add inject_realmode_interrupt() wrapper
   VMX: Emulated real mode interrupt injection

  arch/x86/include/asm/kvm_emulate.h |    3 ++-
  arch/x86/kvm/vmx.c                 |   11 +--
  arch/x86/kvm/x86.c                 |   14 ++
  arch/x86/kvm/x86.h                 |    1 +
  4 files changed, 18 insertions(+), 11 deletions(-)

 ---
 Changes since v1:
 - Save emulation context eip value early in emulate_int_real()
 - Properly initialize emulation context in inject_realmode_interrupt()
 - Implement error checks on using inject_realmode_interrupt()

 Do those changes help your tests?

To an extent. At least now the BIOS mostly runs smoothly since eip
values are updated correctly. However, it looks like guests go into
nowhere once things are handed over to the boot loader. So there are
still many things we need to fix. I'll post a trace shortly.

 --
 error compiling committee.c: too many arguments to function




Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Avi Kivity

 On 08/15/2010 03:43 PM, Mohammed Gamal wrote:

2010/8/15 Gleb Natapov g...@redhat.com:

On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote:

On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote:

On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:

If emulation fails due to the instruction being unemulated. Return immediately
instead of restarting the instruction and infinitely trying to execute it.


This is already handled correctly as far as I can see. Sometimes
instruction should be retried and reexecute_instruction() checks
for that case. If instruction emulation fails in big real mode
re-executing instruction will be useless though, so what should be done
is to make reexecute_instruction() return false if vcpu is in big real
mode and cpu relies on emulation to handle it.

We don't have a separate mode for big real mode. The emulation modes
we have are real and vm86


That doesn't make the patch right. So we will have to figure something
out.

True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)?


We can do it conditionally for CPL=0.  That includes real mode (and 
excludes vm86).


However, there's a race involved (see a895e576cfd96).  I don't see how 
we can call handle_emulation_failure() without opening the race again.
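
As an illustration only, the CPL condition suggested above could be expressed as a small predicate. This is a sketch under stated assumptions: the enum values and the name fail_without_retry are hypothetical stand-ins and not the KVM definitions, and the sketch does nothing about the race mentioned above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real values live in the KVM emulator headers. */
enum emul_result { EMUL_CONTINUE = 0, EMUL_UNHANDLEABLE = 1 };

/*
 * Return true when an unhandleable instruction should be reported as an
 * emulation failure right away instead of being re-executed.
 * Real mode runs at CPL 0 and vm86 at CPL 3, so testing CPL == 0
 * covers real mode and leaves vm86 out.
 */
static bool fail_without_retry(enum emul_result r, int cpl)
{
    return r == EMUL_UNHANDLEABLE && cpl == 0;
}
```

A caller would evaluate this before falling back to the retry path; every other case keeps the existing behaviour.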


--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Avi Kivity

 On 08/15/2010 03:49 PM, Gleb Natapov wrote:


True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)?

If we flush all shadow pages when moving from paged mode to non paged
checking for X86EMUL_MODE_REAL sounds enough to me, but Avi knows better.
Or we can add is_big_real_mode() callback to x86_ops and implement it in
vmx accordingly.


Neither are possible.  We can have one cpu in big real mode and others 
in paged mode, so even in real mode we cannot rule out a spurious page 
fault due to shadow write protection.


--
error compiling committee.c: too many arguments to function



Re: [RFC PATCH v2 2/4] x86: Add inject_realmode_interrupt() wrapper

2010-08-15 Thread Avi Kivity

 On 08/14/2010 03:19 AM, Mohammed Gamal wrote:

This adds a wrapper function inject_realmode_interrupt() around the
emulator function emulate_int_real() to allow real mode interrupt injection.

+EXPORT_SYMBOL_GPL(inject_realmode_interrupt);
+


Global symbols should start with kvm_.

--
error compiling committee.c: too many arguments to function



Re: Freebsd image from Qemu hangs on booting in KVM

2010-08-15 Thread Avi Kivity

 On 08/05/2010 11:51 PM, Anjali Kulkarni wrote:

Thanks Avi,
I am trying to reproduce this on another setup, and do not see the issue.
My understanding is to run KVM + Qemu, I only need to install KVM modules,
and Qemu does not need to be modified. Is that correct?
I see

[r...@ipg-virt01 anjali]# lsmod | grep kvm
kvm_intel  87016  0
kvm   211496  1 kvm_intel

[r...@ipg-virt01 anjali]# modprobe -l 'kvm*'
/lib/modules/2.6.18-164.el5/kernel/extra/kvm.ko
/lib/modules/2.6.18-164.el5/kernel/extra/kvm-amd.ko
/lib/modules/2.6.18-164.el5/kernel/extra/kvm-intel.ko

And then I run Qemu, as I did, before installing KVM. It should just use
KVM? Is there any way I can check?


'info kvm' from the monitor


Btw, when it hangs, I cannot even press any key, so not sure how I can get
those commands you suggest below..


alt-ctrl-2

--
error compiling committee.c: too many arguments to function



Re: [PATCH 1/2] KVM: x86 emulator: put register operand write back to a function

2010-08-15 Thread Avi Kivity
 On 08/12/2010 04:38 PM, Wei Yongjun wrote:
 Introduce function write_register_operand() to write back the
 register operand.


  
 +static void write_register_operand(struct operand *op, unsigned long val,
 +				   unsigned int bytes)
 +{
 +	/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
 +	switch (bytes) {
 +	case 1:
 +		*(u8 *)op->addr.reg = (u8)val;
 +		break;
 +	case 2:
 +		*(u16 *)op->addr.reg = (u16)val;
 +		break;
 +	case 4:
 +		*op->addr.reg = (u32)val;
 +		break;	/* 64b: zero-extend */
 +	case 8:
 +		*op->addr.reg = val;
 +		break;
 +	}
 +}

It's cleaner to take val and bytes from struct operand, and do the
assignment from the callers, no?
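
A rough userspace sketch of what that suggestion could look like, assuming a simplified operand layout. The struct below is hypothetical and much smaller than the emulator's struct operand, and the 1- and 2-byte cases use masking instead of pointer casts so the sketch does not depend on byte order. Callers would set op->val and op->bytes before the call.

```c
#include <stdint.h>

/* Simplified, hypothetical operand; the emulator's struct operand has more fields. */
struct operand {
    unsigned int bytes;   /* operand size: 1, 2, 4 or 8 */
    uint64_t val;         /* value to write back */
    uint64_t *reg;        /* destination register slot */
};

/* Write op->val into the register slot, honouring the operand size. */
static void write_register_operand(struct operand *op)
{
    switch (op->bytes) {
    case 1:   /* replace the low byte, keep the upper bits */
        *op->reg = (*op->reg & ~0xffULL) | (uint8_t)op->val;
        break;
    case 2:   /* replace the low word, keep the upper bits */
        *op->reg = (*op->reg & ~0xffffULL) | (uint16_t)op->val;
        break;
    case 4:   /* 4-byte writes zero-extend into the full slot */
        *op->reg = (uint32_t)op->val;
        break;
    case 8:
        *op->reg = op->val;
        break;
    }
}
```

With this shape the function needs only one argument, since the value and the size travel with the operand.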

-- 
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Gleb Natapov
On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote:
 On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote:
  On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:
  If emulation fails due to the instruction being unemulated. Return 
  immediately
  instead of restarting the instruction and infinitely trying to execute it.
 
  This is already handled correctly as far as I can see. Sometimes
  instruction should be retried and reexecute_instruction() checks
  for that case. If instruction emulation fails in big real mode
  re-executing instruction will be useless though, so what should be done
  is to make reexecute_instruction() return false if vcpu is in big real
  mode and cpu relies on emulation to handle it.
 We don't have a separate mode for big real mode. The emulation modes
 we have are real and vm86
 
That doesn't make the patch right. So we will have to figure something
out.

 
  Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
  ---
   arch/x86/kvm/x86.c |    6 ++
   1 files changed, 6 insertions(+), 0 deletions(-)
 
  diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
  index 416aa0e..a31db44 100644
  --- a/arch/x86/kvm/x86.c
  +++ b/arch/x86/kvm/x86.c
  @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
                }
 
                ++vcpu->stat.insn_emulation;
  +             if (r == X86EMUL_UNHANDLEABLE)
  +                     return handle_emulation_failure(vcpu);
  +
                if (r)  {
                        if (reexecute_instruction(vcpu, cr2))
                                return EMULATE_DONE;
  @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
   restart:
        r = x86_emulate_insn(&vcpu->arch.emulate_ctxt);
 
  +     if (r == X86EMUL_UNHANDLEABLE)
  +             return handle_emulation_failure(vcpu);
  +
        if (r) { /* emulation failed */
                if (reexecute_instruction(vcpu, cr2))
                        return EMULATE_DONE;
  --
  1.7.0.4
 
 
  --
                         Gleb.
 

--
Gleb.


Re: system_powerdown not working for qemu-kvm 0.12.4?

2010-08-15 Thread Avi Kivity

 On 08/15/2010 02:32 AM, Teck Choon Giam wrote:

Can you try to bisect between qemu-kvm-0.12.3 and 0.12.4 to see which commit
introduced the regression?


Actually I am not so sure about how to do the bisecting as the below
steps always produce different configure for me.  Any pointers?

# cd /usr/src
# git clone git://git.kernel.org/pub/scm/virt/kvm/qemu-kvm.git
# cd qemu-kvm
# ./configure --help|grep cpu-emulation
   --disable-cpu-emulation  disables use of qemu cpu emulation code
# git bisect reset master
We are not bisecting.
# git bisect good qemu-kvm-0.12.1.2
You need to start by git bisect start
Do you want me to do it for you [Y/n]? y
# git bisect bad qemu-kvm-0.12.2
Bisecting: 14 revisions left to test after this (roughly 4 steps)
[66dbb62824845e91808171a675998706ce359c71] Handle TFTP ERROR from client
# ./configure --help|grep cpu-emulation
show nothing when bisecting... ... configure script is different :(



That's fine - you'll be running upstream qemu instead of qemu-kvm.  Just 
remember to add -enable-kvm to the command line.  Use ./configure 
--target-list=x86_64-softmmu to cut down on compile time.


I'm betting 73b48d914f9 is the cause, but let's see the full bisect.

--
error compiling committee.c: too many arguments to function



Re: [RFC PATCH v2 2/4] x86: Add inject_realmode_interrupt() wrapper

2010-08-15 Thread Gleb Natapov
On Sat, Aug 14, 2010 at 03:19:39AM +0300, Mohammed Gamal wrote:
 This adds a wrapper function inject_realmode_interrupt() around the
 emulator function emulate_int_real() to allow real mode interrupt injection.
 
 Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
 ---
  arch/x86/kvm/x86.c |   33 +
  arch/x86/kvm/x86.h |1 +
  2 files changed, 34 insertions(+), 0 deletions(-)
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 1722d37..d3ba1c3 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -3936,6 +3936,39 @@ static void inject_emulated_exception(struct kvm_vcpu 
 *vcpu)
  	kvm_queue_exception(vcpu, ctxt->exception);
  }
  
 +int inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq)
 +{
 +	struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
 +	int cs_db, cs_l, ret;
 +	cache_all_regs(vcpu);
 +
 +	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
 +
 +	vcpu->arch.emulate_ctxt.vcpu = vcpu;
 +	vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
 +	vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
 +	vcpu->arch.emulate_ctxt.mode =
 +		(!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
 +		(vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
 +		? X86EMUL_MODE_VM86 : cs_l
 +		? X86EMUL_MODE_PROT64 : cs_db
 +		? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
 +	memset(c, 0, sizeof(struct decode_cache));
 +	memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
 +
We have this code in 2 places already: kvm_task_switch() and 
emulate_instruction().
This will be the third one. Should be moved to separate function.

 +	ret = emulate_int_real(&vcpu->arch.emulate_ctxt, &emulate_ops, irq);
 +
 +	if (ret != X86EMUL_CONTINUE)
 +		return EMULATE_FAIL;
 +
 +	memcpy(vcpu->arch.regs, c->regs, sizeof c->regs);
 +	kvm_rip_write(vcpu, vcpu->arch.emulate_ctxt.eip);
 +	kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
 +
 +	return EMULATE_DONE;
 +}
 +EXPORT_SYMBOL_GPL(inject_realmode_interrupt);
 +
  static int handle_emulation_failure(struct kvm_vcpu *vcpu)
  {
  	++vcpu->stat.insn_emulation_fail;
 diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
 index b7a4047..c6e8a4d 100644
 --- a/arch/x86/kvm/x86.h
 +++ b/arch/x86/kvm/x86.h
 @@ -67,5 +67,6 @@ static inline int is_paging(struct kvm_vcpu *vcpu)
  
  void kvm_before_handle_nmi(struct kvm_vcpu *vcpu);
  void kvm_after_handle_nmi(struct kvm_vcpu *vcpu);
 +int inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq);
  
  #endif
 -- 
 1.7.0.4
 

--
Gleb.
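
To illustrate the "separate function" remark above, here is a small self-contained sketch that factors out only the mode selection from the duplicated setup code. The enum and the name pick_emul_mode are hypothetical; the real code works on struct kvm_vcpu and the X86EMUL_MODE_* constants, which are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical mirror of the emulator's mode constants. */
enum emul_mode {
    MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32, MODE_PROT64
};

/*
 * Pick the emulation mode from CPU state, following the nested
 * conditional in the patch: real mode first, then vm86, then the
 * protected-mode width given by the CS.L and CS.D/B bits.
 */
static enum emul_mode pick_emul_mode(bool protmode, bool eflags_vm,
                                     bool cs_l, bool cs_db)
{
    if (!protmode)
        return MODE_REAL;
    if (eflags_vm)
        return MODE_VM86;
    if (cs_l)
        return MODE_PROT64;
    return cs_db ? MODE_PROT32 : MODE_PROT16;
}
```

A shared helper of this kind would let each of the three call sites initialise the context with a single call.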


Re: [PATCH] kvm: fix poison overwritten caused by using wrong xstate size

2010-08-15 Thread Avi Kivity

 On 08/13/2010 10:19 AM, Xiaotian Feng wrote:

fpu.state is allocated from task_xstate_cachep, whose object size is
xstate_size. xstate_size is set from the cpuid instruction and is often
smaller than sizeof(struct xsave_struct). kvm uses sizeof(struct xsave_struct)
to fill fpu.state.xsave in and out; since only xstate_size bytes were
allocated for fpu.state, the kernel writes past the end of the allocation
and triggers poison/redzone/padding overwritten warnings.


Thanks, applied.
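
The class of bug described in the patch can be shown with a small userspace sketch: a copy bounded by the size that was actually allocated, not by the size of the larger structure type. The name copy_xstate and its parameters are hypothetical and this is not the kernel fix itself.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy saved state into a buffer that was allocated with `alloc_size`
 * bytes, never writing more than that even when the caller's structure
 * (`src_size` bytes) is larger. Returns the number of bytes copied.
 */
static size_t copy_xstate(void *dst, size_t alloc_size,
                          const void *src, size_t src_size)
{
    size_t n = src_size < alloc_size ? src_size : alloc_size;

    memcpy(dst, src, n);
    return n;
}
```

Using the structure size as the length is what lets the write run past the allocation and trip the slab debugging checks.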

--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Gleb Natapov
On Sun, Aug 15, 2010 at 06:58:06PM +0300, Avi Kivity wrote:
  On 08/15/2010 03:49 PM, Gleb Natapov wrote:
 
 True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)?
 If we flush all shadow pages when moving from paged mode to non paged
 checking for X86EMUL_MODE_REAL sounds enough to me, but Avi knows better.
 Or we can add is_big_real_mode() callback to x86_ops and implement it in
 vmx accordingly.
 
 Neither are possible.  We can have one cpu in big real mode and
 others in paged mode, so even in real mode we cannot rule out a
 spurious page fault due to shadow write protection.
 
Correct, just checking X86EMUL_MODE_REAL is not enough due to smp, but
why checking for big real mode will not work? If instruction can't be
emulated while vcpu is in big real mode returning to vcpu is not an option,
so kvm will fail anyway.

--
Gleb.


Re: [RFC PATCH v2 4/4] x86 emulator: Eagerly commit emulation ctxt eip in emulate_int_real()

2010-08-15 Thread Avi Kivity

 On 08/14/2010 03:19 AM, Mohammed Gamal wrote:

emulate_int_real() is to be used outside the emulator. Hence, we shouldn't
wait for writeback to write the eip value stored in the decode cache. Save it
in emulation context eagerly instead.

Signed-off-by: Mohammed Gamalm.gamal...@gmail.com
---
  arch/x86/kvm/emulate.c |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 32498e3..ae45b04 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -1245,7 +1245,7 @@ int emulate_int_real(struct x86_emulate_ctxt *ctxt,
if (rc != X86EMUL_CONTINUE)
return rc;

-   c->eip = eip;
+   ctxt->eip = eip;

return rc;
  }


Doesn't seem right.  It should work like the rest of the emulator.

Instead, the wrapper code in x86.c should do this.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Gleb Natapov
On Sun, Aug 15, 2010 at 03:43:15PM +0300, Mohammed Gamal wrote:
 2010/8/15 Gleb Natapov g...@redhat.com:
  On Sun, Aug 15, 2010 at 03:40:00PM +0300, Mohammed Gamal wrote:
  On Sun, Aug 15, 2010 at 10:32 AM, Gleb Natapov g...@redhat.com wrote:
   On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:
   If emulation fails due to the instruction being unemulated. Return 
   immediately
   instead of restarting the instruction and infinitely trying to execute 
   it.
  
   This is already handled correctly as far as I can see. Sometimes
   instruction should be retried and reexecute_instruction() checks
   for that case. If instruction emulation fails in big real mode
   re-executing instruction will be useless though, so what should be done
   is to make reexecute_instruction() return false if vcpu is in big real
   mode and cpu relies on emulation to handle it.
  We don't have a separate mode for big real mode. The emulation modes
  we have are real and vm86
 
  That doesn't make the patch right. So we will have to figure something
  out.
 True. Can we do it for real mode in general (i.e. X86EMUL_MODE_REAL)?
If we flush all shadow pages when moving from paged mode to non paged
checking for X86EMUL_MODE_REAL sounds enough to me, but Avi knows better.
Or we can add is_big_real_mode() callback to x86_ops and implement it in
vmx accordingly.

--
Gleb.


Re: [RFC PATCH v2 0/4] Real mode interrupt injection

2010-08-15 Thread Avi Kivity

 On 08/14/2010 03:09 AM, Mohammed Gamal wrote:

This patch introduces real mode interrupt injection for VMX.
It currently invokes the x86 emulator to emulate interrupts
instead of manually setting VMX controls.

Needless to say, this is not meant for merging in its current state.
The emulator still needs some more work to get this completely operational.

Mohammed Gamal (3):
   x86 emulator: Expose emulate_int_real()
   x86: Add inject_realmode_interrupt() wrapper
   VMX: Emulated real mode interrupt injection

  arch/x86/include/asm/kvm_emulate.h |3 ++-
  arch/x86/kvm/vmx.c |   11 +--
  arch/x86/kvm/x86.c |   14 ++
  arch/x86/kvm/x86.h |1 +
  4 files changed, 18 insertions(+), 11 deletions(-)

---
Changes since v1:
- Save emulation context eip value early in emulate_int_real()
- Properly initialize emulation context in inject_realmode_interrupt()
- Implement error checks on using inject_realmode_interrupt()


Do those changes help your tests?

--
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: make mmu_shrink() fit shrinker's requirement

2010-08-15 Thread Avi Kivity

 On 08/13/2010 11:10 PM, Dave Hansen wrote:

On Thu, 2010-08-05 at 12:28 +0300, Avi Kivity wrote:

On 08/04/2010 10:13 AM, Lai Jiangshan wrote:

mmu_shrink() should attempt to free @nr_to_scan entries.

This conflicts with Dave's patchset.

Dave, what's going on with those patches?  They're starting to smell.

These seem to fix the original problem reporter's issue.  They were run
with 64 guests on a 32GB machine.  No stability problems popped up in
this testing, or since I last sent the patches to you.  The results from
both the test with only the first four patches and with the entire set
of nine looked pretty identical.

That tells me that we should only push the first four for now:

abstract kvm x86 mmu->n_free_mmu_pages
rename x86 kvm->arch.n_alloc_mmu_pages
replace x86 kvm n_free_mmu_pages with n_used_mmu_pages
create aggregate kvm_total_used_mmu_pages value


Well, patches 3 and 4 have unaddressed review comments. Please fix them 
up. If you don't have the time, let me know and I'll do it instead.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Gleb Natapov
On Sat, Aug 14, 2010 at 06:51:34PM +0300, Mohammed Gamal wrote:
 If emulation fails due to the instruction being unemulated. Return immediately
 instead of restarting the instruction and infinitely trying to execute it.
 
This is already handled correctly as far as I can see. Sometimes
instruction should be retried and reexecute_instruction() checks
for that case. If instruction emulation fails in big real mode
re-executing instruction will be useless though, so what should be done
is to make reexecute_instruction() return false if vcpu is in big real
mode and cpu relies on emulation to handle it.
 
 Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
 ---
  arch/x86/kvm/x86.c |6 ++
  1 files changed, 6 insertions(+), 0 deletions(-)
 
 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
 index 416aa0e..a31db44 100644
 --- a/arch/x86/kvm/x86.c
 +++ b/arch/x86/kvm/x86.c
 @@ -4036,6 +4036,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
   }
  
   ++vcpu->stat.insn_emulation;
 + if (r == X86EMUL_UNHANDLEABLE)
 + return handle_emulation_failure(vcpu);
 +
   if (r)  {
   if (reexecute_instruction(vcpu, cr2))
   return EMULATE_DONE;
 @@ -4057,6 +4060,9 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
  restart:
   r = x86_emulate_insn(&vcpu->arch.emulate_ctxt);
  
 + if (r == X86EMUL_UNHANDLEABLE)
 + return handle_emulation_failure(vcpu);
 +
   if (r) { /* emulation failed */
   if (reexecute_instruction(vcpu, cr2))
   return EMULATE_DONE;
 -- 
 1.7.0.4
 

--
Gleb.


Re: [PATCH] kvm: destroy workqueue on kvm_create_pit() failures

2010-08-15 Thread Avi Kivity

 On 08/13/2010 11:23 AM, Xiaotian Feng wrote:

kernel needs to destroy workqueue if kvm_create_pit() fails, otherwise
after pit is freed, the workqueue is leaked.


Applied, thanks.
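
The cleanup pattern the patch description refers to can be sketched in userspace as follows. Everything here is a hypothetical stand-in (struct pit, wq_create, wq_destroy, and the fail_late switch that simulates a late failure); it is not the kvm_create_pit() code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the PIT object and its workqueue. */
struct workqueue { int unused; };
struct pit { struct workqueue *wq; };

/* Counts live workqueues so that a leak is observable. */
static int live_workqueues;

static struct workqueue *wq_create(void)
{
    struct workqueue *wq = malloc(sizeof(*wq));

    if (wq)
        live_workqueues++;
    return wq;
}

static void wq_destroy(struct workqueue *wq)
{
    free(wq);
    live_workqueues--;
}

/* `fail_late` simulates a failure after the workqueue already exists. */
static struct pit *pit_create(int fail_late)
{
    struct pit *pit = calloc(1, sizeof(*pit));

    if (!pit)
        return NULL;

    pit->wq = wq_create();
    if (!pit->wq)
        goto fail_free;

    if (fail_late)
        goto fail_wq;

    return pit;

fail_wq:
    wq_destroy(pit->wq);   /* without this, the workqueue would leak */
fail_free:
    free(pit);
    return NULL;
}
```

Each error label undoes exactly what had been set up before the failing step, so freeing the outer object never strands the workqueue.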

--
error compiling committee.c: too many arguments to function



Re: [PATCH] Fix bug for vcpu hotplug

2010-08-15 Thread Avi Kivity

 On 08/06/2010 06:36 AM, Liu, Jinsong wrote:

Recently seabios implemented vcpu hotplug infrastructure.
During testing, we found that qemu-kvm has a bug that results in the guest OS
shutting down on vcpu hot-add.
This patch fixes the bug by marking bus->allow_hotplug as 1 after
qdev_hotplug init is done.


Please copy qemu-devel on qemu patches.


@@ -117,6 +117,9 @@ DeviceState *qdev_create(BusState *bus, const char *name)
         hw_error("Unknown device '%s' for bus '%s'\n", name, bus->info->name);
     }

+    if (qdev_hotplug)
+        bus->allow_hotplug = 1;
+
     return qdev_create_from_info(bus, info);
 }


Doesn't seem right - this will set allow_hotplug on all busses.

It needs to be set only on the system bus (hw/sysbus.c).

--
error compiling committee.c: too many arguments to function



Re: [RFC PATCH 0/3] Real mode interrupt injection

2010-08-15 Thread Mohammed Gamal
On Sun, Aug 15, 2010 at 3:23 PM, Avi Kivity a...@redhat.com wrote:
  On 08/12/2010 04:07 AM, Mohammed Gamal wrote:

 I was playing around with the non-atomic-injection branch. I decided
 to use e_i_g_s=1, and it's worth noting that I never experienced these
 faults with the switch enabled.

 Hate to spoil it. I did experience the faults again with e_i_g_s=1,
 although much less frequently.

 What is really strange is that I could get a Linux guest to
 boot up completely both with e_i_g_s=1 and without it with the real
 mode interrupt patch enabled. It looks to me like the problem mainly
 happens when the BIOS transfers control to the boot loader. Other
 guests usually fail.

 Would you like me to attach a trace?


 It will be much too big; upload it somewhere or send it to me privately.

 But, use the code with the interrupt injection setup fixed (see my comment
 to patch 2).

Please take a look at my latest patch series. We can take the discussion there.

 --
 error compiling committee.c: too many arguments to function




Re: system_powerdown not working for qemu-kvm 0.12.4?

2010-08-15 Thread Teck Choon Giam

 That's fine - you'll be running upstream qemu instead of qemu-kvm.  Just
 remember to add -enable-kvm to the command line.  Use ./configure
 --target-list=x86_64-softmmu to cut down on compile time.

Yes, I already pass --target-list, but I had left out the
-enable-kvm option when starting each guest OS.  Here is my
script:

---8---
#!/bin/sh

kerver=`uname -r`
KERNELDIR=--kerneldir=/usr/src/linux-`uname -r`
make clean
./configure --prefix=/usr/local/kvm ${KERNELDIR} \
--target-list=x86_64-softmmu

make
make INSTALL_MOD_STRIP=1 install
mkinitrd -v -f --builtin=ehci-hcd --builtin=uhci-hcd \
--builtin=ohci-hcd --builtin=usb-storage /boot/initrd-${kerver}.img \
${kerver}
depmod -v `uname -r`
---8---

Instead of rebooting... I do the following then start the FreeBSD
Guest OS to test:

1. Stop all Guest OSes.
2. rmmod -v kvm-intel
3. rmmod -v kvm
4. modprobe -v kvm
5. modprobe -v kvm-intel

If I add -enable-kvm to the command line to start the guest OS I got:

No SMP KVM support, use '-smp 1'
failed to initialize KVM

I changed to '-smp 1' as suggested and the FreeBSD 8.1 guest
OS started without issues.  The CentOS 5.5 guest OS, however, uses
virtio; will running it in qemu instead of qemu-kvm be an
issue?  I have trouble starting that guest: it always goes to the
CentOS installer, since I have the following:

---SNAP---
-boot cd \
-cdrom /home/kvm/images/CentOS-5.5-x86_64-bin-1of8.iso \
-drive file=/dev/VM/centos5,if=virtio,boot=on \
---SNAP---

So I will not test on other Guest OS but just focus on FreeBSD instead.


 I'm betting 73b48d914f9 is the cause, but let's see the full bisect.


I don't know whether I have bisected correctly: I marked each
working commit with git bisect good until I hit one that
failed, which I marked with git bisect bad.  The following log is for git bisect
good qemu-kvm-0.12.1.2 and git bisect bad qemu-kvm-0.12.2:

# git bisect log
git bisect start
# good: [a6ec654a7a863afff41b491a02ffd696c862cb41] Merge branch 'stable-0.12-upstream' into stable-0.12
git bisect good a6ec654a7a863afff41b491a02ffd696c862cb41
# bad: [c01cfac552861ca4d82e359791a2d79da7f80cb5] device assignment: default requires IOMMU
git bisect bad c01cfac552861ca4d82e359791a2d79da7f80cb5
# good: [66dbb62824845e91808171a675998706ce359c71] Handle TFTP ERROR from client
git bisect good 66dbb62824845e91808171a675998706ce359c71
# good: [04babf6c6f8ccf69f1219db5fea233d679702e90] roms: rework rom loading via fw
git bisect good 04babf6c6f8ccf69f1219db5fea233d679702e90
# good: [3999bf32440c1ea2ceb85eef008cc56a069af13f] Qemu's internal TFTP server breaks lock-step-iness of TFTP
git bisect good 3999bf32440c1ea2ceb85eef008cc56a069af13f
# good: [e389e937a7b94186449e0590bdc8f04ecbb1ab0b] Update version and changelog for release
git bisect good e389e937a7b94186449e0590bdc8f04ecbb1ab0b
# bad: [b874ce1db7d8654850c8a6606b95ffb1c7d22ce2] Merge remote branch 'upstream/stable-0.12' into stable-0.12
git bisect bad b874ce1db7d8654850c8a6606b95ffb1c7d22ce2


The following is the git bisect log between git bisect good
qemu-kvm-0.12.1.2 and git bisect bad qemu-kvm-0.12.3 instead:

git bisect start
# good: [a6ec654a7a863afff41b491a02ffd696c862cb41] Merge branch 'stable-0.12-upstream' into stable-0.12
git bisect good a6ec654a7a863afff41b491a02ffd696c862cb41
# bad: [69a5ecafa27daeb943dc2ee65b1470844f23f934] Merge branch 'stable-0.12-merge' into stable-0.12
git bisect bad 69a5ecafa27daeb943dc2ee65b1470844f23f934
# good: [d0d888bc6d1a106609b9af42ecb552c6c34a85c5] qcow2: Return 0/-errno in qcow2_alloc_cluster_offset
git bisect good d0d888bc6d1a106609b9af42ecb552c6c34a85c5
# good: [7ebc79037c5f426bfb08cc506670bf7dd3912430] virtio-net: fix network stall under load
git bisect good 7ebc79037c5f426bfb08cc506670bf7dd3912430
# good: [6173d56bdcb53389c54e803873e6bf8f87836a4f] Merge remote branch 'qemu-kvm/uq/stable-0.12' into stable-0.12
git bisect good 6173d56bdcb53389c54e803873e6bf8f87836a4f
# bad: [59691c0cb129c9aa955be22573b43b26534f9db4] KVM: Request setting of nmi_pending and sipi_vector
git bisect bad 59691c0cb129c9aa955be22573b43b26534f9db4
# bad: [dec2eb9d724b21581500aea911dd13f7bfbea59e] Fix kvm_load_mpstate for vcpu hot add
git bisect bad dec2eb9d724b21581500aea911dd13f7bfbea59e


Let me know if I have done anything wrong about the bisecting... ...

Thanks.

Kindest regards,
Giam Teck Choon


Re: [PATCH] test: add test for bsf/bsr instruction

2010-08-15 Thread Avi Kivity
 On 08/09/2010 01:01 PM, Wei Yongjun wrote:
 This patch add test for bsf/bsr instruction.


Applied, thanks.

-- 
error compiling committee.c: too many arguments to function



Re: hot plug memory in guest

2010-08-15 Thread Avi Kivity

 On 08/10/2010 05:53 PM, Gu, Zhongshu wrote:

Hi all:
 I want to dynamically register memory in a Linux guest
at runtime. I will compile the Linux kernel with sparse memory model
support. Does kvm support that kind of function? I am not sure how
Linux detects physical memory, or how a memslot is mapped to physical
memory. Through mc146818?


Memory hotplug is not yet supported.  Have a look at the balloon driver 
for similar functionality.


--
error compiling committee.c: too many arguments to function



Re: [PATCH] kvm: fix poison overwritten caused by using wrong xstate size

2010-08-15 Thread Avi Kivity

 On 08/14/2010 12:03 AM, H. Peter Anvin wrote:

Avi, do you want to take this one or should I?


I will, thanks.

--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] x86: Bail out on unemulated instructions

2010-08-15 Thread Avi Kivity

 On 08/15/2010 07:11 PM, Gleb Natapov wrote:



Neither are possible.  We can have one cpu in big real mode and
others in paged mode, so even in real mode we cannot rule out a
spurious page fault due to shadow write protection.


Correct, just checking X86EMUL_MODE_REAL is not enough due to smp, but
why would checking for big real mode not work? If an instruction can't be
emulated while the vcpu is in big real mode, returning to the vcpu is not an
option, so kvm will fail anyway.


Right.  I guess we can have an emulation_reason variable which explains 
why we are emulating (unvirtualizable state, mmu page fault, mmio page 
fault, unvirtualizable instruction) and decide accordingly.  But it's a 
lot of work.
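
To make the emulation_reason idea above a bit more concrete, here is a
minimal sketch in plain C. It is not kernel code: the enum, its members and
may_reexecute() are hypothetical names, and the retry policy is only one
reading of this thread (a spurious fault caused by shadow write protection
may be worth re-executing, while a failure in unvirtualizable state such as
big real mode is not).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: why the emulator was entered in the first place. */
enum emulation_reason {
    EMUL_REASON_UNVIRT_STATE,   /* e.g. big real mode the cpu cannot run */
    EMUL_REASON_MMU_FAULT,      /* shadow page write-protection fault */
    EMUL_REASON_MMIO_FAULT,     /* access to an emulated MMIO region */
    EMUL_REASON_UNVIRT_INSN,    /* instruction the cpu cannot virtualize */
    EMUL_REASON_COUNT
};

/*
 * Decide whether re-entering the guest at the same instruction can make
 * progress after emulation failed. Assumption: only the mmu fault case
 * can be resolved without emulating (by unprotecting the page).
 */
static bool may_reexecute(enum emulation_reason reason)
{
    assert((int)reason >= 0 && (int)reason < EMUL_REASON_COUNT);

    switch (reason) {
    case EMUL_REASON_MMU_FAULT:
        return true;
    case EMUL_REASON_UNVIRT_STATE:
    case EMUL_REASON_MMIO_FAULT:
    case EMUL_REASON_UNVIRT_INSN:
    default:
        return false;
    }
}
```

With something like this, the failure path could report an error right away
whenever may_reexecute() returns false instead of looping on the same
instruction.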


--
error compiling committee.c: too many arguments to function



Re: system_powerdown not working for qemu-kvm 0.12.4?

2010-08-15 Thread Avi Kivity

 On 08/15/2010 07:15 PM, Teck Choon Giam wrote:


Let me know if I have done anything wrong about the bisecting... ...



All looks fine, but what are the results?  git should say something like 
'x is first bad commit' which is the interesting part.


--
error compiling committee.c: too many arguments to function



MegaSAS 8708EM2 qemu-kvm.git tree updated to v0.12.5

2010-08-15 Thread Nicholas A. Bellinger
Greetings Hannes, hch and Co,

The latest code from upstream qemu-kvm.git v0.12.5 has been merged into
the megasas HBA emulation friendly qemu-kvm.git/master and scsi-bsg
branches at:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=summary

The merge commitdiffs for master and scsi-bsg can be found here:

http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=commitdiff;h=331578e7e362b33c965d469ea4577956dd431bbc
http://git.kernel.org/?p=virt/kvm/nab/qemu-kvm.git;a=commitdiff;h=2eebcfd04adda1fee641a776c9a85dda95c43b43

The megasas HBA emulation has been given a quick test with scsi-generic
and scsi-bsg backstores into TCM_Loop FILEIO LUNs with x86_64 v2.6.26
guests on a x86_64 v2.6.35 host.  So far things appear to be functioning
as expected with the megasas SGL passthrough logic and v0.12.5 upstream
qemu-kvm code.

There was also some new upstream code in hw/scsi-disk.c which does not
seem to be required with the SGL passthrough logic, and that I ended up
dropping for the v0.12.5 merge.  The code that was dropped during
the merge starts with scsi_command_complete() at:

http://git.kernel.org/?p=virt/kvm/qemu-kvm.git;a=blob;f=hw/scsi-disk.c;hb=HEAD#l101

and includes everything down to scsi_write_data().  Using scsi-disk was
also given a quick test and appears to be functioning as expected using
passthrough SGL logic and userspace QEMU SCSI CDB emulation.

Note there does appear to be some breakage with the SGL passthrough and
recent upstream changes with hw/lsi53c895a.c to use a local dma_buf
pointer and to get rid of LSIState->select_dev.  While I had verified
that the SGL passthrough code was working with lsi53c895a on v0.12.4
with scsi-generic+scsi-bsg backstores, this is now segfaulting for me
after the latest upstream merge.  I will need to have another look at
this, but if someone who has more knowledge with hw/lsi53c895a.c could
help out, it would be much appreciated.  ;)

Comments are welcome!

Best,

--nab



Re: [Qemu-devel] [PATCH 2/2] RESEND: Disable build of ivshmem on non-KVM systems

2010-08-15 Thread Blue Swirl
Thanks, applied.

On Sat, Aug 14, 2010 at 11:47 PM, Cam Macdonell c...@cs.ualberta.ca wrote:
 Signed-off-by: Cam Macdonell c...@cs.ualberta.ca
 ---
  Makefile.target |    2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/Makefile.target b/Makefile.target
 index b791492..c8281e9 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -191,7 +191,7 @@ obj-y += rtl8139.o
  obj-y += e1000.o

  # Inter-VM PCI shared memory
 -obj-y += ivshmem.o
 +obj-$(CONFIG_KVM) += ivshmem.o

  # Hardware support
  obj-i386-y += vga.o
 --
 1.6.2.5





[PATCH 0/7] AMD IOMMU emulation patches v3

2010-08-15 Thread Eduard - Gabriel Munteanu
Hi,

Please have a look at these and merge if you wish. I hope I've addressed the
issues people have raised.

Some changes from the previous RFC:
- included and updated the other two device patches
- moved map registration and invalidation management into PCI code
- AMD IOMMU emulation is always enabled (no more configure options)
- cleaned up code, I now use typedefs as suggested
- event logging cleanups

BTW, the change to pci_regs.h is properly aligned but the original file contains
tabs.


Cheers,
Eduard

Eduard - Gabriel Munteanu (7):
  pci: add range_covers_range()
  pci: memory access API and IOMMU support
  AMD IOMMU emulation
  ide: use the PCI memory access interface
  rtl8139: use the PCI memory access interface
  eepro100: use the PCI memory access interface
  ac97: use the PCI memory access interface

 Makefile.target   |2 +
 dma-helpers.c |   46 -
 dma.h |   21 ++-
 hw/ac97.c |6 +-
 hw/amd_iommu.c|  688 +
 hw/eepro100.c |   78 ---
 hw/ide/core.c |   15 +-
 hw/ide/internal.h |   39 +++
 hw/ide/pci.c  |7 +
 hw/pc.c   |2 +
 hw/pci.c  |  197 +++-
 hw/pci.h  |   84 +++
 hw/pci_ids.h  |2 +
 hw/pci_regs.h |1 +
 hw/rtl8139.c  |   99 +
 qemu-common.h |1 +
 16 files changed, 1191 insertions(+), 97 deletions(-)
 create mode 100644 hw/amd_iommu.c



[PATCH 1/7] pci: add range_covers_range()

2010-08-15 Thread Eduard - Gabriel Munteanu
This helper function allows map invalidation code to determine which
maps must be invalidated.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 hw/pci.h |   10 ++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/hw/pci.h b/hw/pci.h
index 4bd8a1a..5a6cdb5 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -419,6 +419,16 @@ static inline int range_covers_byte(uint64_t offset, uint64_t len,
     return offset <= byte && byte <= range_get_last(offset, len);
 }
 
+/* Check whether a given range completely covers another. */
+static inline int range_covers_range(uint64_t first_big, uint64_t len_big,
+                                     uint64_t first_small, uint64_t len_small)
+{
+    uint64_t last_big = range_get_last(first_big, len_big);
+    uint64_t last_small = range_get_last(first_small, len_small);
+
+    return first_big <= first_small && last_small <= last_big;
+}
+
 /* Check whether 2 given ranges overlap.
  * Undefined if ranges that wrap around 0. */
 static inline int ranges_overlap(uint64_t first1, uint64_t len1,
-- 
1.7.1
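
For readers who want to try the covering check outside the tree, here is a
small standalone sketch. range_get_last() is assumed to return the last byte
of the range (offset + len - 1); the assert on len is an addition for the
sketch and is not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Last byte of a range; assumes len > 0 and no wrap-around. */
static inline uint64_t range_get_last(uint64_t offset, uint64_t len)
{
    assert(len > 0);
    return offset + len - 1;
}

/* Check whether the "big" range completely covers the "small" one. */
static inline int range_covers_range(uint64_t first_big, uint64_t len_big,
                                     uint64_t first_small, uint64_t len_small)
{
    uint64_t last_big = range_get_last(first_big, len_big);
    uint64_t last_small = range_get_last(first_small, len_small);

    return first_big <= first_small && last_small <= last_big;
}
```

For example, an invalidation of [0x1000, 0x1fff] covers a map at
[0x1800, 0x18ff], but not one at [0x1f00, 0x20ff], which sticks out past the
end of the invalidated range.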



[PATCH 2/7] pci: memory access API and IOMMU support

2010-08-15 Thread Eduard - Gabriel Munteanu
PCI devices should access memory through pci_memory_*() instead of
cpu_physical_memory_*(). This also provides support for translation and
access checking in case an IOMMU is emulated.

Memory maps are treated as remote IOTLBs (that is, translation caches
belonging to the IOMMU-aware device itself). Clients (devices) must
provide callbacks for map invalidation in case these maps are
persistent beyond the current I/O context, e.g. AIO DMA transfers.
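
As a rough illustration of the access path described above, the sketch below
translates a bus address chunk by chunk and copies each chunk separately. It
is a toy model, not the QEMU code: toy_ram, toy_page_table, toy_translate()
and toy_memory_rw() are made-up names, the "IOMMU" is a four-entry page
table, and unlike pci_memory_rw() the helper returns an error code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16u
#define TOY_NPAGES    4u

/* Toy "physical memory" and a toy bus-page to physical-page table. */
static uint8_t toy_ram[TOY_PAGE_SIZE * TOY_NPAGES];
static const uint32_t toy_page_table[TOY_NPAGES] = { 2, 0, 3, 1 };

typedef int toy_translate_fn(uint64_t addr, uint64_t *paddr, uint64_t *plen);

/* Translate a bus address; *plen is how far the translation stays valid. */
static int toy_translate(uint64_t addr, uint64_t *paddr, uint64_t *plen)
{
    uint64_t page = addr / TOY_PAGE_SIZE;
    uint64_t off = addr % TOY_PAGE_SIZE;

    if (page >= TOY_NPAGES) {
        return -1;              /* no mapping: access is rejected */
    }
    *paddr = (uint64_t)toy_page_table[page] * TOY_PAGE_SIZE + off;
    *plen = TOY_PAGE_SIZE - off;
    return 0;
}

/* Read or write len bytes at bus address addr, one translation at a time. */
static int toy_memory_rw(toy_translate_fn *translate, uint64_t addr,
                         uint8_t *buf, uint64_t len, int is_write)
{
    while (len) {
        uint64_t paddr, plen;

        if (translate(addr, &paddr, &plen)) {
            return -1;
        }
        assert(plen > 0);       /* otherwise the loop would never end */
        if (plen > len) {
            plen = len;         /* translation may cover more than we need */
        }
        if (is_write) {
            memcpy(&toy_ram[paddr], buf, plen);
        } else {
            memcpy(buf, &toy_ram[paddr], plen);
        }
        len -= plen;
        addr += plen;
        buf += plen;
    }
    return 0;
}
```

A 24-byte write starting at bus address 8 is split into two copies here,
because bus pages 0 and 1 map to physical pages that are not adjacent.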

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 hw/pci.c  |  197 -
 hw/pci.h  |   74 +
 qemu-common.h |1 +
 3 files changed, 271 insertions(+), 1 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index 6871728..8668e06 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -58,6 +58,18 @@ struct PCIBus {
        Keep a count of the number of devices with raised IRQs.  */
     int nirq;
     int *irq_count;
+
+    PCIDevice                       *iommu;
+    PCITranslateFunc                *translate;
+};
+
+struct PCIMemoryMap {
+    pcibus_t                        addr;
+    pcibus_t                        len;
+    target_phys_addr_t              paddr;
+    PCIInvalidateMapFunc            *invalidate;
+    void                            *invalidate_opaque;
+    QLIST_ENTRY(PCIMemoryMap)       list;
 };
 
 static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent);
@@ -166,6 +178,19 @@ static void pci_device_reset(PCIDevice *dev)
     pci_update_mappings(dev);
 }
 
+static int pci_no_translate(PCIDevice *iommu,
+                            PCIDevice *dev,
+                            pcibus_t addr,
+                            target_phys_addr_t *paddr,
+                            target_phys_addr_t *len,
+                            unsigned perms)
+{
+    *paddr = addr;
+    *len = -1;
+
+    return 0;
+}
+
 static void pci_bus_reset(void *opaque)
 {
     PCIBus *bus = opaque;
@@ -227,7 +252,10 @@ void pci_bus_new_inplace(PCIBus *bus, DeviceState *parent,
                          const char *name, int devfn_min)
 {
     qbus_create_inplace(&bus->qbus, &pci_bus_info, parent, name);
-    bus->devfn_min = devfn_min;
+
+    bus->devfn_min  = devfn_min;
+    bus->iommu      = NULL;
+    bus->translate  = pci_no_translate;
 
     /* host bridge */
     QLIST_INIT(&bus->child);
@@ -2029,6 +2057,173 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent)
     }
 }
 
+void pci_register_iommu(PCIDevice *iommu,
+                        PCITranslateFunc *translate)
+{
+    iommu->bus->iommu = iommu;
+    iommu->bus->translate = translate;
+}
+
+void pci_memory_rw(PCIDevice *dev,
+                   pcibus_t addr,
+                   uint8_t *buf,
+                   pcibus_t len,
+                   int is_write)
+{
+    int err;
+    unsigned perms;
+    PCIDevice *iommu = dev->bus->iommu;
+    target_phys_addr_t paddr, plen;
+
+    perms = is_write ? IOMMU_PERM_WRITE : IOMMU_PERM_READ;
+
+    while (len) {
+        err = dev->bus->translate(iommu, dev, addr, &paddr, &plen, perms);
+        if (err)
+            return;
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len)
+            plen = len;
+
+        cpu_physical_memory_rw(paddr, buf, plen, is_write);
+
+        len -= plen;
+        addr += plen;
+        buf += plen;
+    }
+}
+
+static void pci_memory_register_map(PCIDevice *dev,
+                                    pcibus_t addr,
+                                    pcibus_t len,
+                                    target_phys_addr_t paddr,
+                                    PCIInvalidateMapFunc *invalidate,
+                                    void *invalidate_opaque)
+{
+    PCIMemoryMap *map;
+
+    map = qemu_malloc(sizeof(PCIMemoryMap));
+    map->addr               = addr;
+    map->len                = len;
+    map->paddr              = paddr;
+    map->invalidate         = invalidate;
+    map->invalidate_opaque  = invalidate_opaque;
+
+    QLIST_INSERT_HEAD(&dev->memory_maps, map, list);
+}
+
+static void pci_memory_unregister_map(PCIDevice *dev,
+                                      target_phys_addr_t paddr,
+                                      target_phys_addr_t len)
+{
+    PCIMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->memory_maps, list) {
+        if (map->paddr == paddr && map->len == len) {
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void pci_memory_invalidate_range(PCIDevice *dev,
+                                 pcibus_t addr,
+                                 pcibus_t len)
+{
+    PCIMemoryMap *map;
+
+    QLIST_FOREACH(map, &dev->memory_maps, list) {
+        if (range_covers_range(addr, len, map->addr, map->len)) {
+            map->invalidate(map->invalidate_opaque);
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void *pci_memory_map(PCIDevice *dev,
+ 

[PATCH 4/7] ide: use the PCI memory access interface

2010-08-15 Thread Eduard - Gabriel Munteanu
Emulated PCI IDE controllers now use the memory access interface. This
also allows an emulated IOMMU to translate and check accesses.

Map invalidation results in cancelling DMA transfers. Since the guest OS
can't properly recover the DMA results in case the mapping is changed,
this is a fairly good approximation.
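
The invalidation flow described above can be sketched in standalone C as
follows. This is a toy model with hypothetical names (toy_map,
toy_map_register(), toy_invalidate_range(), toy_cancel_transfer()), not the
code from this series: each registered map carries a callback, and
invalidating a range that covers the map runs the callback (here it just
flags the transfer as cancelled) and drops the map. Overflow of addr + len
is ignored for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef void toy_invalidate_fn(void *opaque);

struct toy_map {
    uint64_t addr;
    uint64_t len;
    toy_invalidate_fn *invalidate;  /* called when the map goes stale */
    void *opaque;
    struct toy_map *next;
};

/* Register a map at the head of a singly linked list. */
static struct toy_map *toy_map_register(struct toy_map **head,
                                        uint64_t addr, uint64_t len,
                                        toy_invalidate_fn *invalidate,
                                        void *opaque)
{
    struct toy_map *map = malloc(sizeof(*map));

    assert(map != NULL);
    map->addr = addr;
    map->len = len;
    map->invalidate = invalidate;
    map->opaque = opaque;
    map->next = *head;
    *head = map;
    return map;
}

/*
 * Invalidate every map fully covered by [addr, addr + len).
 * The pointer-to-pointer walk lets us unlink and free an entry
 * without touching it again afterwards. Returns the number dropped.
 */
static int toy_invalidate_range(struct toy_map **head,
                                uint64_t addr, uint64_t len)
{
    struct toy_map **link = head;
    int dropped = 0;

    while (*link) {
        struct toy_map *map = *link;

        if (addr <= map->addr && map->addr + map->len <= addr + len) {
            *link = map->next;
            map->invalidate(map->opaque);
            free(map);
            dropped++;
        } else {
            link = &map->next;
        }
    }
    return dropped;
}

/* Stand-in for cancelling an in-flight DMA transfer. */
static void toy_cancel_transfer(void *opaque)
{
    int *cancelled = opaque;

    *cancelled = 1;
}
```

A device model would register one map per outstanding transfer and pass its
own cancel routine as the callback.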

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 dma-helpers.c |   46 +-
 dma.h |   21 -
 hw/ide/core.c |   15 ---
 hw/ide/internal.h |   39 +++
 hw/ide/pci.c  |7 +++
 5 files changed, 115 insertions(+), 13 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index d4fc077..9c3a21a 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -10,12 +10,36 @@
 #include "dma.h"
 #include "block_int.h"
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
+static void *qemu_sglist_default_map(void *opaque,
+                                     QEMUSGInvalMapFunc *inval_cb,
+                                     void *inval_opaque,
+                                     target_phys_addr_t addr,
+                                     target_phys_addr_t *len,
+                                     int is_write)
+{
+    return cpu_physical_memory_map(addr, len, is_write);
+}
+
+static void qemu_sglist_default_unmap(void *opaque,
+                                      void *buffer,
+                                      target_phys_addr_t len,
+                                      int is_write,
+                                      target_phys_addr_t access_len)
+{
+    cpu_physical_memory_unmap(buffer, len, is_write, access_len);
+}
+
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint,
+                      QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap, void *opaque)
 {
     qsg->sg = qemu_malloc(alloc_hint * sizeof(ScatterGatherEntry));
     qsg->nsg = 0;
     qsg->nalloc = alloc_hint;
     qsg->size = 0;
+
+    qsg->map = map ? map : qemu_sglist_default_map;
+    qsg->unmap = unmap ? unmap : qemu_sglist_default_unmap;
+    qsg->opaque = opaque;
 }
 
 void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
@@ -73,12 +97,23 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
     int i;
 
     for (i = 0; i < dbs->iov.niov; ++i) {
-        cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
-                                  dbs->iov.iov[i].iov_len, !dbs->is_write,
-                                  dbs->iov.iov[i].iov_len);
+        dbs->sg->unmap(dbs->sg->opaque,
+                       dbs->iov.iov[i].iov_base,
+                       dbs->iov.iov[i].iov_len, !dbs->is_write,
+                       dbs->iov.iov[i].iov_len);
     }
 }
 
+static void dma_bdrv_cancel(void *opaque)
+{
+    DMAAIOCB *dbs = opaque;
+
+    bdrv_aio_cancel(dbs->acb);
+    dma_bdrv_unmap(dbs);
+    qemu_iovec_destroy(&dbs->iov);
+    qemu_aio_release(dbs);
+}
+
 static void dma_bdrv_cb(void *opaque, int ret)
 {
     DMAAIOCB *dbs = (DMAAIOCB *)opaque;
@@ -100,7 +135,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
     while (dbs->sg_cur_index < dbs->sg->nsg) {
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
-        mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->is_write);
+        mem = dbs->sg->map(dbs->sg->opaque, dma_bdrv_cancel, dbs,
+                           cur_addr, &cur_len, !dbs->is_write);
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);
diff --git a/dma.h b/dma.h
index f3bb275..d48f35c 100644
--- a/dma.h
+++ b/dma.h
@@ -15,6 +15,19 @@
 #include "hw/hw.h"
 #include "block.h"
 
+typedef void QEMUSGInvalMapFunc(void *opaque);
+typedef void *QEMUSGMapFunc(void *opaque,
+                            QEMUSGInvalMapFunc *inval_cb,
+                            void *inval_opaque,
+                            target_phys_addr_t addr,
+                            target_phys_addr_t *len,
+                            int is_write);
+typedef void QEMUSGUnmapFunc(void *opaque,
+                             void *buffer,
+                             target_phys_addr_t len,
+                             int is_write,
+                             target_phys_addr_t access_len);
+
 typedef struct {
     target_phys_addr_t base;
     target_phys_addr_t len;
@@ -25,9 +38,15 @@ typedef struct {
     int nsg;
     int nalloc;
     target_phys_addr_t size;
+
+    QEMUSGMapFunc *map;
+    QEMUSGUnmapFunc *unmap;
+    void *opaque;
 } QEMUSGList;
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint,
+                      QEMUSGMapFunc *map, QEMUSGUnmapFunc *unmap,
+                      void *opaque);
 void qemu_sglist_add(QEMUSGList *qsg, target_phys_addr_t base,
                      target_phys_addr_t len);
 void qemu_sglist_destroy(QEMUSGList *qsg);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 

[PATCH 3/7] AMD IOMMU emulation

2010-08-15 Thread Eduard - Gabriel Munteanu
This introduces emulation for the AMD IOMMU, described in AMD I/O
Virtualization Technology (IOMMU) Specification.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 Makefile.target |2 +
 hw/amd_iommu.c  |  688 +++
 hw/pc.c |2 +
 hw/pci_ids.h|2 +
 hw/pci_regs.h   |1 +
 5 files changed, 695 insertions(+), 0 deletions(-)
 create mode 100644 hw/amd_iommu.c

diff --git a/Makefile.target b/Makefile.target
index 70a9c1b..6b80a37 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -219,6 +219,8 @@ obj-i386-y += pcspk.o i8254.o
 obj-i386-$(CONFIG_KVM_PIT) += i8254-kvm.o
 obj-i386-$(CONFIG_KVM_DEVICE_ASSIGNMENT) += device-assignment.o
 
+obj-i386-y += amd_iommu.o
+
 # Hardware support
 obj-ia64-y += ide.o pckbd.o vga.o $(SOUND_HW) dma.o $(AUDIODRV)
 obj-ia64-y += fdc.o mc146818rtc.o serial.o i8259.o ipf.o
diff --git a/hw/amd_iommu.c b/hw/amd_iommu.c
new file mode 100644
index 0000000..2e20888
--- /dev/null
+++ b/hw/amd_iommu.c
@@ -0,0 +1,688 @@
+/*
+ * AMD IOMMU emulation
+ *
+ * Copyright (c) 2010 Eduard - Gabriel Munteanu
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "pc.h"
+#include "hw.h"
+#include "pci.h"
+#include "qlist.h"
+
+/* Capability registers */
+#define CAPAB_HEADER                0x00
+#define   CAPAB_REV_TYPE            0x02
+#define   CAPAB_FLAGS               0x03
+#define CAPAB_BAR_LOW               0x04
+#define CAPAB_BAR_HIGH              0x08
+#define CAPAB_RANGE                 0x0C
+#define CAPAB_MISC                  0x10
+
+#define CAPAB_SIZE                  0x14
+
+/* Capability header data */
+#define CAPAB_FLAG_IOTLBSUP         (1 << 0)
+#define CAPAB_FLAG_HTTUNNEL         (1 << 1)
+#define CAPAB_FLAG_NPCACHE          (1 << 2)
+#define CAPAB_INIT_REV              (1 << 3)
+#define CAPAB_INIT_TYPE             3
+#define CAPAB_INIT_REV_TYPE         (CAPAB_REV | CAPAB_TYPE)
+#define CAPAB_INIT_FLAGS            (CAPAB_FLAG_NPCACHE | CAPAB_FLAG_HTTUNNEL)
+#define CAPAB_INIT_MISC             (64 << 15) | (48 << 8)
+#define CAPAB_BAR_MASK              ~((1UL << 14) - 1)
+
+/* MMIO registers */
+#define MMIO_DEVICE_TABLE           0x0000
+#define MMIO_COMMAND_BASE           0x0008
+#define MMIO_EVENT_BASE             0x0010
+#define MMIO_CONTROL                0x0018
+#define MMIO_EXCL_BASE              0x0020
+#define MMIO_EXCL_LIMIT             0x0028
+#define MMIO_COMMAND_HEAD           0x2000
+#define MMIO_COMMAND_TAIL           0x2008
+#define MMIO_EVENT_HEAD             0x2010
+#define MMIO_EVENT_TAIL             0x2018
+#define MMIO_STATUS                 0x2020
+
+#define MMIO_SIZE                   0x4000
+
+#define MMIO_DEVTAB_SIZE_MASK       ((1ULL << 12) - 1)
+#define MMIO_DEVTAB_BASE_MASK       (((1ULL << 52) - 1) & ~MMIO_DEVTAB_SIZE_MASK)
+#define MMIO_DEVTAB_ENTRY_SIZE      32
+#define MMIO_DEVTAB_SIZE_UNIT       4096
+
+#define MMIO_CMDBUF_SIZE_BYTE       (MMIO_COMMAND_BASE + 7)
+#define MMIO_CMDBUF_SIZE_MASK       0x0F
+#define MMIO_CMDBUF_BASE_MASK       MMIO_DEVTAB_BASE_MASK
+#define MMIO_CMDBUF_DEFAULT_SIZE    8
+#define MMIO_CMDBUF_HEAD_MASK       (((1ULL << 19) - 1) & ~0x0F)
+#define MMIO_CMDBUF_TAIL_MASK       MMIO_EVTLOG_HEAD_MASK
+
+#define MMIO_EVTLOG_SIZE_BYTE       (MMIO_EVENT_BASE + 7)
+#define MMIO_EVTLOG_SIZE_MASK       MMIO_CMDBUF_SIZE_MASK
+#define MMIO_EVTLOG_BASE_MASK       MMIO_CMDBUF_BASE_MASK
+#define MMIO_EVTLOG_DEFAULT_SIZE    MMIO_CMDBUF_DEFAULT_SIZE
+#define MMIO_EVTLOG_HEAD_MASK       (((1ULL << 19) - 1) & ~0x0F)
+#define MMIO_EVTLOG_TAIL_MASK       MMIO_EVTLOG_HEAD_MASK
+
+#define MMIO_EXCL_BASE_MASK         MMIO_DEVTAB_BASE_MASK
+#define MMIO_EXCL_ENABLED_MASK      (1ULL << 0)
+#define MMIO_EXCL_ALLOW_MASK        (1ULL << 1)
+#define MMIO_EXCL_LIMIT_MASK        MMIO_DEVTAB_BASE_MASK
+#define MMIO_EXCL_LIMIT_LOW         0xFFF
+
+#define MMIO_CONTROL_IOMMUEN        (1ULL << 0)
+#define MMIO_CONTROL_HTTUNEN        (1ULL << 1)
+#define MMIO_CONTROL_EVENTLOGEN     (1ULL << 2)
+#define MMIO_CONTROL_EVENTINTEN     (1ULL << 3)
+#define MMIO_CONTROL_COMWAITINTEN   (1ULL << 4)
+#define 

[PATCH 6/7] eepro100: use the PCI memory access interface

2010-08-15 Thread Eduard - Gabriel Munteanu
This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 hw/eepro100.c |   78 ++---
 1 files changed, 41 insertions(+), 37 deletions(-)

diff --git a/hw/eepro100.c b/hw/eepro100.c
index 97afa2c..6e23271 100644
--- a/hw/eepro100.c
+++ b/hw/eepro100.c
@@ -306,10 +306,10 @@ static const uint16_t eepro100_mdi_mask[] = {
 };
 
 /* XXX: optimize */
-static void stl_le_phys(target_phys_addr_t addr, uint32_t val)
+static void stl_le_phys(EEPRO100State * s, pcibus_t addr, uint32_t val)
 {
     val = cpu_to_le32(val);
-    cpu_physical_memory_write(addr, (const uint8_t *)&val, sizeof(val));
+    pci_memory_write(&s->dev, addr, (const uint8_t *)&val, sizeof(val));
 }
 
 #define POLYNOMIAL 0x04c11db6
@@ -692,12 +692,12 @@ static void dump_statistics(EEPRO100State * s)
      * values which really matter.
      * Number of data should check configuration!!!
      */
-    cpu_physical_memory_write(s->statsaddr,
-                              (uint8_t *) & s->statistics, s->stats_size);
-    stl_le_phys(s->statsaddr + 0, s->statistics.tx_good_frames);
-    stl_le_phys(s->statsaddr + 36, s->statistics.rx_good_frames);
-    stl_le_phys(s->statsaddr + 48, s->statistics.rx_resource_errors);
-    stl_le_phys(s->statsaddr + 60, s->statistics.rx_short_frame_errors);
+    pci_memory_write(&s->dev, s->statsaddr,
+                     (uint8_t *) & s->statistics, s->stats_size);
+    stl_le_phys(s, s->statsaddr + 0, s->statistics.tx_good_frames);
+    stl_le_phys(s, s->statsaddr + 36, s->statistics.rx_good_frames);
+    stl_le_phys(s, s->statsaddr + 48, s->statistics.rx_resource_errors);
+    stl_le_phys(s, s->statsaddr + 60, s->statistics.rx_short_frame_errors);
 #if 0
     stw_le_phys(s->statsaddr + 76, s->statistics.xmt_tco_frames);
     stw_le_phys(s->statsaddr + 78, s->statistics.rcv_tco_frames);
@@ -707,7 +707,8 @@ static void dump_statistics(EEPRO100State * s)
 
 static void read_cb(EEPRO100State *s)
 {
-    cpu_physical_memory_read(s->cb_address, (uint8_t *) &s->tx, sizeof(s->tx));
+    pci_memory_read(&s->dev,
+                    s->cb_address, (uint8_t *) &s->tx, sizeof(s->tx));
     s->tx.status = le16_to_cpu(s->tx.status);
     s->tx.command = le16_to_cpu(s->tx.command);
     s->tx.link = le32_to_cpu(s->tx.link);
@@ -737,18 +738,18 @@ static void tx_command(EEPRO100State *s)
 }
 assert(tcb_bytes = sizeof(buf));
 while (size  tcb_bytes) {
-uint32_t tx_buffer_address = ldl_phys(tbd_address);
-uint16_t tx_buffer_size = lduw_phys(tbd_address + 4);
+uint32_t tx_buffer_address = pci_ldl(s-dev, tbd_address);
+uint16_t tx_buffer_size = pci_lduw(s-dev, tbd_address + 4);
 #if 0
-uint16_t tx_buffer_el = lduw_phys(tbd_address + 6);
+uint16_t tx_buffer_el = pci_lduw(s-dev, tbd_address + 6);
 #endif
 tbd_address += 8;
 TRACE(RXTX, logout
 (TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n,
  tx_buffer_address, tx_buffer_size));
 tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
-cpu_physical_memory_read(tx_buffer_address, buf[size],
- tx_buffer_size);
+pci_memory_read(s-dev,
+tx_buffer_address, buf[size], tx_buffer_size);
 size += tx_buffer_size;
 }
 if (tbd_array == 0x) {
@@ -759,16 +760,16 @@ static void tx_command(EEPRO100State *s)
 if (s->has_extended_tcb_support && !(s->configuration[6] & BIT(4))) {
 /* Extended Flexible TCB. */
 for (; tbd_count < 2; tbd_count++) {
-uint32_t tx_buffer_address = ldl_phys(tbd_address);
-uint16_t tx_buffer_size = lduw_phys(tbd_address + 4);
-uint16_t tx_buffer_el = lduw_phys(tbd_address + 6);
+uint32_t tx_buffer_address = pci_ldl(&s->dev, tbd_address);
+uint16_t tx_buffer_size = pci_lduw(&s->dev, tbd_address + 4);
+uint16_t tx_buffer_el = pci_lduw(&s->dev, tbd_address + 6);
 tbd_address += 8;
 TRACE(RXTX, logout
 ("TBD (extended flexible mode): buffer address 0x%08x, size 0x%04x\n",
  tx_buffer_address, tx_buffer_size));
 tx_buffer_size = MIN(tx_buffer_size, sizeof(buf) - size);
-cpu_physical_memory_read(tx_buffer_address, &buf[size],
- tx_buffer_size);
+pci_memory_read(&s->dev,
+tx_buffer_address, &buf[size], tx_buffer_size);
 size += tx_buffer_size;
 if (tx_buffer_el & 1) {
 break;
@@ -777,16 +778,16 @@ static void tx_command(EEPRO100State *s)
 }
 tbd_address = tbd_array;
 for (; tbd_count < s->tx.tbd_count; tbd_count++) {
-uint32_t tx_buffer_address = ldl_phys(tbd_address);
-uint16_t tx_buffer_size = 

[PATCH 7/7] ac97: use the PCI memory access interface

2010-08-15 Thread Eduard - Gabriel Munteanu
This allows the device to work properly with an emulated IOMMU.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 hw/ac97.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/ac97.c b/hw/ac97.c
index 4319bc8..9ee4894 100644
--- a/hw/ac97.c
+++ b/hw/ac97.c
@@ -223,7 +223,7 @@ static void fetch_bd (AC97LinkState *s, AC97BusMasterRegs *r)
 {
 uint8_t b[8];
 
-cpu_physical_memory_read (r->bdbar + r->civ * 8, b, 8);
+pci_memory_read (&s->dev, r->bdbar + r->civ * 8, b, 8);
 r->bd_valid = 1;
 r->bd.addr = le32_to_cpu (*(uint32_t *) &b[0]) & ~3;
 r->bd.ctl_len = le32_to_cpu (*(uint32_t *) &b[4]);
@@ -972,7 +972,7 @@ static int write_audio (AC97LinkState *s, AC97BusMasterRegs *r,
 while (temp) {
 int copied;
 to_copy = audio_MIN (temp, sizeof (tmpbuf));
-cpu_physical_memory_read (addr, tmpbuf, to_copy);
+pci_memory_read (&s->dev, addr, tmpbuf, to_copy);
 copied = AUD_write (s->voice_po, tmpbuf, to_copy);
 dolog ("write_audio max=%x to_copy=%x copied=%x\n",
 max, to_copy, copied);
@@ -1056,7 +1056,7 @@ static int read_audio (AC97LinkState *s, AC97BusMasterRegs *r,
 *stop = 1;
 break;
 }
-cpu_physical_memory_write (addr, tmpbuf, acquired);
+pci_memory_write (&s->dev, addr, tmpbuf, acquired);
 temp -= acquired;
 addr += acquired;
 nread += acquired;
-- 
1.7.1
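
For readers following the series: a minimal sketch of why routing device
DMA through a per-device accessor matters once an IOMMU sits in between.
This is a toy model only. ToyPCIDevice, toy_pci_memory_read() and
toy_iommu_translate() are hypothetical names invented for illustration
and are not the API added by this patch series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy model, not QEMU code. */
typedef uint64_t (*toy_translate_fn)(uint64_t dev_addr);

typedef struct ToyPCIDevice {
    toy_translate_fn translate;   /* NULL: no IOMMU, identity mapping */
} ToyPCIDevice;

static uint8_t toy_ram[256];      /* stand-in for guest physical memory */

/* A trivial "IOMMU": every device address is remapped by a fixed offset. */
static uint64_t toy_iommu_translate(uint64_t dev_addr)
{
    return dev_addr + 128;
}

/*
 * Device-side read. The device passes the address it was programmed
 * with; the accessor decides which physical address that refers to.
 * No bounds checking is done in this sketch.
 */
static void toy_pci_memory_read(ToyPCIDevice *dev, uint64_t addr,
                                uint8_t *buf, size_t len)
{
    uint64_t phys = dev->translate ? dev->translate(addr) : addr;

    memcpy(buf, &toy_ram[phys], len);
}
```

A device that called a raw physical-memory read directly would bypass
the translate step entirely, which is the situation the conversion in
this patch avoids.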



[PATCH 1/2] Split region allocation code from pci_bios_init_device()

2010-08-15 Thread Eduard - Gabriel Munteanu
pci_bios_alloc() can be used to allocate space in the PCI region for
other purposes. This is needed by the AMD IOMMU support code.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 src/pciinit.c |   17 +
 1 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/pciinit.c b/src/pciinit.c
index 0556ee2..bfc669f 100644
--- a/src/pciinit.c
+++ b/src/pciinit.c
@@ -75,6 +75,16 @@ static void pci_bios_init_bridges(u16 bdf)
 }
 }
 
+static inline u32 pci_bios_alloc(u32 *region, u32 size)
+{
+u32 ret;
+
+ret = ALIGN(*region, size);
+*region = ret + size;
+
+return ret;
+}
+
 static void pci_bios_init_device(u16 bdf)
 {
 int class;
@@ -146,14 +156,13 @@ static void pci_bios_init_device(u16 bdf)
 pci_config_writel(bdf, ofs, old);
 
 if (val != 0) {
-u32 size = (~(val & mask)) + 1;
+u32 base, size = (~(val & mask)) + 1;
 if (val & PCI_BASE_ADDRESS_SPACE_IO)
 paddr = &pci_bios_io_addr;
 else
 paddr = &pci_bios_mem_addr;
-*paddr = ALIGN(*paddr, size);
-pci_set_io_region_addr(bdf, i, *paddr);
-*paddr += size;
+base = pci_bios_alloc(paddr, size);
+pci_set_io_region_addr(bdf, i, base);
 }
 }
 break;
-- 
1.7.1
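
A small standalone sketch of the align-then-bump allocation that
pci_bios_alloc() performs, for anyone who wants to try the arithmetic
outside the BIOS. ALIGN_UP() and region_alloc() are illustrative names,
and the sketch assumes size is a power of two, as PCI BAR sizes are.

```c
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((uint32_t)(a) - 1)) & ~((uint32_t)(a) - 1))

/*
 * Return a size-aligned base taken from *region and advance *region
 * past the allocation. Mirrors the shape of pci_bios_alloc() above.
 */
static uint32_t region_alloc(uint32_t *region, uint32_t size)
{
    uint32_t ret = ALIGN_UP(*region, size);

    *region = ret + size;
    return ret;
}
```

Starting from a cursor of 0xC010, a 0x100-byte request yields 0xC100 and
moves the cursor to 0xC200; a following 0x1000-byte request yields
0xD000. The gap between 0xC200 and 0xD000 is simply skipped, which is
the usual cost of this kind of allocator.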



[PATCH 2/2] AMD IOMMU support

2010-08-15 Thread Eduard - Gabriel Munteanu
This initializes the AMD IOMMU and creates ACPI tables for it.

Signed-off-by: Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
---
 Makefile   |2 +-
 src/acpi.c |   79 
 src/iommu.c|   64 +
 src/iommu.h|   12 
 src/pci.h  |4 +++
 src/pci_ids.h  |1 +
 src/pci_regs.h |1 +
 src/pciinit.c  |   11 
 8 files changed, 173 insertions(+), 1 deletions(-)
 create mode 100644 src/iommu.c
 create mode 100644 src/iommu.h

diff --git a/Makefile b/Makefile
index fe0c1ce..98f253d 100644
--- a/Makefile
+++ b/Makefile
@@ -14,7 +14,7 @@ OUT=out/
 SRCBOTH=misc.c pmm.c stacks.c output.c util.c block.c floppy.c ata.c mouse.c \
 kbd.c pci.c serial.c clock.c pic.c cdrom.c ps2port.c smp.c resume.c \
 pnpbios.c pirtable.c vgahooks.c ramdisk.c pcibios.c blockcmd.c \
-usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c
+usb.c usb-uhci.c usb-ohci.c usb-ehci.c usb-hid.c usb-msc.c iommu.c
 SRC16=$(SRCBOTH) system.c disk.c apm.c font.c
 SRC32FLAT=$(SRCBOTH) post.c shadow.c memmap.c coreboot.c boot.c \
   acpi.c smm.c mptable.c smbios.c pciinit.c optionroms.c mtrr.c \
diff --git a/src/acpi.c b/src/acpi.c
index 0559443..7ea9c55 100644
--- a/src/acpi.c
+++ b/src/acpi.c
@@ -6,6 +6,7 @@
 // This file may be distributed under the terms of the GNU LGPLv3 license.
 
 #include "acpi.h" // struct rsdp_descriptor
+#include "iommu.h"
 #include "util.h" // memcpy
 #include "pci.h" // pci_find_device
 #include "biosvar.h" // GET_EBDA
@@ -268,6 +269,36 @@ struct srat_memory_affinity
 u32    reserved3[2];
 } PACKED;
 
+/*
+ * IVRS (I/O Virtualization Reporting Structure) table.
+ *
+ * Describes the AMD IOMMU, as per:
+ * AMD I/O Virtualization Technology (IOMMU) Specification, rev 1.26
+ */
+
+struct ivrs_ivhd
+{
+u8    type;
+u8    flags;
+u16   length;
+u16   devid;
+u16   capab_off;
+u32   iommu_base_low;
+u32   iommu_base_high;
+u16   pci_seg_group;
+u16   iommu_info;
+u32   reserved;
+u8    entry[0];
+} PACKED;
+
+struct ivrs_table
+{
+ACPI_TABLE_HEADER_DEF    /* ACPI common table header. */
+u32                iv_info;
+u32                reserved[2];
+struct ivrs_ivhd   ivhd;
+} PACKED;
+
 #include acpi-dsdt.hex
 
 static inline u16 cpu_to_le16(u16 x)
@@ -599,6 +630,53 @@ build_srat(void)
 return srat;
 }
 
+#define IVRS_SIGNATURE 0x53525649 // IVRS
+#define IVRS_MAX_DEVS  32
+static void *
+build_ivrs(void)
+{
+int iommu_bdf, bdf, max, i;
+struct ivrs_table *ivrs;
+struct ivrs_ivhd *ivhd;
+
+iommu_bdf = pci_find_class(PCI_CLASS_SYSTEM_IOMMU);
+if (iommu_bdf < 0)
+return NULL;
+
+ivrs = malloc_high(sizeof(struct ivrs_table) + 4 * IVRS_MAX_DEVS);
+ivrs->iv_info = iommu_get_misc() & ~0x000F;
+
+ivhd = &ivrs->ivhd;
+ivhd->type            = 0x10;
+ivhd->flags           = 0;
+ivhd->length          = sizeof(struct ivrs_ivhd);
+ivhd->devid           = iommu_get_bdf();
+ivhd->capab_off       = iommu_get_cap_offset();
+ivhd->iommu_base_low  = iommu_get_base();
+ivhd->iommu_base_high = 0;
+ivhd->pci_seg_group   = 0;
+ivhd->iommu_info      = 0;
+ivhd->reserved        = 0;
+
+i = 0;
+foreachpci(bdf, max) {
+if (bdf == ivhd->devid)
+continue;
+ivhd->entry[4 * i + 0] = 2;
+ivhd->entry[4 * i + 1] = bdf & 0xFF;
+ivhd->entry[4 * i + 2] = (bdf >> 8) & 0xFF;
+ivhd->entry[4 * i + 3] = ~(1 << 3);
+ivhd->length += 4;
+if (++i >= IVRS_MAX_DEVS)
+break;
+}
+
+build_header((void *) ivrs, IVRS_SIGNATURE,
+ sizeof(struct ivrs_table) + 4 * i, 1);
+
+return ivrs;
+}
+
 struct rsdp_descriptor *RsdpAddr;
 
 #define MAX_ACPI_TABLES 20
@@ -639,6 +717,7 @@ acpi_bios_init(void)
 ACPI_INIT_TABLE(build_madt());
 ACPI_INIT_TABLE(build_hpet());
 ACPI_INIT_TABLE(build_srat());
+ACPI_INIT_TABLE(build_ivrs());
 
 u16 i, external_tables = qemu_cfg_acpi_additional_tables();
 
diff --git a/src/iommu.c b/src/iommu.c
new file mode 100644
index 000..97af24a
--- /dev/null
+++ b/src/iommu.c
@@ -0,0 +1,64 @@
+// AMD IOMMU initialization code.
+//
+// Copyright (C) 2010  Eduard - Gabriel Munteanu eduard.munte...@linux360.ro
+//
+// This file may be distributed under the terms of the GNU LGPLv3 license.
+
+#include "iommu.h"
+#include "pci.h"
+#include "types.h"
+
+#define IOMMU_CAP_BAR_LOW   0x04
+#define IOMMU_CAP_BAR_HIGH  0x08
+#define IOMMU_CAP_RANGE 0x0C
+#define IOMMU_CAP_MISC  0x10
+
+static int iommu_bdf = -1;
+static u8 iommu_cap_offset;
+static u32 iommu_base;
+
+void iommu_init(int bdf, u32 base)
+{
+u8 ptr, cap, type;
+
+/* Only one IOMMU is supported. */
+if (iommu_bdf >= 0)
+return;
+
+foreachcap(bdf, ptr, cap) {
+type = pci_config_readb(bdf, cap);
+if (type == 

Re: [Qemu-devel] [PATCH 7/7] ac97: use the PCI memory access interface

2010-08-15 Thread malc
On Sun, 15 Aug 2010, Eduard - Gabriel Munteanu wrote:

 This allows the device to work properly with an emulated IOMMU.

Fine with me.

[..snip..]

-- 
mailto:av1...@comtv.ru


Relationship between libkvm and qemu-kvm.c

2010-08-15 Thread SHEN Hao
Hello, everyone,

I am a little confused by the qemu-kvm project: I found some similar
code in both libkvm and qemu-kvm.c. Is libkvm really used by qemu?
What is the relationship between them?

Best regards,
-- 
Hao Shen


[RFC PATCH v3 0/4] Real mode interrupt injection

2010-08-15 Thread Mohammed Gamal
This patch introduces real mode interrupt injection for VMX.
It currently invokes the x86 emulator to emulate interrupts
instead of manually setting VMX controls.

Needless to say, this is not meant for merging in its current state.
The emulator still needs some more work to get this completely operational.

Mohammed Gamal (4):
  x86 emulator: Expose emulate_int_real()
  x86: Separate emulation context initialization in a separate function
  x86: Add kvm_inject_realmode_interrupt() wrapper
  VMX: Emulated real mode interrupt injection

 arch/x86/include/asm/kvm_emulate.h |3 +-
 arch/x86/kvm/vmx.c |   65 +++
 arch/x86/kvm/x86.c |   75 ++--
 arch/x86/kvm/x86.h |1 +
 4 files changed, 55 insertions(+), 89 deletions(-)
---
Changes since v2:
- Refactored emulation context initialization code
- Commit eip value from the decode cache to the emulation context in x86.c 
rather than the emulator
- Add kvm_* prefix to inject_realmode_interrupt() global symbol for consistency
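
For context, a rough sketch of what delivering an interrupt in real mode
involves, which is the work the series hands to the emulator. This is a
standalone toy, not the kernel's emulate_int_real(): struct rm_cpu,
rm_mem and the helper names are made up for illustration, the IVT is
assumed to live at linear address 0, and addresses simply wrap at 1 MiB.

```c
#include <stdint.h>

#define RM_FLAG_TF 0x0100u
#define RM_FLAG_IF 0x0200u

struct rm_cpu {
    uint16_t ip, cs, ss, sp, flags;
};

static uint8_t rm_mem[1 << 20];   /* toy 1 MiB real-mode address space */

static uint16_t rm_read16(uint32_t addr)
{
    return (uint16_t)(rm_mem[addr & 0xFFFFF] |
                      (rm_mem[(addr + 1) & 0xFFFFF] << 8));
}

static void rm_write16(uint32_t addr, uint16_t val)
{
    rm_mem[addr & 0xFFFFF] = (uint8_t)(val & 0xFF);
    rm_mem[(addr + 1) & 0xFFFFF] = (uint8_t)(val >> 8);
}

static void rm_push16(struct rm_cpu *cpu, uint16_t val)
{
    cpu->sp = (uint16_t)(cpu->sp - 2);
    rm_write16(((uint32_t)cpu->ss << 4) + cpu->sp, val);
}

/* Push FLAGS, CS, IP; mask IF/TF; load CS:IP from IVT entry 4 * vector. */
static void rm_deliver_interrupt(struct rm_cpu *cpu, uint8_t vector)
{
    uint32_t entry = (uint32_t)vector * 4;

    rm_push16(cpu, cpu->flags);
    rm_push16(cpu, cpu->cs);
    rm_push16(cpu, cpu->ip);
    cpu->flags = (uint16_t)(cpu->flags & ~(RM_FLAG_IF | RM_FLAG_TF));
    cpu->ip = rm_read16(entry);
    cpu->cs = rm_read16(entry + 2);
}
```

The sketch ignores stack faults and the IDT limit check, so it only
shows the happy path.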


[RFC PATCH v3 1/4] x86 emulator: Expose emulate_int_real()

2010-08-15 Thread Mohammed Gamal
Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
---
 arch/x86/include/asm/kvm_emulate.h |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/kvm_emulate.h 
b/arch/x86/include/asm/kvm_emulate.h
index f22e5da..6a7cce0 100644
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -255,5 +255,6 @@ int x86_emulate_insn(struct x86_emulate_ctxt *ctxt);
 int emulator_task_switch(struct x86_emulate_ctxt *ctxt,
 u16 tss_selector, int reason,
 bool has_error_code, u32 error_code);
-
+int emulate_int_real(struct x86_emulate_ctxt *ctxt,
+struct x86_emulate_ops *ops, int irq);
 #endif /* _ASM_X86_KVM_X86_EMULATE_H */
-- 
1.7.0.4



[RFC PATCH v3 2/4] x86: Separate emulation context initialization in a separate function

2010-08-15 Thread Mohammed Gamal
The code for initializing the emulation context is duplicated in two
places (emulate_instruction() and kvm_task_switch()). Move it into a
separate function and call that function from both places.

Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
---
 arch/x86/kvm/x86.c |   54 ---
 1 files changed, 25 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1722d37..f24e594 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3936,6 +3936,28 @@ static void inject_emulated_exception(struct kvm_vcpu *vcpu)
        kvm_queue_exception(vcpu, ctxt->exception);
 }
 
+static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
+{
+   struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
+   int cs_db, cs_l;
+
+   cache_all_regs(vcpu);
+
+   kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+
+   vcpu->arch.emulate_ctxt.vcpu = vcpu;
+   vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
+   vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
+   vcpu->arch.emulate_ctxt.mode =
+       (!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
+       (vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
+       ? X86EMUL_MODE_VM86 : cs_l
+       ? X86EMUL_MODE_PROT64 : cs_db
+       ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
+   memset(c, 0, sizeof(struct decode_cache));
+   memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
+}
+
 static int handle_emulation_failure(struct kvm_vcpu *vcpu)
 {
        ++vcpu->stat.insn_emulation_fail;
@@ -3992,20 +4014,7 @@ int emulate_instruction(struct kvm_vcpu *vcpu,
cache_all_regs(vcpu);
 
        if (!(emulation_type & EMULTYPE_NO_DECODE)) {
-       int cs_db, cs_l;
-       kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
-
-       vcpu->arch.emulate_ctxt.vcpu = vcpu;
-       vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
-       vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
-       vcpu->arch.emulate_ctxt.mode =
-           (!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
-           (vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
-           ? X86EMUL_MODE_VM86 : cs_l
-           ? X86EMUL_MODE_PROT64 : cs_db
-           ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
-       memset(c, 0, sizeof(struct decode_cache));
-       memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
+       init_emulate_ctxt(vcpu);
                vcpu->arch.emulate_ctxt.interruptibility = 0;
                vcpu->arch.emulate_ctxt.exception = -1;
                vcpu->arch.emulate_ctxt.perm_ok = false;
@@ -5064,22 +5073,9 @@ int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int reason,
                    bool has_error_code, u32 error_code)
 {
        struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
-   int cs_db, cs_l, ret;
-   cache_all_regs(vcpu);
-
-   kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
+   int ret;
 
-   vcpu->arch.emulate_ctxt.vcpu = vcpu;
-   vcpu->arch.emulate_ctxt.eflags = kvm_x86_ops->get_rflags(vcpu);
-   vcpu->arch.emulate_ctxt.eip = kvm_rip_read(vcpu);
-   vcpu->arch.emulate_ctxt.mode =
-       (!is_protmode(vcpu)) ? X86EMUL_MODE_REAL :
-       (vcpu->arch.emulate_ctxt.eflags & X86_EFLAGS_VM)
-       ? X86EMUL_MODE_VM86 : cs_l
-       ? X86EMUL_MODE_PROT64 : cs_db
-       ? X86EMUL_MODE_PROT32 : X86EMUL_MODE_PROT16;
-   memset(c, 0, sizeof(struct decode_cache));
-   memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
+   init_emulate_ctxt(vcpu);
 
        ret = emulator_task_switch(&vcpu->arch.emulate_ctxt,
                                   tss_selector, reason, has_error_code,
-- 
1.7.0.4
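
The nested conditional in init_emulate_ctxt() is easy to misread, so
here is the same decision written as a plain if-chain. This is a
standalone sketch: enum emul_mode and pick_emul_mode() are illustrative
names, and the four inputs stand in for is_protmode(), the EFLAGS value
and the CS.L / CS.D bits that the kernel code queries.

```c
#include <stdbool.h>

enum emul_mode {
    MODE_REAL, MODE_VM86, MODE_PROT16, MODE_PROT32, MODE_PROT64
};

#define EFLAGS_VM (1ul << 17)   /* virtual-8086 mode flag */

/* Same precedence as the ?: chain: real, then VM86, then L, then D/B. */
static enum emul_mode pick_emul_mode(bool protmode, unsigned long eflags,
                                     bool cs_l, bool cs_db)
{
    if (!protmode)
        return MODE_REAL;
    if (eflags & EFLAGS_VM)
        return MODE_VM86;
    if (cs_l)
        return MODE_PROT64;
    if (cs_db)
        return MODE_PROT32;
    return MODE_PROT16;
}
```

Note that the order matters: with protected mode off, the remaining
inputs are never consulted.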



[RFC PATCH v3 3/4] x86: Add kvm_inject_realmode_interrupt() wrapper

2010-08-15 Thread Mohammed Gamal
This adds a wrapper function kvm_inject_realmode_interrupt() around the
emulator function emulate_int_real() to allow real mode interrupt injection.

Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
---
 arch/x86/kvm/x86.c |   21 +
 arch/x86/kvm/x86.h |1 +
 2 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f24e594..59b708c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3958,6 +3958,27 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
        memcpy(c->regs, vcpu->arch.regs, sizeof c->regs);
 }
 
+int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq)
+{
+   struct decode_cache *c = &vcpu->arch.emulate_ctxt.decode;
+   int ret;
+
+   init_emulate_ctxt(vcpu);
+
+   ret = emulate_int_real(&vcpu->arch.emulate_ctxt, &emulate_ops, irq);
+
+   if (ret != X86EMUL_CONTINUE)
+       return EMULATE_FAIL;
+
+   vcpu->arch.emulate_ctxt.eip = c->eip;
+   memcpy(vcpu->arch.regs, c->regs, sizeof c->regs);
+   kvm_rip_write(vcpu, vcpu->arch.emulate_ctxt.eip);
+   kvm_x86_ops->set_rflags(vcpu, vcpu->arch.emulate_ctxt.eflags);
+
+   return EMULATE_DONE;
+}
+EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);
+
 static int handle_emulation_failure(struct kvm_vcpu *vcpu)
 {
        ++vcpu->stat.insn_emulation_fail;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index b7a4047..8b83da5 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -67,5 +67,6 @@ static inline int is_paging(struct kvm_vcpu *vcpu)
 
 void kvm_before_handle_nmi(struct kvm_vcpu *vcpu);
 void kvm_after_handle_nmi(struct kvm_vcpu *vcpu);
+int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq);
 
 #endif
-- 
1.7.0.4



[RFC PATCH v3 4/4] VMX: Emulated real mode interrupt injection

2010-08-15 Thread Mohammed Gamal
Signed-off-by: Mohammed Gamal m.gamal...@gmail.com
---
 arch/x86/kvm/vmx.c |   65 ---
 1 files changed, 6 insertions(+), 59 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 652d317..0f9e3e4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -155,11 +155,6 @@ struct vcpu_vmx {
u32 limit;
u32 ar;
} tr, es, ds, fs, gs;
-   struct {
-   bool pending;
-   u8 vector;
-   unsigned rip;
-   } irq;
} rmode;
int vpid;
bool emulation_required;
@@ -1048,16 +1043,8 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
        }
 
        if (vmx->rmode.vm86_active) {
-       vmx->rmode.irq.pending = true;
-       vmx->rmode.irq.vector = nr;
-       vmx->rmode.irq.rip = kvm_rip_read(vcpu);
-       if (kvm_exception_is_soft(nr))
-           vmx->rmode.irq.rip +=
-               vmx->vcpu.arch.event_exit_inst_len;
-       intr_info |= INTR_TYPE_SOFT_INTR;
-       vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
-       vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1);
-       kvm_rip_write(vcpu, vmx->rmode.irq.rip - 1);
+       if (kvm_inject_realmode_interrupt(vcpu, nr) != EMULATE_DONE)
+           kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
                return;
        }
 
@@ -2838,16 +2825,8 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
 
        ++vcpu->stat.irq_injections;
        if (vmx->rmode.vm86_active) {
-       vmx->rmode.irq.pending = true;
-       vmx->rmode.irq.vector = irq;
-       vmx->rmode.irq.rip = kvm_rip_read(vcpu);
-       if (vcpu->arch.interrupt.soft)
-           vmx->rmode.irq.rip +=
-               vmx->vcpu.arch.event_exit_inst_len;
-       vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-            irq | INTR_TYPE_SOFT_INTR | INTR_INFO_VALID_MASK);
-       vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1);
-       kvm_rip_write(vcpu, vmx->rmode.irq.rip - 1);
+       if (kvm_inject_realmode_interrupt(vcpu, irq) != EMULATE_DONE)
+           kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
                return;
        }
intr = irq | INTR_INFO_VALID_MASK;
@@ -2879,14 +2858,8 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
 
        ++vcpu->stat.nmi_injections;
        if (vmx->rmode.vm86_active) {
-       vmx->rmode.irq.pending = true;
-       vmx->rmode.irq.vector = NMI_VECTOR;
-       vmx->rmode.irq.rip = kvm_rip_read(vcpu);
-       vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
-            NMI_VECTOR | INTR_TYPE_SOFT_INTR |
-            INTR_INFO_VALID_MASK);
-       vmcs_write32(VM_ENTRY_INSTRUCTION_LEN, 1);
-       kvm_rip_write(vcpu, vmx->rmode.irq.rip - 1);
+       if (kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR) != EMULATE_DONE)
+           kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
                return;
        }
vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
@@ -3848,29 +3821,6 @@ static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
                        ktime_to_ns(ktime_sub(ktime_get(), vmx->entry_time));
 }
 
-/*
- * Failure to inject an interrupt should give us the information
- * in IDT_VECTORING_INFO_FIELD.  However, if the failure occurs
- * when fetching the interrupt redirection bitmap in the real-mode
- * tss, this doesn't happen.  So we do it ourselves.
- */
-static void fixup_rmode_irq(struct vcpu_vmx *vmx, u32 *idt_vectoring_info)
-{
-   vmx->rmode.irq.pending = 0;
-   if (kvm_rip_read(&vmx->vcpu) + 1 != vmx->rmode.irq.rip)
-       return;
-   kvm_rip_write(&vmx->vcpu, vmx->rmode.irq.rip);
-   if (*idt_vectoring_info & VECTORING_INFO_VALID_MASK) {
-       *idt_vectoring_info &= ~VECTORING_INFO_TYPE_MASK;
-       *idt_vectoring_info |= INTR_TYPE_EXT_INTR;
-       return;
-   }
-   *idt_vectoring_info =
-       VECTORING_INFO_VALID_MASK
-       | INTR_TYPE_EXT_INTR
-       | vmx->rmode.irq.vector;
-}
-
 static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
                                       u32 idt_vectoring_info,
                                       int instr_len_field,
@@ -3880,9 +3830,6 @@ static void __vmx_complete_interrupts(struct vcpu_vmx *vmx,
        int type;
        bool idtv_info_valid;
 
-   if (vmx->rmode.irq.pending)
-       fixup_rmode_irq(vmx, &idt_vectoring_info);
-
        idtv_info_valid = idt_vectoring_info & VECTORING_INFO_VALID_MASK;
 
vmx-vcpu.arch.nmi_injected = false;
-- 
1.7.0.4


Re: [RFC PATCH v3 0/4] Real mode interrupt injection

2010-08-15 Thread Mohammed Gamal
On Mon, Aug 16, 2010 at 12:46 AM, Mohammed Gamal m.gamal...@gmail.com wrote:
 This patch introduces real mode interrupt injection for VMX.
 It currently invokes the x86 emulator to emulate interrupts
 instead of manually setting VMX controls.

 Needless to say, this is not meant for merging in its current state.
 The emulator still needs some more work to get this completely operational.

 Mohammed Gamal (4):
  x86 emulator: Expose emulate_int_real()
  x86: Separate emulation context initialization in a separate function
  x86: Add kvm_inject_realmode_interrupt() wrapper
  VMX: Emulated real mode interrupt injection

  arch/x86/include/asm/kvm_emulate.h |    3 +-
  arch/x86/kvm/vmx.c                 |   65 +++
  arch/x86/kvm/x86.c                 |   75 
 ++--
  arch/x86/kvm/x86.h                 |    1 +
  4 files changed, 55 insertions(+), 89 deletions(-)
 ---
 Changes since v2:
 - Refactored emulation context initialization code
 - Commit eip value from the decode cache to the emulation context in x86.c 
 rather than the emulator
 - Add kvm_* prefix to inject_realmode_interrupt() global symbol for 
 consistency


Here is a full trace of a MINIX guest since bootup. Looks like we get
stuck somewhere in the BIOS.
https://docs.google.com/leaf?id=0B9UodZT1IuENMzJhNWQxM2YtYzE3YS00YWY4LTk2YTgtZWY3ODNhMWUxMDkx&sort=name&layout=list&num=50


Re: KSM with Debian GNU/Linux

2010-08-15 Thread Daniel Bareiro
Hi, all!

On Thursday, 12 August 2010 22:05:34 -0300,
Daniel Bareiro wrote:

 Keeping the kernel I had compiled and installing the qemu-kvm package
 in Backports, now KSM is working:
 
 # cat /sys/kernel/mm/ksm/pages_sharing
 181406

Looking at the statistics of the values obtained running 15 virtual
machines totaling 10.7 GB on a 4 GB VMHost, I get the following, which
is a very interesting memory savings:

# for ii in /sys/kernel/mm/ksm/* ; do echo -n "$ii: " ; cat $ii ; done
/sys/kernel/mm/ksm/full_scans: 4114
/sys/kernel/mm/ksm/max_kernel_pages: 253500
/sys/kernel/mm/ksm/pages_shared: 67064
/sys/kernel/mm/ksm/pages_sharing: 510990
/sys/kernel/mm/ksm/pages_to_scan: 100
/sys/kernel/mm/ksm/pages_unshared: 448079
/sys/kernel/mm/ksm/pages_volatile: 13595
/sys/kernel/mm/ksm/run: 1
/sys/kernel/mm/ksm/sleep_millisecs: 20

# free
             total       used       free     shared    buffers     cached
Mem:       4056468    2578728    1477740          0       3736      62156
-/+ buffers/cache:    2512836    1543632
Swap:       497848      25972     471876


Are there any recommendations for tuning KSM?

I'm not very clear about the difference between pages_shared and
pages_sharing. Could somebody clarify it?

Thanks for your reply.

Regards,
Daniel
-- 
Fingerprint: BFB3 08D6 B4D1 31B2 72B9  29CE 6696 BF1B 14E6 1D37
Powered by Debian GNU/Linux Lenny - Linux user #188.598
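
On the pages_shared / pages_sharing question: per the kernel's KSM
documentation, pages_shared counts the shared pages actually in use,
while pages_sharing counts how many more sites are sharing them, which
is roughly how much has been saved. A small sketch of the arithmetic
with the numbers above, assuming 4 KiB pages; the function names are
illustrative only.

```c
#include <stdint.h>

/* Approximate memory saved by KSM, in bytes. */
static uint64_t ksm_saved_bytes(uint64_t pages_sharing, uint64_t page_size)
{
    return pages_sharing * page_size;
}

/* Extra mappings per shared page; a high value indicates good sharing. */
static double ksm_sharing_ratio(uint64_t pages_shared, uint64_t pages_sharing)
{
    return pages_shared ? (double)pages_sharing / (double)pages_shared : 0.0;
}
```

With pages_sharing = 510990 and pages_shared = 67064 this gives
2093015040 bytes (about 1996 MiB) saved and a ratio of roughly 7.6.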




RE: [qemu-kvm] build fail on i386 RHEL5u4

2010-08-15 Thread Hao, Xudong
Avi Kivity wrote:
   On 08/11/2010 04:49 AM, Hao, Xudong wrote:
 Hi,
 Recently I built qemu-kvm on 32-bit RHEL5u4/RHEL5u5, and it fails in
 function vhost_dev_sync_region. A RHEL5u1 system builds fine,
 though. Has anyone hit a similar issue?
 
 qemu-kvm commit: 59d71ddb432db04b57ee2658ce50a3e35d7db97e
 
 build error:
 ...
   CC    x86_64-softmmu/i8254.o
   CC    x86_64-softmmu/i8254-kvm.o
   CC    x86_64-softmmu/device-assignment.o
   LINK  x86_64-softmmu/qemu-system-x86_64
 vhost.o: In function `vhost_dev_sync_region':
 /home/source/qemu-kvm/hw/vhost.c:47: undefined reference to
 `__sync_fetch_and_and_4' 
 collect2: ld returned 1 exit status
 make[1]: *** [qemu-system-x86_64] Error 1
 make: *** [subdir-x86_64-softmmu] Error 2
 
 
 Appears to be a gcc bug.  I opened
 https://bugzilla.redhat.com/show_bug.cgi?id=624279 to track this.
 
 Meanwhile, installing the gcc44 package and building with it
 (./configure --cc=gcc44) appears to work.

Avi,
Gcc44 works for me.
I saw that Jakub closed the bug, noting that only i486 and later support
this, but RHEL5 uses -march=i386. So is there an ongoing fix for qemu-kvm?

Thanks,
Xudong
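
For reference, a minimal sketch of the kind of call that triggers the
link error above. The compare-and-exchange instruction first appeared on
the i486, so when GCC targets plain i386 it may not expand the builtin
inline and can leave a call to __sync_fetch_and_and_4 for a library to
provide. clear_bits_atomic() is an illustrative name, not code from
vhost.c.

```c
#include <stdint.h>

/*
 * Atomically AND *word with keep_mask and return the previous value.
 * On targets with a native atomic instruction this is expanded inline.
 */
static uint32_t clear_bits_atomic(uint32_t *word, uint32_t keep_mask)
{
    return __sync_fetch_and_and(word, keep_mask);
}
```

Building with a newer -march setting (i486 or later), or with a compiler
whose default target already has the instruction, is the usual way to
get the inline expansion.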


Re: [PATCH v6 3/3] KVM: MMU: prefetch ptes when intercepted guest #PF

2010-08-15 Thread Xiao Guangrong
Hi Marcelo,

Thanks for your review, and sorry for the delayed reply.

Marcelo Tosatti wrote:

 +static struct kvm_memory_slot *
 +pte_prefetch_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn, bool no_dirty_log)
 +{
 +struct kvm_memory_slot *slot;
 +
 +slot = gfn_to_memslot(vcpu->kvm, gfn);
 +if (!slot || slot->flags & KVM_MEMSLOT_INVALID ||
 +  (no_dirty_log && slot->dirty_bitmap))
 +slot = NULL;
 
 Why is this no_dirty_log optimization worthwhile?
 

We disable prefetching of writable pages because 'pte prefetch' would hurt
the slot's dirty page tracking: it would set the dirty_bitmap bit even though
the corresponding page is not really accessed.

 +
 +return slot;
 +}
 +
 +static pfn_t pte_prefetch_gfn_to_pfn(struct kvm_vcpu *vcpu, gfn_t gfn,
 + bool no_dirty_log)
 +{
 +struct kvm_memory_slot *slot;
 +unsigned long hva;
 +
 +slot = pte_prefetch_gfn_to_memslot(vcpu, gfn, no_dirty_log);
 +if (!slot) {
 +get_page(bad_page);
 +return page_to_pfn(bad_page);
 +}
 +
 +hva = gfn_to_hva_memslot(slot, gfn);
 +
 +return hva_to_pfn_atomic(vcpu->kvm, hva);
 +}
 +
 +static int direct_pte_prefetch_many(struct kvm_vcpu *vcpu,
 +struct kvm_mmu_page *sp,
 +u64 *start, u64 *end)
 +{
 +struct page *pages[PTE_PREFETCH_NUM];
 +struct kvm_memory_slot *slot;
 +unsigned hva, access = sp->role.access;
 +int i, ret, npages = end - start;
 +gfn_t gfn;
 +
 +gfn = kvm_mmu_page_get_gfn(sp, start - sp->spt);
 +slot = pte_prefetch_gfn_to_memslot(vcpu, gfn, access & ACC_WRITE_MASK);
 +if (!slot || slot->npages - (gfn - slot->base_gfn) != npages)
 +return -1;
 +
 +hva = gfn_to_hva_memslot(slot, gfn);
 +ret = __get_user_pages_fast(hva, npages, 1, pages);
 +if (ret <= 0)
 +return -1;
 
 Better do one at a time with hva_to_pfn_atomic. Or, if you measure that
 its worthwhile, do on a separate patch (using a helper as discussed
 previously).
 

Since 'prefetch' has to be disabled for writable pages, I did not put these
operations into a common function defined in kvm_main.c.

Maybe it would be better to do this in a wrapper function named
pte_prefetch_gfn_to_pages()?

 @@ -302,14 +303,87 @@ static void FNAME(update_pte)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
  static bool FNAME(gpte_changed)(struct kvm_vcpu *vcpu,
  struct guest_walker *gw, int level)
  {
 -int r;
  pt_element_t curr_pte;
 -
 -r = kvm_read_guest_atomic(vcpu->kvm, gw->pte_gpa[level - 1],
 +gpa_t base_gpa, pte_gpa = gw->pte_gpa[level - 1];
 +u64 mask;
 +int r, index;
 +
 +if (level == PT_PAGE_TABLE_LEVEL) {
 +mask = PTE_PREFETCH_NUM * sizeof(pt_element_t) - 1;
 +base_gpa = pte_gpa & ~mask;
 +index = (pte_gpa - base_gpa) / sizeof(pt_element_t);
 +
 +r = kvm_read_guest_atomic(vcpu->kvm, base_gpa,
 +gw->prefetch_ptes, sizeof(gw->prefetch_ptes));
 +curr_pte = gw->prefetch_ptes[index];
 
 This can slowdown a single non-prefetchable pte fault. Maybe its
 irrelevant, but please have kvm_read_guest_atomic in the first patch and
 then later optimize, its easier to review and bisectable.
 

OK, i'll separate it.
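
To make the mask/base/index arithmetic in the gpte_changed() hunk
concrete, here is a standalone sketch. It assumes a prefetch window of 8
entries and 8-byte PTEs; both values, and the names PREFETCH_NUM,
prefetch_base_gpa() and prefetch_index(), are assumptions made for the
example.

```c
#include <stdint.h>

#define PREFETCH_NUM 8            /* assumed window size, in PTEs */
typedef uint64_t pt_element_t;    /* assumed 8-byte guest PTE */

/* Start of the naturally aligned window of PTEs containing pte_gpa. */
static uint64_t prefetch_base_gpa(uint64_t pte_gpa)
{
    uint64_t mask = PREFETCH_NUM * sizeof(pt_element_t) - 1;

    return pte_gpa & ~mask;
}

/* Position of the faulting PTE inside that window. */
static unsigned int prefetch_index(uint64_t pte_gpa)
{
    return (unsigned int)((pte_gpa - prefetch_base_gpa(pte_gpa))
                          / sizeof(pt_element_t));
}
```

For a PTE at guest physical address 0x1238 the window starts at 0x1200
and the PTE is entry 7, so one 64-byte read covers the faulting entry
and its seven neighbours.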



Re: [PATCH 0/7] AMD IOMMU emulation patches v3

2010-08-15 Thread Anthony Liguori

On 08/15/2010 02:27 PM, Eduard - Gabriel Munteanu wrote:

Hi,

Please have a look at these and merge if you wish. I hope I've addressed the
issues people have raised.
   


It's looking pretty good so far.  I'm very happy with the modifications 
to the PCI layer.


It looks like given the helpers that you've added, converting the PCI 
devices is more or less programmatic.  IOW, it just requires an 
appropriate sed.


I'd rather see an all-at-once conversion of the PCI devices than just 
convert over a couple functions.  In fact, we can go a step further 
after that and start poisoning symbols to prevent the wrong interfaces 
from being used.
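
A minimal sketch of the symbol-poisoning idea, using GCC's
"#pragma GCC poison". The function names are hypothetical stand-ins for
a raw accessor and the wrapper that should replace it; this is not code
from the series.

```c
#include <stddef.h>
#include <string.h>

static unsigned char fake_ram[64];   /* stand-in for guest memory */

/* Hypothetical raw accessor that devices should stop calling directly. */
static void raw_phys_read(size_t addr, void *buf, size_t len)
{
    memcpy(buf, &fake_ram[addr], len);
}

/* Hypothetical wrapper; defined before the poison, so it still compiles. */
static void checked_dev_read(size_t addr, void *buf, size_t len)
{
    raw_phys_read(addr, buf, len);
}

/* From here on, any mention of raw_phys_read is a compile-time error. */
#pragma GCC poison raw_phys_read
```

Any device code placed after the pragma can only reach memory through
checked_dev_read(), which is the property being asked for.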


Regards,

Anthony Liguori


Some changes from the previous RFC:
- included and updated the other two device patches
- moved map registration and invalidation management into PCI code
- AMD IOMMU emulation is always enabled (no more configure options)
- cleaned up code, I now use typedefs as suggested
- event logging cleanups

BTW, the change to pci_regs.h is properly aligned but the original file contains
tabs.


 Cheers,
 Eduard

Eduard - Gabriel Munteanu (7):
   pci: add range_covers_range()
   pci: memory access API and IOMMU support
   AMD IOMMU emulation
   ide: use the PCI memory access interface
   rtl8139: use the PCI memory access interface
   eepro100: use the PCI memory access interface
   ac97: use the PCI memory access interface

  Makefile.target   |2 +
  dma-helpers.c |   46 -
  dma.h |   21 ++-
  hw/ac97.c |6 +-
  hw/amd_iommu.c|  688 +
  hw/eepro100.c |   78 ---
  hw/ide/core.c |   15 +-
  hw/ide/internal.h |   39 +++
  hw/ide/pci.c  |7 +
  hw/pc.c   |2 +
  hw/pci.c  |  197 +++-
  hw/pci.h  |   84 +++
  hw/pci_ids.h  |2 +
  hw/pci_regs.h |1 +
  hw/rtl8139.c  |   99 +
  qemu-common.h |1 +
  16 files changed, 1191 insertions(+), 97 deletions(-)
  create mode 100644 hw/amd_iommu.c

   




Re: [PATCH 1/2] KVM: x86 emulator: put register operand write back to a function

2010-08-15 Thread Wei Yongjun

  On 08/12/2010 04:38 PM, Wei Yongjun wrote:
   
 Introduce function write_register_operand() to write back the
 register operand.


  
 +static void write_register_operand(struct operand *op, unsigned long val,
 +   unsigned int bytes)
 +{
 +/* The 4-byte case *is* correct: in 64-bit mode we zero-extend. */
 +switch (bytes) {
 +case 1:
 +*(u8 *)op->addr.reg = (u8)val;
 +break;
 +case 2:
 +*(u16 *)op->addr.reg = (u16)val;
 +break;
 +case 4:
 +*op->addr.reg = (u32)val;
 +break;  /* 64b: zero-extend */
 +case 8:
 +*op->addr.reg = val;
 +break;
 +}
 +}
 
 It's cleaner to take val and bytes from struct operand, and do the
 assignment from the callers, no?
   

Taking val and bytes from struct operand may raise another issue: when we
write back the source register, we need to do the assignment in the caller
and then restore val before writing the src val to the dst val. For
example, xadd:
c->src.val = c->dst.val;
write_register_operand(&c->src);
c->src.val = c->src.orig_val;
goto add;
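
As a side note on the 4-byte case in write_register_operand(): the
sketch below shows the x86-64 register write-back rules on a plain
64-bit value, using masks rather than pointer casts so that it runs
anywhere. write_reg() is an illustrative helper, not the emulator's
function.

```c
#include <stdint.h>

/*
 * Return the new 64-bit register value after writing the low `bytes`
 * bytes of val. 1- and 2-byte writes keep the upper bits; a 4-byte
 * write zero-extends into the upper half, as in 64-bit mode.
 */
static uint64_t write_reg(uint64_t old, uint64_t val, unsigned int bytes)
{
    switch (bytes) {
    case 1:
        return (old & ~0xffULL) | (val & 0xffULL);
    case 2:
        return (old & ~0xffffULL) | (val & 0xffffULL);
    case 4:
        return val & 0xffffffffULL;   /* upper 32 bits cleared */
    case 8:
        return val;
    default:
        return old;                   /* unsupported size: unchanged */
    }
}
```

Starting from 0x1122334455667788, a 2-byte write of 0xABCD gives
0x112233445566ABCD, while a 4-byte write of 0xDEADBEEF gives
0x00000000DEADBEEF.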






Re: [PATCH] kvm: fix poison overwritten caused by using wrong xstate size

2010-08-15 Thread H. Peter Anvin
Feel free to add my ack.

Avi Kivity a...@redhat.com wrote:

  On 08/14/2010 12:03 AM, H. Peter Anvin wrote:
 Avi, do you want to take this one or should I?

I will, thanks.

-- 
error compiling committee.c: too many arguments to function


-- 
Sent from my mobile phone.  Please pardon any lack of formatting.