Re: [kvm-devel] pinning, tsc and apic

2008-05-15 Thread Chris Wright
* Anthony Liguori ([EMAIL PROTECTED]) wrote:
  From a quick look, I suspect that the number of wildly off TSC 
 calibrations corresponds to the VMs that are misbehaving.  I think this 
 may mean that we have to re-examine the tsc delta computation.
 
 10_serial.log:time.c: Detected 1995.038 MHz processor.
 11_serial.log:time.c: Detected 2363.195 MHz processor.
 12_serial.log:time.c: Detected 2492.675 MHz processor.
 13_serial.log:time.c: Detected 1995.061 MHz processor.
 14_serial.log:time.c: Detected 1994.917 MHz processor.
 15_serial.log:time.c: Detected 4100.735 MHz processor.
 16_serial.log:time.c: Detected 2075.800 MHz processor.
 17_serial.log:time.c: Detected 2674.350 MHz processor.
 18_serial.log:time.c: Detected 1995.002 MHz processor.
 19_serial.log:time.c: Detected 1994.978 MHz processor.
 1_serial.log:time.c: Detected 4384.310 MHz processor.

Is this with pinning?  We at least know we're losing small bits on
migration.  From my measurements it's ~3000 (outliers are 10-20k).

Also, what happens if you roll back to kvm-userspace 7f5c4d15ece5?

I'm using this:

diff -up arch/x86/kvm/svm.c~svm arch/x86/kvm/svm.c
--- arch/x86/kvm/svm.c~svm  2008-04-16 19:49:44.0 -0700
+++ arch/x86/kvm/svm.c  2008-05-14 23:44:18.0 -0700
@@ -621,6 +621,13 @@ static void svm_free_vcpu(struct kvm_vcp
         kmem_cache_free(kvm_vcpu_cache, svm);
 }
 
+static void svm_tsc_update(void *arg)
+{
+        struct vcpu_svm *svm = arg;
+        rdtscll(svm->vcpu.arch.host_tsc);
+
+}
+
 static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
         struct vcpu_svm *svm = to_svm(vcpu);
@@ -633,6 +640,9 @@ static void svm_vcpu_load(struct kvm_vcp
                  * Make sure that the guest sees a monotonically
                  * increasing TSC.
                  */
+                if (vcpu->cpu != -1)
+                        smp_call_function_single(vcpu->cpu, svm_tsc_update,
+                                                 svm, 0, 1);
                 rdtscll(tsc_this);
                 delta = vcpu->arch.host_tsc - tsc_this;
                 svm->vmcb->control.tsc_offset += delta;
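
The offset arithmetic in the hunk above can be modelled in user space. This is only a minimal sketch, not kernel code: `struct vcpu_tsc`, `guest_tsc()` and `vcpu_migrate()` are hypothetical names, and TSC readings are passed in as plain numbers. It illustrates why a stale `host_tsc` sample would make the guest-visible TSC step backwards, which is presumably what sampling on the old CPU via `smp_call_function_single()` is meant to avoid.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space model of the per-vCPU TSC state (not the kernel struct). */
struct vcpu_tsc {
    uint64_t host_tsc;   /* last TSC value sampled on the CPU the vCPU ran on */
    uint64_t tsc_offset; /* added to the host TSC to form the guest TSC (wraps mod 2^64) */
};

/* Guest-visible TSC for a given host TSC reading. */
static uint64_t guest_tsc(const struct vcpu_tsc *v, uint64_t host_now)
{
    assert(v);
    return host_now + v->tsc_offset;
}

/*
 * Mirror of the load-time adjustment: delta = host_tsc - tsc_this,
 * where tsc_this is the TSC of the CPU the vCPU is moving to.
 */
static void vcpu_migrate(struct vcpu_tsc *v, uint64_t tsc_this)
{
    assert(v);
    uint64_t delta = v->host_tsc - tsc_this; /* unsigned wraparound is intended */
    v->tsc_offset += delta;
    v->host_tsc = tsc_this;
}
```

With `host_tsc` sampled at 1000 on the old CPU, an offset of 50 and the new CPU's TSC at 400, the guest TSC stays at 1050 across the move. If `host_tsc` had last been sampled at 900, the guest would read 950 after the move, so time would appear to go backwards by the age of the sample.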


-
This SF.net email is sponsored by: Microsoft 
Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-2

2008-05-15 Thread Dor Laor

On Wed, 2008-05-14 at 23:09 +0200, Tomasz Chmielewski wrote:
 Anthony Liguori schrieb:
 
 (...)
 
  So, a PV network driver can do about 700Mb/s, and an emulated NIC can 
  do about 600 Mb/s, Windows guest to host?
 
  That would be about 20% improvement?
  
  
  FWIW, virtio-net is much better with my patches applied.  The difference 
  between the e1000 and virtio-net is that e1000 consumes almost twice as 
  much CPU as virtio-net so in my testing, the performance improvement 
  with virtio-net is about 2x.  We were losing about 20-30% throughput 
  because of the delays in handling incoming packets.
 
 Do you by chance have any recent numbers on disk performance (i.e., Windows 
 guest vs Linux host)?
 
 

At the moment there is no PV block driver for Windows guests (there is
one for Linux).
You can use SCSI for Windows; it should perform well.




Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-15 Thread Avi Kivity
Marcelo Tosatti wrote:
 1) add is storing the result in the wrong register

 6486:   66 64 89 3e 72 01       mov    %edi,%fs:0x172
 648c:   66 be 8d 03 00 00       mov    $0x38d,%esi
 6492:   66 c1 e6 04             shl    $0x4,%esi
 6496:   66 b8 98 0a 00 00       mov    $0xa98,%eax
 649c:   66 03 f0                add    %eax,%esi

 The destination for the add is %esi, but the emulation stores the 
 result in eax, because:

 if ((c->d & ModRM) && c->modrm_mod == 3) {
         u8 reg;
         c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
         c->dst.ptr = decode_register(c->modrm_rm, c->regs,
                                      c->d & ByteOp);
 }

 modrm_reg contains 6, which is the correct register index, but
 modrm_rm contains 0, so the result is stored in eax (see hack).
   

What version are you looking at?  Current code doesn't have exactly this.

But register-in-modrm decoding is a mess, yes.  I think the best thing 
is to have decode_modrm() accept a struct operand parameter and decode 
into that.
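
For readers following along, the ModRM byte in the quoted `add` (`f0`) splits into its three fields as sketched below. This is an illustrative stand-alone helper, not the emulator's code; `struct modrm_fields`, `split_modrm()`, `add_03_dst_reg()` and `reg32_name()` are hypothetical names. It relies only on the standard x86 encoding: mod in bits 7-6, reg in bits 5-3, rm in bits 2-0, and for opcode `03` (ADD r, r/m) the reg field names the destination.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an x86 ModRM byte: mod = bits 7-6, reg = bits 5-3, rm = bits 2-0. */
struct modrm_fields {
    uint8_t mod;
    uint8_t reg;
    uint8_t rm;
};

static struct modrm_fields split_modrm(uint8_t modrm)
{
    struct modrm_fields f;
    f.mod = (modrm >> 6) & 0x3;
    f.reg = (modrm >> 3) & 0x7;
    f.rm  = modrm & 0x7;
    return f;
}

/* For opcode 0x03 (ADD r, r/m) the destination operand is the reg field. */
static int add_03_dst_reg(uint8_t modrm)
{
    return split_modrm(modrm).reg;
}

/* 32-bit general-purpose register names in encoding order. */
static const char *reg32_name(unsigned idx)
{
    static const char *const names[8] = {
        "eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
    };
    assert(idx < 8);
    return names[idx];
}
```

For `0xf0` this gives mod = 3, reg = 6 (`esi`) and rm = 0 (`eax`), matching the report: decoding the destination from `modrm_rm` picks `eax` where `esi` is wanted.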

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] can't boot 2.6.26-rcX

2008-05-15 Thread Avi Kivity
Bernd Schubert wrote:
 Hello,

 there is a problem booting 2.6.26-rcX (X=1,2). It stops booting at

 Calibrating delay using timer specific routine.. 4016.92 BogoMIPS
 (lpj=8033846)

 The kvm process then takes 100% of my host CPU.

 This is with kvm-67 on an AM64-X2-

 I'm not yet familiar with kvm and debugging. Will a sysrq+t trace of the
 host show something useful? Or will only full git-bisect help?
   

Do you have CONFIG_KVM_GUEST or CONFIG_KVM_CLOCK in your config?  If so, 
this may be a paravirt problem.  Try turning them off and let us know.
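
As a side note on the quoted log line, the BogoMIPS figure is derived from `lpj` (loops per jiffy) and the tick rate `HZ`. The sketch below reproduces that arithmetic with hypothetical helper names; `HZ=250` is an assumption (the guest's config is not given in the thread), chosen because it reproduces the quoted value.

```c
#include <assert.h>

/* Integer part of the printed BogoMIPS value: lpj / (500000 / HZ). */
static unsigned long bogomips_whole(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0 && hz <= 5000);
    return lpj / (500000 / hz);
}

/* Two fractional digits: (lpj / (5000 / HZ)) % 100. */
static unsigned long bogomips_frac(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0 && hz <= 5000);
    return (lpj / (5000 / hz)) % 100;
}
```

With `lpj=8033846` and `hz=250` the two helpers return 4016 and 92, i.e. the "4016.92 BogoMIPS" in the report, so the calibration itself completed with a plausible result before the hang.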

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] [RFC] Reworking KVM_DEBUG_GUEST

2008-05-15 Thread Avi Kivity
Jan Kiszka wrote:
 Hi,

 before going wild with my idea, I would like to collect some comments on
 this approach:

 While doing first kernel debugging with my debug register patches for
 kvm, I quickly ran into the 4-breakpoints-only limitation that comes
 from the fact that we blindly map software to hardware breakpoints.
 Unwieldy and simply suboptimal. Also, having 4 breakpoint slots hard-coded
 in the generic interface is not fair to archs that may support more.
 Moreover, we do not support watchpoints although this would easily be
 feasible. But if we supported watchpoints (via debug registers on x86),
 we would need to break out of the 4-slot limitation even earlier. In
 short, I came to the conclusion that a rewrite of the KVM_DEBUG_GUEST
 interface is required.
   

The current interface is limited, yes.


 Why do we set breakpoints in the kernel? Why not simply catch all
 debug traps, inserting software breakpoint ops into the guest code, and
 handling all this stuff as normal debuggers do? And the hardware
 breakpoints should just be pushed through the kernel interface like
 ptrace does.
   

The problem is that the breakpoints are visible to the guest.  If the 
guest swaps a page, the breakpoint will be swapped with it.  If it 
reallocates a page to a different use it will overwrite the breakpoint.  
It's very brittle.

For Linux kernel debugging these issues don't show up in practice, but 
other kernels are able to swap their own memory.
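
The brittleness described above can be illustrated with a toy model. This is a sketch under stated assumptions, not KVM or debugger code: guest memory is a plain byte buffer, and `struct sw_breakpoint`, `bp_insert()` and `bp_still_present()` are hypothetical names. The only hard fact used is that the x86 software breakpoint is the one-byte opcode `0xCC` (INT3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define INT3_OPCODE 0xCC /* x86 one-byte software breakpoint instruction */

/* Hypothetical debugger-side record of one software breakpoint. */
struct sw_breakpoint {
    size_t  offset;     /* offset into the guest memory buffer */
    uint8_t saved_byte; /* original instruction byte replaced by INT3 */
};

/* Patch INT3 into guest memory, remembering the byte it replaces. */
static struct sw_breakpoint bp_insert(uint8_t *guest_mem, size_t len, size_t offset)
{
    struct sw_breakpoint bp;
    assert(offset < len);
    bp.offset = offset;
    bp.saved_byte = guest_mem[offset];
    guest_mem[offset] = INT3_OPCODE;
    return bp;
}

/* The debugger can only notice a lost breakpoint by re-reading guest memory. */
static int bp_still_present(const uint8_t *guest_mem, const struct sw_breakpoint *bp)
{
    return guest_mem[bp->offset] == INT3_OPCODE;
}
```

If the guest reuses the page (modelled by simply overwriting the buffer), `bp_still_present()` turns false and the debugger's `saved_byte` no longer describes what is in memory, which is the failure mode being pointed out.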

 The new KVM_DEBUG_GUEST interface I currently have in mind would look
 like this:

 #define KVM_DBGGUEST_ENABLE   0x01
 #define KVM_DBGGUEST_SINGLESTEP   0x02

 struct kvm_debug_guest {
   __u32 control;
   

[pad]

   struct kvm_debug_guest_arch arch;
 }
   

The guest debug interface can probably be 100% arch specific.

 Setting KVM_DBGGUEST_ENABLE would forward all debug-related traps to
 userspace first, which can then decide to handle or re-inject them.
 KVM_DBGGUEST_SINGLESTEP would work as before. And the extension for x86
 would look like this:

 struct kvm_debug_guest_arch {
   __u32 use_hw_breakpoints;
   

[pad]

   __u64 debugreg[8];
 }

 If use_hw_breakpoints is non-zero, KVM would completely overwrite the
 guest's debug registers with the content of debugreg, giving full
 control of this feature to the host-side debugger (faking the content of
 debug registers, effectively disabling them for the guest - as we now do
 all the time).
   

There's much more that can be done. Branch stepping, last branch 
recording, etc.

 Questions:
 - Does anyone see traps and pitfalls in this approach?
   

It seems workable, modulo the non-transparency of the debugger.

 - May I replace the existing interface with this one, or am I overlooking
   some use case that already worked with the current code so that ABI
   compatibility is required (most debug stuff should have been simply
   broken so far, also due to bugs in userland)?
   

This will break compilation of older userspace, so a new interface is 
preferred, complete with KVM_CAP_...
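
The two "[pad]" remarks above can be made concrete with a layout sketch. This is not the interface that was eventually merged; it only restates the proposed structures with explicit padding. The field name `pad` is hypothetical, and `uint32_t`/`uint64_t` stand in for the kernel's `__u32`/`__u64` so the fragment compiles in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KVM_DBGGUEST_ENABLE     0x01
#define KVM_DBGGUEST_SINGLESTEP 0x02

/* x86 part of the proposal, with the padding made explicit. */
struct kvm_debug_guest_arch {
    uint32_t use_hw_breakpoints;
    uint32_t pad;          /* hypothetical name; keeps debugreg 8-byte aligned */
    uint64_t debugreg[8];
};

struct kvm_debug_guest {
    uint32_t control;      /* KVM_DBGGUEST_* flags */
    uint32_t pad;          /* hypothetical name; same purpose */
    struct kvm_debug_guest_arch arch;
};

/* With explicit padding the layout does not depend on the ABI's u64 alignment. */
static_assert(offsetof(struct kvm_debug_guest_arch, debugreg) == 8, "debugreg offset");
static_assert(sizeof(struct kvm_debug_guest_arch) == 72, "arch size");
static_assert(offsetof(struct kvm_debug_guest, arch) == 8, "arch offset");
static_assert(sizeof(struct kvm_debug_guest) == 80, "total size");
```

Without the explicit pad fields, a 32-bit userspace (where a 64-bit integer may only need 4-byte alignment inside a struct) and a 64-bit kernel could disagree about the offset of `debugreg`, which is the usual reason for padding ioctl structures by hand.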

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-15 Thread Nick Piggin
On Wed, May 14, 2008 at 06:26:25AM -0500, Robin Holt wrote:
 On Wed, May 14, 2008 at 06:11:22AM +0200, Nick Piggin wrote:
  
  I guess that you have found a way to perform TLB flushing within coherent
  domains over the numalink interconnect without sleeping. I'm sure it would
  be possible to send similar messages between non coherent domains.
 
 I assume by coherent domains, you are actually talking about system
 images.

Yes

  Our memory coherence domain on the 3700 family is 512 processors
 on 128 nodes.  On the 4700 family, it is 16,384 processors on 4096 nodes.
 We extend a Read-Exclusive mode beyond the coherence domain so any
 processor is able to read any cacheline on the system.  We also provide
 uncached access for certain types of memory beyond the coherence domain.

Yes, I understand the basics.

 
 For the other partitions, the exporting partition does not know what
 virtual address the imported pages are mapped.  The pages are frequently
 mapped in a different order by the MPI library to help with MPI collective
 operations.
 
 For the exporting side to do those TLB flushes, we would need to replicate
 all that importing information back to the exporting side.

Right. Or the exporting side could be passed tokens that it tracks itself,
rather than virtual addresses.

 
 Additionally, the hardware that does the TLB flushing is protected
 by a spinlock on each system image.  We would need to change that
 simple spinlock into a type of hardware lock that would work (on 3700)
 outside the processors coherence domain.  The only way to do that is to
 use uncached addresses with our Atomic Memory Operations which do the
 cmpxchg at the memory controller.  The uncached accesses are an order
 of magnitude or more slower.

I'm not sure if you're thinking about what I'm thinking of. With the
scheme I'm imagining, all you will need is some way to raise an IPI-like
interrupt on the target domain. The IPI target will have a driver to
handle the interrupt, which will determine the mm and virtual addresses
which are to be invalidated, and will then tear down those page tables
and issue hardware TLB flushes within its domain. On the Linux side,
I don't see why this can't be done.

 
  So yes, I'd much rather rework such highly specialized system to fit in
  closer with Linux than rework Linux to fit with these machines (and
  apparently slow everyone else down).
 
 But it isn't that we are having a problem adapting to just the hardware.
 One of the limiting factors is Linux on the other partition.

In what way is the Linux limiting? 


   Additionally, the call to zap_page_range expects to have the mmap_sem
   held.  I suppose we could use something other than zap_page_range and
   atomically clear the process page tables.
  
  zap_page_range does not expect to have mmap_sem held. I think for anon
  pages it is always called with mmap_sem, however try_to_unmap_anon is
  not (although it expects page lock to be held, I think we should be able
  to avoid that).
 
 zap_page_range calls unmap_vmas which walks to vma->next.  Are you saying
 that can be walked without grabbing the mmap_sem at least readably?

Oh, I get that confused because of the mixed up naming conventions
there: unmap_page_range should actually be called zap_page_range. But
at any rate, yes we can easily zap pagetables without holding mmap_sem.


 I feel my understanding of list management and locking completely
 shifting.

FWIW, mmap_sem isn't held to protect vma->next there anyway, because at
that point the vmas are detached from the mm's rbtree and linked list.
But sure, in that particular path it is held for other reasons.

 
Doing that will not alleviate
   the need to sleep for the messaging to the other partitions.
  
  No, but I'd venture to guess that is not impossible to implement even
  on your current hardware (maybe a firmware update is needed)?
 
 Are you suggesting the sending side would not need to sleep or the
 receiving side?  Assuming you meant the sender, it spins waiting for the
 remote side to acknowledge the invalidate request?  We place the data
 into a previously agreed upon buffer and send an interrupt.  At this
 point, we would need to start spinning and waiting for completion.
 Let's assume we never run out of buffer space.

How would you run out of buffer space if it is synchronous?

 
 The receiving side receives an interrupt.  The interrupt currently wakes
 an XPC thread to do the work of transferring and delivering the message
 to XPMEM.  The transfer of the data which XPC does uses the BTE engine
 which takes up to 28 seconds to timeout (hardware timeout before raising
 an error) and the BTE code automatically does a retry for certain
 types of failure.  We currently need to grab semaphores which _MAY_
 be able to be reworked into other types of locks.

Sure, you obviously would need to rework your code because it's been
written with the assumption that it can sleep.

What is XPMEM exactly anyway? I'd 

Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Add support for a configuration file

2008-05-15 Thread Avi Kivity
Daniel P. Berrange wrote:
 With this kind of syntax, now tools generating config files need to make
 up unique names for each drive. So you'll probably end up with them just
 naming things based on the class name + a number appended.
   

I would hope that tools don't have to resort to reading and writing 
these config files.  Usually a management system would prefer storing 
parameters in its own database, and writing a temporary config file just 
to pass the data seems awkward.  I would much prefer to see the command 
line and monitor retain full control over every configurable parameter.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-15 Thread Guillaume Thouvenin
On Thu, 15 May 2008 10:33:38 +0300
Avi Kivity [EMAIL PROTECTED] wrote:

 Marcelo Tosatti wrote:
  1) add is storing the result in the wrong register
 
  6486:   66 64 89 3e 72 01       mov    %edi,%fs:0x172
  648c:   66 be 8d 03 00 00       mov    $0x38d,%esi
  6492:   66 c1 e6 04             shl    $0x4,%esi
  6496:   66 b8 98 0a 00 00       mov    $0xa98,%eax
  649c:   66 03 f0                add    %eax,%esi
 
  The destination for the add is %esi, but the emulation stores the 
  result in eax, because:
 
  if ((c->d & ModRM) && c->modrm_mod == 3) {
          u8 reg;
          c->dst.bytes = (c->d & ByteOp) ? 1 : c->op_bytes;
          c->dst.ptr = decode_register(c->modrm_rm, c->regs,
                                       c->d & ByteOp);
  }
 
  modrm_reg contains 6, which is the correct register index, but
  modrm_rm contains 0, so the result is stored in eax (see hack).

 
 What version are you looking at?  Current code doesn't have exactly this.

It's in my patch. I added this because in gfxboot code there is an
instruction add %eax, %esp that needs to be emulated and with the
normal path, if I remember correctly, we have c->dst.bytes == 0 and thus
the emulate_2op_SrcV() function just does nothing.

Regards,
Guillaume



Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-2

2008-05-15 Thread Avi Kivity
Anthony Liguori wrote:
 FWIW, virtio-net is much better with my patches applied.

The can_receive patches?

Again, I'm not opposed to them in principle, I just think that if they 
help that this points at a virtio deficiency.  Virtio should never leave 
the rx queue empty.  Consider the case where the virtio queue isn't tied 
to a socket buffer, but directly to hardware.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.




Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-2

2008-05-15 Thread Tomasz Chmielewski
Dor Laor schrieb:

(...)

 FWIW, virtio-net is much better with my patches applied.  The difference 
 between the e1000 and virtio-net is that e1000 consumes almost twice as 
 much CPU as virtio-net so in my testing, the performance improvement 
 with virtio-net is about 2x.  We were losing about 20-30% throughput 
 because of the delays in handling incoming packets.
 Do you by chance have any recent numbers on disk performance (i.e., Windows 
 guest vs Linux host)?


 
 At the moment there is no PV block driver for Windows guests (there is
 one for Linux).
 You can use SCSI for Windows; it should perform well.

How well, when compared to bare metal? Or when compared to a Linux guest with 
a pv block driver? Do you have any numbers?


-- 
Tomasz Chmielewski
http://wpkg.org



Re: [kvm-devel] performance with guests running 2.4 kernels (specifically RHEL3)

2008-05-15 Thread Avi Kivity
David S. Ahern wrote:
 Avi Kivity wrote:
   
 Not so fast...  the patch updates the flood count to 5.  Can you check
 if a lower value still works?  Also, whether updating the flood count to
 5 (without the rest of the patch) works?

 Unconditionally bumping the flood count to 5 will likely cause a
 performance regression on other guests.
 

 I put the flood count back to 3, and the RHEL3 guest performance is even
 better.

   

Okay, I committed the patch without the flood count == 5.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] can't boot 2.6.26-rcX

2008-05-15 Thread Avi Kivity
Bernd Schubert wrote:
 On Thursday 15 May 2008 09:36:41 Avi Kivity wrote:
   
 Bernd Schubert wrote:
 
 Hello,

 there is a problem booting 2.6.26-rcX (X=1,2). It stops booting at

 Calibrating delay using timer specific routine.. 4016.92 BogoMIPS
 (lpj=8033846)

 The kvm process then takes 100% of my host CPU.

 This is with kvm-67 on an AM64-X2-

 I'm not yet familiar with kvm and debugging. Will a sysrq+t trace of the
 host show something useful? Or will only full git-bisect help?
   
 Do you have CONFIG_KVM_GUEST or CONFIG_KVM_CLOCK in your config?  If so,
 this may be a paravirt problem.  Try turning them off and let us know.
 

 Thanks, I had both options enabled. Disabling these makes the 
   

Can you check which one causes the trouble?  Most likely it's the clock.


-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [PATCH] Fix hw/acpi.c build w/ DEBUG enabled

2008-05-15 Thread Avi Kivity
Alex Williamson wrote:
 Trivial build warning/fixes when the local DEBUG define is enabled.

   

Applied, thanks.

-- 
error compiling committee.c: too many arguments to function




[kvm-devel] [PATCH] KVM: Enable NMI Watchdog by PIT source

2008-05-15 Thread Yang, Sheng
From b410060a395356eb4bca3ae31de7acb8c261b3f1 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Thu, 15 May 2008 18:23:27 +0800
Subject: [PATCH] KVM: Enable NMI Watchdog by PIT source

The NMI watchdog uses LINT0 of the LAPIC to deliver NMIs. It does not disable
the PIC after switching to the IOAPIC; instead it programs LVT0 of every LAPIC
as NMI and then delivers the PIT interrupt to LINT0. NMIs are therefore
generated at the same frequency as PIT interrupts.

This patch emulates that process and enables the NMI watchdog. Since current
KVM does not actually connect the PIC to the LAPIC, the patch bypasses the PIC
and sends the signal directly to the LAPIC.

Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/i8254.c |   16 
 arch/x86/kvm/irq.h   |1 +
 arch/x86/kvm/lapic.c |   32 
 3 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 6d6dc6c..7c6ea62 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -563,12 +563,28 @@ void kvm_free_pit(struct kvm *kvm)

 static void __inject_pit_timer_intr(struct kvm *kvm)
 {
+        int i;
+        struct kvm_vcpu *vcpu;
+
         mutex_lock(&kvm->lock);
         kvm_ioapic_set_irq(kvm->arch.vioapic, 0, 1);
         kvm_ioapic_set_irq(kvm->arch.vioapic, 0, 0);
         kvm_pic_set_irq(pic_irqchip(kvm), 0, 1);
         kvm_pic_set_irq(pic_irqchip(kvm), 0, 0);
         mutex_unlock(&kvm->lock);
+
+        /*
+         * For NMI watchdog in IOAPIC mode
+         * After IOAPIC enabled, NMI watchdog programmed LVT0 of lapic as NMI,
+         * then a timer interrupt through IOAPIC and a NMI through PIC to lapic
+         * would be delivered when PIT time up.
+         */
+        for (i = 0; i < KVM_MAX_VCPUS; ++i) {
+                vcpu = kvm->vcpus[i];
+                if (!vcpu)
+                        continue;
+                kvm_apic_local_deliver(vcpu, APIC_LVT0);
+        }
 }

 void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/irq.h b/arch/x86/kvm/irq.h
index 1802134..700 100644
--- a/arch/x86/kvm/irq.h
+++ b/arch/x86/kvm/irq.h
@@ -84,6 +84,7 @@ void kvm_timer_intr_post(struct kvm_vcpu *vcpu, int vec);
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
+int kvm_apic_local_deliver(struct kvm_vcpu *vcpu, int lvt_type);

 int pit_has_pending_timer(struct kvm_vcpu *vcpu);
 int apic_has_pending_timer(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 2273836..62f70a1 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -388,6 +388,14 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
                 }
                 break;

+        case APIC_DM_EXTINT:
+                /*
+                 * Should only be called by kvm_apic_local_deliver() with LVT0,
+                 * before NMI watchdog was enabled. Already handled by
+                 * kvm_apic_accept_pic_intr().
+                 */
+                break;
+
         default:
                 printk(KERN_ERR "TODO: unsupported delivery mode %x\n",
                        delivery_mode);
@@ -753,6 +761,9 @@ static void apic_mmio_write(struct kvm_io_device *this,
         case APIC_LVTTHMR:
         case APIC_LVTPC:
         case APIC_LVT0:
+                if (val == APIC_DM_NMI)
+                        apic_debug("Receive NMI setting on APIC_LVT0 "
+                                "for cpu %d\n", apic->vcpu->vcpu_id);
         case APIC_LVT1:
         case APIC_LVTERR:
                 /* TODO: Check vector */
@@ -968,12 +979,25 @@ int apic_has_pending_timer(struct kvm_vcpu *vcpu)
         return 0;
 }

-static int __inject_apic_timer_irq(struct kvm_lapic *apic)
+int kvm_apic_local_deliver(struct kvm_vcpu *vcpu, int lvt_type)
 {
-        int vector;
+        struct kvm_lapic *apic = vcpu->arch.apic;
+        int vector, mode, trig_mode;
+        u32 reg;
+
+        if (apic && apic_enabled(apic)) {
+                reg = apic_get_reg(apic, lvt_type);
+                vector = reg & APIC_VECTOR_MASK;
+                mode = reg & APIC_MODE_MASK;
+                trig_mode = reg & APIC_LVT_LEVEL_TRIGGER;
+                return __apic_accept_irq(apic, mode, vector, 1, trig_mode);
+        }
+        return 0;
+}

-        vector = apic_lvt_vector(apic, APIC_LVTT);
-        return __apic_accept_irq(apic, APIC_DM_FIXED, vector, 1, 0);
+static int __inject_apic_timer_irq(struct kvm_lapic *apic)
+{
+        return kvm_apic_local_deliver(apic->vcpu, APIC_LVTT);
 }

 static enum hrtimer_restart apic_timer_fn(struct hrtimer *data)
--
1.5.5
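
The field extraction done by `kvm_apic_local_deliver()` in the patch above can be tried out in isolation. The following is a stand-alone sketch, not the kernel's code: the `LVT_*` macros and `lvt_decode()` are hypothetical names, with values chosen to follow the local APIC LVT layout (vector in bits 0-7, delivery mode in bits 8-10, trigger mode in bit 15).

```c
#include <assert.h>
#include <stdint.h>

/* LVT register fields, following the local APIC LVT layout. */
#define LVT_VECTOR_MASK   0x000000FFu /* bits 0-7: vector */
#define LVT_MODE_MASK     0x00000700u /* bits 8-10: delivery mode */
#define LVT_MODE_FIXED    0x00000000u /* delivery mode 000b */
#define LVT_MODE_NMI      0x00000400u /* delivery mode 100b */
#define LVT_MODE_EXTINT   0x00000700u /* delivery mode 111b */
#define LVT_LEVEL_TRIGGER 0x00008000u /* bit 15: level-triggered */

struct lvt_fields {
    uint32_t vector;
    uint32_t mode;
    uint32_t trig_mode;
};

/* Split a raw LVT register value the same way the patch does. */
static struct lvt_fields lvt_decode(uint32_t reg)
{
    struct lvt_fields f;
    f.vector = reg & LVT_VECTOR_MASK;
    f.mode = reg & LVT_MODE_MASK;
    f.trig_mode = reg & LVT_LEVEL_TRIGGER;
    return f;
}

/* True when the LVT entry is programmed for NMI delivery. */
static int lvt_is_nmi(uint32_t reg)
{
    return lvt_decode(reg).mode == LVT_MODE_NMI;
}
```

An LVT0 value of `0x400` decodes to delivery mode NMI with vector 0, which is the state the NMI watchdog sets up; `0x700` decodes to ExtINT, the case the new `APIC_DM_EXTINT` branch deliberately ignores.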


[kvm-devel] [PATCH 2/2] KVM: VMX: Enable NMI with in-kernel irqchip

2008-05-15 Thread Yang, Sheng
From 069c50dca077796101af3eb5890e3fd31a72743f Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Thu, 15 May 2008 18:23:25 +0800
Subject: [PATCH] KVM: VMX: Enable NMI with in-kernel irqchip


Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/vmx.c |  125 +--
 arch/x86/kvm/vmx.h |   12 -
 arch/x86/kvm/x86.c |1 +
 include/asm-x86/kvm_host.h |1 +
 4 files changed, 120 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e94a8c3..134c75e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -264,6 +264,12 @@ static inline int cpu_has_vmx_vpid(void)
                 SECONDARY_EXEC_ENABLE_VPID);
 }

+static inline int cpu_has_virtual_nmis(void)
+{
+        return (vmcs_config.pin_based_exec_ctrl &
+                PIN_BASED_VIRTUAL_NMIS);
+}
+
 static int __find_msr_index(struct vcpu_vmx *vmx, u32 msr)
 {
         int i;
@@ -1083,7 +1089,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
         u32 _vmentry_control = 0;

         min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
-        opt = 0;
+        opt = PIN_BASED_VIRTUAL_NMIS;
         if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
                                 &_pin_based_exec_control) < 0)
                 return -EIO;
@@ -2125,6 +2131,13 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu, int irq)
                         irq | INTR_TYPE_EXT_INTR | INTR_INFO_VALID_MASK);
 }

+static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
+{
+        vmcs_write32(VM_ENTRY_INTR_INFO_FIELD,
+                        INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK | NMI_VECTOR);
+        vcpu->arch.nmi_pending = 0;
+}
+
 static void kvm_do_inject_irq(struct kvm_vcpu *vcpu)
 {
         int word_index = __ffs(vcpu->arch.irq_summary);
@@ -2648,6 +2661,19 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
         return 1;
 }

+static int handle_nmi_window(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
+{
+        u32 cpu_based_vm_exec_control;
+
+        /* clear pending NMI */
+        cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
+        cpu_based_vm_exec_control &= ~CPU_BASED_VIRTUAL_NMI_PENDING;
+        vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control);
+        ++vcpu->stat.nmi_window_exits;
+
+        return 1;
+}
+
 /*
  * The exit handlers return 1 if the exit was handled fully and guest execution
  * may resume.  Otherwise they set the kvm_run parameter to indicate what needs
@@ -2658,6 +2684,7 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu,
         [EXIT_REASON_EXCEPTION_NMI]           = handle_exception,
         [EXIT_REASON_EXTERNAL_INTERRUPT]      = handle_external_interrupt,
         [EXIT_REASON_TRIPLE_FAULT]            = handle_triple_fault,
+        [EXIT_REASON_NMI_WINDOW]              = handle_nmi_window,
         [EXIT_REASON_IO_INSTRUCTION]          = handle_io,
         [EXIT_REASON_CR_ACCESS]               = handle_cr,
         [EXIT_REASON_DR_ACCESS]               = handle_dr,
@@ -2745,17 +2772,52 @@ static void enable_irq_window(struct kvm_vcpu *vcpu)
         vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control);
 }

+static void enable_nmi_window(struct kvm_vcpu *vcpu)
+{
+        u32 cpu_based_vm_exec_control;
+
+        if (!cpu_has_virtual_nmis())
+                return;
+
+        cpu_based_vm_exec_control = vmcs_read32(CPU_BASED_VM_EXEC_CONTROL);
+        cpu_based_vm_exec_control |= CPU_BASED_VIRTUAL_NMI_PENDING;
+        vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, cpu_based_vm_exec_control);
+}
+
+static int vmx_nmi_enabled(struct kvm_vcpu *vcpu)
+{
+        u32 guest_intr = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
+        return !(guest_intr & (GUEST_INTR_STATE_NMI |
+                               GUEST_INTR_STATE_MOV_SS |
+                               GUEST_INTR_STATE_STI));
+}
+
+static int vmx_irq_enabled(struct kvm_vcpu *vcpu)
+{
+        u32 guest_intr = vmcs_read32(GUEST_INTERRUPTIBILITY_INFO);
+        return (!(guest_intr & (GUEST_INTR_STATE_MOV_SS |
+                                GUEST_INTR_STATE_STI)) &&
+                (vmcs_readl(GUEST_RFLAGS) & X86_EFLAGS_IF));
+}
+
+static void enable_intr_window(struct kvm_vcpu *vcpu)
+{
+        if (vcpu->arch.nmi_pending)
+                enable_nmi_window(vcpu);
+        else if (kvm_cpu_has_interrupt(vcpu))
+                enable_irq_window(vcpu);
+}
+
 static void vmx_intr_assist(struct kvm_vcpu *vcpu)
 {
         struct vcpu_vmx *vmx = to_vmx(vcpu);
-        u32 idtv_info_field, intr_info_field;
-        int has_ext_irq, interrupt_window_open;
+        u32 idtv_info_field, intr_info_field, exit_intr_info_field;
         int vector;

         update_tpr_threshold(vcpu);

-        has_ext_irq = kvm_cpu_has_interrupt(vcpu);
         intr_info_field = vmcs_read32(VM_ENTRY_INTR_INFO_FIELD);
+        exit_intr_info_field = vmcs_read32(VM_EXIT_INTR_INFO);
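
(The archived copy of the patch breaks off here.) The two window predicates added above, `vmx_nmi_enabled()` and `vmx_irq_enabled()`, can be exercised as pure functions. The sketch below is not the patch's code: the macro and function names are hypothetical, and the bit values follow the interruptibility-state layout as I understand it from the Intel SDM (bit 0 blocking by STI, bit 1 blocking by MOV SS, bit 3 blocking by NMI) plus RFLAGS.IF in bit 9.

```c
#include <assert.h>
#include <stdint.h>

/* Interruptibility-state bits of the guest non-register state. */
#define INTR_STATE_STI    0x1u   /* blocking by STI */
#define INTR_STATE_MOV_SS 0x2u   /* blocking by MOV SS / POP SS */
#define INTR_STATE_NMI    0x8u   /* blocking by NMI */
#define EFLAGS_IF         0x200u /* RFLAGS.IF, bit 9 */

/* An NMI can be injected only if no blocking condition is active. */
static int nmi_window_open(uint32_t intr_state)
{
    return !(intr_state & (INTR_STATE_NMI | INTR_STATE_MOV_SS | INTR_STATE_STI));
}

/* An external interrupt additionally needs RFLAGS.IF set; NMI blocking is ignored. */
static int irq_window_open(uint32_t intr_state, uint64_t rflags)
{
    return !(intr_state & (INTR_STATE_MOV_SS | INTR_STATE_STI)) &&
           (rflags & EFLAGS_IF) != 0;
}
```

Note the asymmetry: while the guest is inside an NMI handler (NMI blocking set), `nmi_window_open()` is false but `irq_window_open()` can still be true if IF is set.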
  

Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-15 Thread Robin Holt
We are pursuing Linus' suggestion currently.  This discussion is
completely unrelated to that work.

On Thu, May 15, 2008 at 09:57:47AM +0200, Nick Piggin wrote:
 I'm not sure if you're thinking about what I'm thinking of. With the
 scheme I'm imagining, all you will need is some way to raise an IPI-like
 interrupt on the target domain. The IPI target will have a driver to
 handle the interrupt, which will determine the mm and virtual addresses
 which are to be invalidated, and will then tear down those page tables
 and issue hardware TLB flushes within its domain. On the Linux side,
 I don't see why this can't be done.

We would need to deposit the payload into a central location to do the
invalidate, correct?  That central location would either need to be
indexed by physical cpuid (65536 possible currently, UV will push that
up much higher) or some sort of global id which is difficult because
remote partitions can reboot giving you a different view of the machine
and running partitions would need to be updated.  Alternatively, that
central location would need to be protected by a global lock or atomic
type operation, but a majority of the machine does not have coherent
access to other partitions so they would need to use uncached operations.
Essentially, take away from this paragraph that it is going to be really
slow or really large.

Then we need to deposit the information needed to do the invalidate.

Lastly, we would need to interrupt.  Unfortunately, here we have a
thundering herd.  There could be up to 16256 processors interrupting the
same processor.  That will be a lot of work.  It will need to look up the
mm (without grabbing any sleeping locks in either xpmem or the kernel)
and do the tlb invalidates.

Unfortunately, the sending side is not free to continue (in most cases)
until it knows that the invalidate is completed.  So it will need to spin
waiting for a completion signal, which could be as simple as an uncached
word.  But how will it handle the possible failure of the other partition?
How will it detect that failure and recover?  A timeout value could be
difficult to gauge because the other side may be off doing a considerable
amount of work and may just be backed up.

 Sure, you obviously would need to rework your code because it's been
 written with the assumption that it can sleep.

It is an assumption based upon some of the kernel functions we call
doing things like grabbing mutexes or rw_sems.  That pushes back to us.
I think the kernel's locking is perfectly reasonable.  The problem we run
into is we are trying to get from one context in one kernel to a different
context in another and the in-between piece needs to be sleepable.

 What is XPMEM exactly anyway? I'd assumed it is a Linux driver.

XPMEM allows one process to make a portion of its virtual address range
directly addressable by another process with the appropriate access.
The other process can be on other partitions.  As long as Numa-link
allows access to the memory, we can make it available.  Userland has an
advantage in that the kernel entry/exit code contains memory errors,
so we can confine hardware failures (in most cases) to terminating a
user program rather than losing the partition.  The kernel enjoys no
such fault containment, so it cannot safely reference that memory directly.


Thanks,
Robin

-
This SF.net email is sponsored by: Microsoft 
Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


[kvm-devel] [PATCH 0/2] NMI supporting for KVM and VMX v2

2008-05-15 Thread Yang, Sheng
Hi

Sorry for the update delays, got a cold recently...

No big modifications. I dropped the original first patch following Avi's 
comment, and fixed a bug in handling host NMIs in vmx_vcpu_run in the second 
patch.

The third patch, which enables the NMI watchdog, is *not* meant to be merged. It was 
cooked up for testing, and it would add some overhead to interrupt handling, 
as well as cause a regression on some versions of Windows for now 
(IRQL_NOT_LESS_OR_EQUAL BSOD). But is it necessary to have NMI watchdog support 
in KVM? If so, maybe we need an option (or kernel module parameter) to enable it.

--
Thanks
Yang, Sheng



Re: [kvm-devel] can't boot 2.6.26-rcX

2008-05-15 Thread Bernd Schubert
On Thursday 15 May 2008 09:36:41 Avi Kivity wrote:
 Bernd Schubert wrote:
  Hello,
 
  there is a problem booting 2.6.26-rcX (X=1,2). It stops booting at
 
  Calibrating delay using timer specific routine.. 4016.92 BogoMIPS
  (lpj=8033846)
 
  The kvm process then takes 100% of my host CPU.
 
  This is with kvm-67 on an AM64-X2-
 
  I'm not yet familiar with kvm and debugging. Will a sysrq+t trace of the
  host show something useful? Or will only full git-bisect help?

 Do you have CONFIG_KVM_GUEST or CONFIG_KVM_CLOCK in your config?  If so,
 this may be a paravirt problem.  Try turning them off and let us know.

Thanks, I had both options enabled. Disabling them makes the 
kernel boot again. But I just ran into another bug:

[17180751.908653] BUG: unable to handle kernel NULL pointer dereference at 
0008
[17180751.911775] IP: [803dff37] start_xmit+0x41/0x11a


Going to rebuild with debugging support now to get a better backtrace.


Thanks,
Bernd

-- 
Bernd Schubert
Q-Leap Networks GmbH



[kvm-devel] [PATCH 1/2] KVM: IOAPIC/LAPIC: Enable NMI support

2008-05-15 Thread Yang, Sheng
From 16680d2556ad065b128412b0f5d81f04de25b3f8 Mon Sep 17 00:00:00 2001
From: Sheng Yang [EMAIL PROTECTED]
Date: Thu, 15 May 2008 09:52:48 +0800
Subject: [PATCH] KVM: IOAPIC/LAPIC: Enable NMI support


Signed-off-by: Sheng Yang [EMAIL PROTECTED]
---
 arch/x86/kvm/lapic.c   |3 ++-
 arch/x86/kvm/x86.c |6 ++
 include/asm-x86/kvm_host.h |4 
 virt/kvm/ioapic.c  |   20 ++--
 4 files changed, 30 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 7652f88..2273836 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -356,8 +356,9 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int 
delivery_mode,
case APIC_DM_SMI:
printk(KERN_DEBUG "Ignoring guest SMI\n");
break;
+
case APIC_DM_NMI:
-   printk(KERN_DEBUG "Ignoring guest NMI\n");
+   kvm_inject_nmi(vcpu);
break;

case APIC_DM_INIT:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dab3d4f..8461da4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -173,6 +173,12 @@ void kvm_inject_page_fault(struct kvm_vcpu *vcpu, 
unsigned long addr,
kvm_queue_exception_e(vcpu, PF_VECTOR, error_code);
 }

+void kvm_inject_nmi(struct kvm_vcpu *vcpu)
+{
+   vcpu->arch.nmi_pending = 1;
+}
+EXPORT_SYMBOL_GPL(kvm_inject_nmi);
+
 void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 
error_code)
 {
WARN_ON(vcpu->arch.exception.pending);
diff --git a/include/asm-x86/kvm_host.h b/include/asm-x86/kvm_host.h
index 1466c3f..567d739 100644
--- a/include/asm-x86/kvm_host.h
+++ b/include/asm-x86/kvm_host.h
@@ -285,6 +285,8 @@ struct kvm_vcpu_arch {
struct kvm_vcpu_time_info hv_clock;
unsigned int time_offset;
struct page *time_page;
+
+   bool nmi_pending;
 };

 struct kvm_mem_alias {
@@ -512,6 +514,8 @@ void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned 
nr, u32 error_code);
 void kvm_inject_page_fault(struct kvm_vcpu *vcpu, unsigned long cr2,
   u32 error_code);

+void kvm_inject_nmi(struct kvm_vcpu *vcpu);
+
 void fx_init(struct kvm_vcpu *vcpu);

 int emulator_read_std(unsigned long addr,
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 4232fd7..99a1736 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -146,6 +146,11 @@ static void ioapic_inj_irq(struct kvm_ioapic *ioapic,
kvm_apic_set_irq(vcpu, vector, trig_mode);
 }

+static void ioapic_inj_nmi(struct kvm_vcpu *vcpu)
+{
+   kvm_inject_nmi(vcpu);
+}
+
 static u32 ioapic_get_delivery_bitmask(struct kvm_ioapic *ioapic, u8 dest,
   u8 dest_mode)
 {
@@ -239,8 +244,19 @@ static void ioapic_deliver(struct kvm_ioapic *ioapic, int 
irq)
}
}
break;
-
-   /* TODO: NMI */
+   case IOAPIC_NMI:
+   for (vcpu_id = 0; deliver_bitmask != 0; vcpu_id++) {
+   if (!(deliver_bitmask & (1 << vcpu_id)))
+   continue;
+   deliver_bitmask &= ~(1 << vcpu_id);
+   vcpu = ioapic->kvm->vcpus[vcpu_id];
+   if (vcpu)
+   ioapic_inj_nmi(vcpu);
+   else
+   ioapic_debug("NMI to vcpu %d failed\n",
+   vcpu_id);
+   }
+   break;
default:
printk(KERN_WARNING "Unsupported delivery mode %d\n",
   delivery_mode);
--
1.5.5


Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-15 Thread Avi Kivity
Robin Holt wrote:
 Then we need to deposit the information needed to do the invalidate.

 Lastly, we would need to interrupt.  Unfortunately, here we have a
 thundering herd.  There could be up to 16256 processors interrupting the
 same processor.  That will be a lot of work.  It will need to look up the
 mm (without grabbing any sleeping locks in either xpmem or the kernel)
 and do the tlb invalidates.

   

You don't need to interrupt every time.  Place your data in a queue (you 
do support rmw operations, right?) and interrupt.  Invalidates from 
other processors will see that the queue hasn't been processed yet and 
skip the interrupt.
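
A minimal sketch of the queue idea above, written as portable C11 rather than kernel code. The structure and function names (inval_req, inval_enqueue, inval_drain) are hypothetical, and the compare-and-swap stands in for whatever rmw operation the hardware offers across partitions. The sender interrupts only when its push found the queue empty; later senders see a non-empty queue and skip the interrupt.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical invalidate request; field names are illustrative only. */
struct inval_req {
    struct inval_req *next;
    unsigned long start, len;
};

/* Head of a lock-free LIFO list shared by all senders and the target. */
static _Atomic(struct inval_req *) pending_head = NULL;

/*
 * Push a request with a compare-and-swap loop (the "rmw operation").
 * Returns true only if the queue was empty before the push, i.e. only
 * the first sender has to raise the interrupt.
 */
static bool inval_enqueue(struct inval_req *req)
{
    struct inval_req *old = atomic_load(&pending_head);

    do {
        req->next = old;
    } while (!atomic_compare_exchange_weak(&pending_head, &old, req));

    return old == NULL;
}

/* Target side: detach the whole pending list in one atomic exchange. */
static struct inval_req *inval_drain(void)
{
    return atomic_exchange(&pending_head, NULL);
}
```

The interrupt handler would call inval_drain() once and walk the returned list. Note that this list hands requests back newest first, and that the sender still has to wait for completion, which this sketch does not address.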

-- 
error compiling committee.c: too many arguments to function




[kvm-devel] [PATCH] [ACPI] Enable _SUN in Slot devices v2

2008-05-15 Thread Alexander Graf

Hi,

a patch recently introduced PCI device hotplugging. This added
pseudo-buses for every PCI slot, so that each device can be easily
ejected any time. The ACPI specification recommends the inclusion of a
_SUN entry in these though, to enable proper indexation of the slots.
Afaict Linux does not need this, but Darwin does.

This patch adds the corresponding _SUN entries to the PCI slot definitions.

Regards,

Alex

Signed-off-by: Alexander Graf [EMAIL PROTECTED]


diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl
index d2e33f4..f718b2e 100755
--- a/bios/acpi-dsdt.dsl
+++ b/bios/acpi-dsdt.dsl
@@ -426,6 +712,7 @@ DefinitionBlock (
 Store(0x2, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 1)
 }
 
 Device (S2) {  // Slot 2
@@ -434,6 +721,7 @@ DefinitionBlock (
 Store(0x4, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 2)
 }
 
 Device (S3) {  // Slot 3
@@ -442,6 +730,7 @@ DefinitionBlock (
 Store (0x8, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 3)
 }
 
 Device (S4) {  // Slot 4
@@ -450,6 +739,7 @@ DefinitionBlock (
 Store(0x10, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 4)
 }
 
 Device (S5) {  // Slot 5
@@ -458,6 +748,7 @@ DefinitionBlock (
 Store(0x20, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 5)
 }
 
 Device (S6) {  // Slot 6
@@ -466,6 +757,7 @@ DefinitionBlock (
 Store(0x40, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 6)
 }
 
 Device (S7) {  // Slot 7
@@ -474,6 +766,7 @@ DefinitionBlock (
 Store(0x80, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 7)
 }
 
 Device (S8) {  // Slot 8
@@ -482,6 +775,7 @@ DefinitionBlock (
 Store(0x100, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 8)
 }
 
 Device (S9) {  // Slot 9
@@ -490,6 +784,7 @@ DefinitionBlock (
 Store(0x200, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 9)
 }
 
 Device (S10) {  // Slot 10
@@ -498,6 +793,7 @@ DefinitionBlock (
 Store(0x400, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 10)
 }
 
 Device (S11) {  // Slot 11
@@ -506,6 +802,7 @@ DefinitionBlock (
 Store(0x800, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 11)
 }
 
 Device (S12) {  // Slot 12
@@ -514,6 +811,7 @@ DefinitionBlock (
 Store(0x1000, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 12)
 }
 
 Device (S13) {  // Slot 13
@@ -522,6 +820,7 @@ DefinitionBlock (
 Store(0x2000, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 13)
 }
 
 Device (S14) {  // Slot 14
@@ -530,6 +829,7 @@ DefinitionBlock (
 Store(0x4000, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 14)
 }
 
 Device (S15) {  // Slot 15
@@ -538,6 +838,7 @@ DefinitionBlock (
 Store(0x8000, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 15)
 }
 
 Device (S16) {  // Slot 16
@@ -546,6 +847,7 @@ DefinitionBlock (
 Store(0x1, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 16)
 }
 
 Device (S17) {  // Slot 17
@@ -554,6 +856,7 @@ DefinitionBlock (
 Store(0x2, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 17)
 }
 
 Device (S18) {  // Slot 18
@@ -562,6 +865,7 @@ DefinitionBlock (
 Store(0x4, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 18)
 }
 
 Device (S19) {  // Slot 19
@@ -570,6 +874,7 @@ DefinitionBlock (
 Store(0x8, B0EJ)
 Return (0x0)
}
+   Name (_SUN, 19)
 }
 
 Device (S20) {  // Slot 20
@@ -578,6 +883,7 @@ 

[kvm-devel] [PATCH] [ACPI] Enable direct GSI mapping for APIC v2

2008-05-15 Thread Alexander Graf

Hi,

in the DSDT there are two different ways of defining how an interrupt
is supposed to be routed. Currently we are using the LNKA - LNKD method,
which afaict is for legacy support.
The other method is to tell the operating system directly which APIC
pin the device is attached to. We can get that information from the very
same entry from which the LNKA to LNKD pseudo-devices receive it.

For now this does not give any obvious improvement. It does leave room
for more advanced mappings, with several IOAPICs that can handle more
devices separately. This might help when we have a lot of devices, as
currently all devices sit on two interrupt lines.

More importantly (for me) though, is that Darwin enables the APIC mode
unconditionally, so it won't easily run in legacy mode.

Regards,

Alex

Signed-off-by: Alexander Graf [EMAIL PROTECTED]




diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl
index d2e33f4..f718b2e 100755
--- a/bios/acpi-dsdt.dsl
+++ b/bios/acpi-dsdt.dsl
@@ -199,4 +199,10 @@ DefinitionBlock (
 {
 DBGL,   32,
 }
+/* PIC mode setting */
+Name (PICF, 0x00)
+Method (_PIC, 1, NotSerialized)
+{
+Store(Arg0, PICF)
+}
 }
@@ -199,10 +199,204 @@ DefinitionBlock (
 Device(PCI0) {
 Name (_HID, EisaId ("PNP0A03"))
 Name (_ADR, 0x00)
 Name (_UID, 1)
-Name(_PRT, Package() {
+Name(APRT, Package() {
+// PCI Slot 0
+Package() {0x, 0, 0, ARQ3},
+Package() {0x, 1, 0, ARQ0},
+Package() {0x, 2, 0, ARQ1},
+Package() {0x, 3, 0, ARQ2},
+
+// PCI Slot 1
+Package() {0x0001, 0, 0, ARQ0},
+Package() {0x0001, 1, 0, ARQ1},
+Package() {0x0001, 2, 0, ARQ2},
+Package() {0x0001, 3, 0, ARQ3},
+
+// PCI Slot 2
+Package() {0x0002, 0, 0, ARQ1},
+Package() {0x0002, 1, 0, ARQ2},
+Package() {0x0002, 2, 0, ARQ3},
+Package() {0x0002, 3, 0, ARQ0},
+
+// PCI Slot 3
+Package() {0x0003, 0, 0, ARQ2},
+Package() {0x0003, 1, 0, ARQ3},
+Package() {0x0003, 2, 0, ARQ0},
+Package() {0x0003, 3, 0, ARQ1},
+
+// PCI Slot 4
+Package() {0x0004, 0, 0, ARQ3},
+Package() {0x0004, 1, 0, ARQ0},
+Package() {0x0004, 2, 0, ARQ1},
+Package() {0x0004, 3, 0, ARQ2},
+
+// PCI Slot 5
+Package() {0x0005, 0, 0, ARQ0},
+Package() {0x0005, 1, 0, ARQ1},
+Package() {0x0005, 2, 0, ARQ2},
+Package() {0x0005, 3, 0, ARQ3},
+
+// PCI Slot 6
+Package() {0x0006, 0, 0, ARQ1},
+Package() {0x0006, 1, 0, ARQ2},
+Package() {0x0006, 2, 0, ARQ3},
+Package() {0x0006, 3, 0, ARQ0},
+
+// PCI Slot 7
+Package() {0x0007, 0, 0, ARQ2},
+Package() {0x0007, 1, 0, ARQ3},
+Package() {0x0007, 2, 0, ARQ0},
+Package() {0x0007, 3, 0, ARQ1},
+
+// PCI Slot 8
+Package() {0x0008, 0, 0, ARQ3},
+Package() {0x0008, 1, 0, ARQ0},
+Package() {0x0008, 2, 0, ARQ1},
+Package() {0x0008, 3, 0, ARQ2},
+
+// PCI Slot 9
+Package() {0x0009, 0, 0, ARQ0},
+Package() {0x0009, 1, 0, ARQ1},
+Package() {0x0009, 2, 0, ARQ2},
+Package() {0x0009, 3, 0, ARQ3},
+
+// PCI Slot 10
+Package() {0x000a, 0, 0, ARQ1},
+Package() {0x000a, 1, 0, ARQ2},
+Package() {0x000a, 2, 0, ARQ3},
+Package() {0x000a, 3, 0, ARQ0},
+
+// PCI Slot 11
+Package() {0x000b, 0, 0, ARQ2},
+Package() {0x000b, 1, 0, ARQ3},
+Package() {0x000b, 2, 0, ARQ0},
+Package() {0x000b, 3, 0, ARQ1},
+
+// PCI Slot 12
+Package() {0x000c, 0, 0, ARQ3},
+Package() {0x000c, 1, 0, ARQ0},
+Package() {0x000c, 2, 0, ARQ1},
+Package() {0x000c, 3, 0, ARQ2},
+
+// PCI Slot 13
+Package() {0x000d, 0, 0, ARQ0},
+Package() {0x000d, 1, 0, ARQ1},
+Package() {0x000d, 2, 0, ARQ2},
+Package() {0x000d, 3, 0, ARQ3},
+
+// PCI Slot 14
+Package() {0x000e, 0, 0, ARQ1},
+

Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Add support for a configuration file

2008-05-15 Thread Daniel P. Berrange
On Thu, May 15, 2008 at 11:04:47AM +0300, Avi Kivity wrote:
 Daniel P. Berrange wrote:
 With this kind of syntax, now tools generating config files need to make
 up unique names for each drive. So you'll probably end up with them just
 naming things based on the class name + a number appended.
   
 
 I would hope that tools don't have to resort to reading and writing 
 these config files.  Usually a management system would prefer storing 
 parameters in its own database, and writing a temporary config file just 
 to pass the data seems awkward.  I would much prefer to see the command 
 line and monitor retain full control over every configurable parameter.

I expect that libvirt will create config files - it is only a matter of
time before we hit the command line ARGV length limits - particularly
with the -net and -drive syntax. People are already requesting that we support
guests with > 16 disks and > 8 network cards, so command lines get very
long. 

I wouldn't write out the config file to disk though - I'd just send it
on the fly on stdin, e.g. 'qemu -config -' to tell it to read the config
from its stdin.
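
A minimal sketch of the "-" convention described above, assuming a hypothetical helper open_config() that receives the argument of the proposed -config option. The line counter only stands in for real parsing, since the config file syntax is still under discussion in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: "-" selects stdin, anything else is a file path. */
static FILE *open_config(const char *arg)
{
    if (strcmp(arg, "-") == 0)
        return stdin;
    return fopen(arg, "r");
}

/*
 * Count non-empty, non-comment lines; a placeholder for real parsing.
 * Lines longer than the buffer are read in several pieces.
 */
static int count_config_lines(FILE *f)
{
    char line[1024];
    int n = 0;

    while (fgets(line, sizeof line, f) != NULL) {
        if (line[0] != '\n' && line[0] != '#')
            n++;
    }
    return n;
}
```

With this, a management tool can pipe the configuration straight into the emulator process and nothing has to be written to disk.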

Regards,
Daniel.
-- 
|: Red Hat, Engineering, Boston   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Add support for a configuration file

2008-05-15 Thread Avi Kivity
Daniel P. Berrange wrote:
 On Thu, May 15, 2008 at 11:04:47AM +0300, Avi Kivity wrote:
   
 Daniel P. Berrange wrote:
 
 With this kind of syntax, now tools generating config files need to make
 up unique names for each drive. So you'll probably end up with them just
 naming things based on the class name + a number appended.
  
   
 I would hope that tools don't have to resort to reading and writing 
 these config files.  Usually a management system would prefer storing 
 parameters in its own database, and writing a temporary config file just 
 to pass the data seems awkward.  I would much prefer to see the command 
 line and monitor retain full control over every configurable parameter.
 

 I expect that libvirt will create config files - it is only a matter of
 time before we hit the command line ARGV length limits - particularly
 with the -net and -drive syntax. People already requesting that we support
 guests with > 16 disks, and > 8 network cards so command lines get very
 long. 
   

What are those limits, btw? ISTR 10240 words, but how many chars?

 I wouldn't write out the config file to disk though - I'd just send it
 on the fly on stdin -, eg   'qemu -config -'  to tell it to read the config
 on its stdin.
   

That's fine from my point of view.

-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Add support for a configuration file

2008-05-15 Thread Laurent Vivier
Le jeudi 15 mai 2008 à 15:04 +0300, Avi Kivity a écrit :
 Daniel P. Berrange wrote:
  On Thu, May 15, 2008 at 11:04:47AM +0300, Avi Kivity wrote:

  Daniel P. Berrange wrote:
  
  With this kind of syntax, now tools generating config files need to make
  up unique names for each drive. So you'll probably end up with them just
  naming things based on the class name + a number appended.
   

  I would hope that tools don't have to resort to reading and writing 
  these config files.  Usually a management system would prefer storing 
  parameters in its own database, and writing a temporary config file just 
  to pass the data seems awkward.  I would much prefer to see the command 
  line and monitor retain full control over every configurable parameter.
  
 
  I expect that libvirt will create config files - it is only a matter of
  time before we hit the command line ARGV length limits - particularly
  with the -net and -drive syntax. People already requesting that we support
  guests with > 16 disks, and > 8 network cards so command lines get very
  long. 

 
 What are those limits, btw? ISTR 10240 words, but how many chars?

ARG_MAX - _SC_ARG_MAX
The  maximum  length  of  the arguments to the exec(3) family of
functions.  Must not be less than _POSIX_ARG_MAX (4096).

getconf ARG_MAX
131072

And from a configure log I have:

checking the maximum length of command line arguments: 98304
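
The same limit can be queried at run time with sysconf(). The sketch below also sums the bytes an argument vector would take. The helper names are made up for illustration, and the exact accounting (environment, pointer arrays) differs between systems, so treat the sum as a lower bound rather than the kernel's own calculation.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Limit for exec() arguments plus environment, in bytes; -1 if indeterminate. */
static long arg_max_bytes(void)
{
    return sysconf(_SC_ARG_MAX);
}

/* Bytes the strings of a NULL-terminated argv occupy, including each NUL. */
static size_t argv_bytes(char *const argv[])
{
    size_t total = 0;

    for (size_t i = 0; argv[i] != NULL; i++)
        total += strlen(argv[i]) + 1;
    return total;
}
```

So the limit is counted in bytes, not words: a tool can compare argv_bytes() against arg_max_bytes() before deciding whether a config file is needed.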

Regards,
Laurent
-- 
- [EMAIL PROTECTED] ---
The best way to predict the future is to invent it.
- Alan Kay




[kvm-devel] [PATCH] Remove unused get_bios_map

2008-05-15 Thread Jan Kiszka
Dead code since it was introduced. Is it planned to be used in the near
future? If so, I would suggest putting it under #if 0 for now. Otherwise,
please pick up this patch.

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---
 qemu/kvm-tpr-opt.c |   13 -
 1 file changed, 13 deletions(-)

Index: b/qemu/kvm-tpr-opt.c
===
--- a/qemu/kvm-tpr-opt.c
+++ b/qemu/kvm-tpr-opt.c
@@ -83,19 +83,6 @@ static void write_byte_virt(CPUState *en
 stb_phys(map_addr(sregs, virt, NULL), b);
 }
 
-static uint32_t get_bios_map(CPUState *env, unsigned *perms)
-{
-uint32_t v;
-struct kvm_sregs sregs;
-
-kvm_get_sregs(kvm_context, env->cpu_index, &sregs);
-
-for (v = -4096u; v != 0; v -= 4096)
-   if (map_addr(&sregs, v, perms) == 0xe)
-   return v;
-return -1u;
-}
-
 struct vapic_bios {
 char signature[8];
 uint32_t virt_base;



[kvm-devel] [PATCH] Silence warnings in migration.c

2008-05-15 Thread Jan Kiszka
These warnings continued to bug me (while scanning for my own mess).
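
For context, a sketch of the pattern the patch below applies, assuming the memory size is held in a 64-bit type on at least some builds (the declaration of phys_ram_size is not shown here). Casting to unsigned long long and printing with %llu keeps the format string correct whatever the underlying width. The function name is made up for illustration.

```c
#include <stdio.h>
#include <stdint.h>

/*
 * Format a 32-bit received size and a size of build-dependent width.
 * Both values are cast to a type that matches the conversion specifier.
 */
static int format_mismatch(char *buf, size_t len, uint32_t recv, uint64_t mine)
{
    return snprintf(buf, len, "recv %lu mine %llu",
                    (unsigned long)recv, (unsigned long long)mine);
}
```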

Signed-off-by: Jan Kiszka [EMAIL PROTECTED]
---
 qemu/migration.c |7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

Index: b/qemu/migration.c
===
--- a/qemu/migration.c
+++ b/qemu/migration.c
@@ -834,8 +834,8 @@ static int migrate_incoming_fd(int fd)
 
 size = qemu_get_be32(f);
 if (size != phys_ram_size) {
-fprintf(stderr, "migration: memory size mismatch: recv %u mine %u\n",
-size, phys_ram_size);
+fprintf(stderr, "migration: memory size mismatch: recv %u mine %llu\n",
+size, (unsigned long long)phys_ram_size);
return MIG_STAT_DST_MEM_SIZE_MISMATCH;
 }
 
@@ -1090,7 +1090,8 @@ void do_info_migration(void)
term_printf("Transfer rate %3.1f mb/s\n",
(double)s->bps / (1024 * 1024));
term_printf("Iteration %d\n", s->iteration);
-   term_printf("Transferred %d/%d pages\n", s->updated_pages, phys_ram_size >> TARGET_PAGE_BITS);
+   term_printf("Transferred %d/%llu pages\n", s->updated_pages,
+   (unsigned long long)phys_ram_size >> TARGET_PAGE_BITS);
if (s->iteration)
term_printf("Last iteration found %d dirty pages\n", s->last_updated_pages);
 } else {



[kvm-devel] [PATCH 1/2][RFC][v2] kvm: Batch writes to MMIO

2008-05-15 Thread Laurent Vivier
This patch is the kernel part of the batch writes to MMIO patch.

It introduces the ioctl interface to define the MMIO zones whose writes may be delayed.
Inside a zone, we can define sub-parts that must not be delayed.

If an MMIO write can be delayed, it is stored in a ring buffer which is common to all 
VCPUs.
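
As a rough user-space illustration of the zone rule described above (not the kernel code from this patch): a write may be delayed when it lies entirely inside a delayed zone and touches no excluded sub-part. The struct and function names are hypothetical, excluded ranges are given as absolute addresses rather than zone offsets, and the sketch uses a general interval-overlap test instead of the patch's endpoint checks. Address overflow is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open address range [start, start + size). */
struct range {
    uint64_t start;
    uint64_t size;
};

/* True if [addr, addr + size) lies entirely inside the zone. */
static bool range_contains(struct range zone, uint64_t addr, uint32_t size)
{
    return zone.start <= addr && addr + size <= zone.start + zone.size;
}

/* True if [addr, addr + size) shares at least one byte with r. */
static bool range_overlaps(struct range r, uint64_t addr, uint32_t size)
{
    return addr < r.start + r.size && r.start < addr + size;
}

/* Inside the zone and clear of every excluded sub-range. */
static bool may_delay(struct range zone, const struct range *excl, int n,
                      uint64_t addr, uint32_t size)
{
    if (!range_contains(zone, addr, size))
        return false;
    for (int i = 0; i < n; i++) {
        if (range_overlaps(excl[i], addr, size))
            return false;
    }
    return true;
}
```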

Signed-off-by: Laurent Vivier [EMAIL PROTECTED]
---
 arch/x86/kvm/x86.c |  172 
 include/asm-x86/kvm.h  |7 ++
 include/asm-x86/kvm_host.h |   23 ++
 include/linux/kvm.h|   16 
 virt/kvm/kvm_main.c|3 +
 5 files changed, 221 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index dab3d4f..930986b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1518,6 +1518,103 @@ out:
return r;
 }
 
+static struct kvm_delayed_mmio_zone *kvm_mmio_find_zone(struct kvm *kvm,
+   u64 addr, u32 size)
+{
+   int i;
+   struct kvm_delayed_mmio_zone *zone;
+
+   for (i = 0; i < kvm->arch.nb_mmio_zones; i++) {
+   zone = &kvm->arch.mmio_zone[i];
+
+   /* (addr,size) is fully included in
+* (zone->addr, zone->size)
+*/
+
+   if (zone->addr <= addr &&
+   addr + size <= zone->addr + zone->size)
+   return zone;
+   }
+   return NULL;
+}
+
+static struct kvm_excluded_mmio_zone *
+kvm_mmio_find_excluded(struct kvm_delayed_mmio_zone *zone, u64 addr, u32 size)
+{
+   struct kvm_excluded_mmio_zone *excluded;
+   int i;
+
+   addr -= zone->addr;
+   for (i = 0; i < zone->nb_excluded_zones; i++) {
+   excluded = &zone->excluded[i];
+
+   if ((excluded->offset <= addr &&
+addr < excluded->offset + excluded->size) ||
+(excluded->offset < addr + size &&
+ addr + size <= excluded->offset +
+   excluded->size))
+   return excluded;
+   }
+   return NULL;
+}
+
+static int kvm_is_delayed_mmio(struct kvm *kvm, u64 addr, u32 size)
+{
+   struct kvm_delayed_mmio_zone *zone;
+   struct kvm_excluded_mmio_zone *excluded;
+
+   zone = kvm_mmio_find_zone(kvm, addr, size);
+   if (zone == NULL)
+   return 0;   /* not a delayed MMIO address */
+
+   excluded = kvm_mmio_find_excluded(zone, addr, size);
+   return excluded == NULL;
+}
+
+static int kvm_vm_ioctl_set_mmio(struct kvm *kvm,
+struct kvm_mmio_zone *zone)
+{
+   struct kvm_delayed_mmio_zone *z;
+
+   if (zone->is_delayed &&
+   kvm->arch.nb_mmio_zones >= KVM_MAX_DELAYED_MMIO_ZONE)
+   return -ENOMEM;
+
+   if (zone->is_delayed) {
+
+   /* already defined ? */
+
+   if (kvm_mmio_find_zone(kvm, zone->addr, 1) ||
+   kvm_mmio_find_zone(kvm, zone->addr + zone->size - 1, 1))
+   return 0;
+
+   z = &kvm->arch.mmio_zone[kvm->arch.nb_mmio_zones];
+   z->addr = zone->addr;
+   z->size = zone->size;
+   kvm->arch.nb_mmio_zones++;
+   return 0;
+   }
+
+   /* exclude some parts of the delayed MMIO zone */
+
+   z = kvm_mmio_find_zone(kvm, zone->addr, zone->size);
+   if (z == NULL)
+   return -EINVAL;
+
+   if (z->nb_excluded_zones >= KVM_MAX_EXCLUDED_MMIO_ZONE)
+   return -ENOMEM;
+
+   if (kvm_mmio_find_excluded(z, zone->addr, 1) ||
+   kvm_mmio_find_excluded(z, zone->addr + zone->size - 1, 1))
+   return 0;
+
+   z->excluded[z->nb_excluded_zones].offset = zone->addr - z->addr;
+   z->excluded[z->nb_excluded_zones].size = zone->size;
+   z->nb_excluded_zones++;
+
+   return 0;
+}
+
 long kvm_arch_vm_ioctl(struct file *filp,
   unsigned int ioctl, unsigned long arg)
 {
@@ -1671,6 +1768,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
r = 0;
break;
}
+   case KVM_SET_MMIO: {
+   struct kvm_mmio_zone zone;
+   r = -EFAULT;
+   if (copy_from_user(&zone, argp, sizeof zone))
+   goto out;
+   r = -ENXIO;
+   r = kvm_vm_ioctl_set_mmio(kvm, &zone);
+   if (r)
+   goto out;
+   r = 0;
+   break;
+   }
default:
;
}
@@ -2706,6 +2815,52 @@ static void vapic_exit(struct kvm_vcpu *vcpu)
mark_page_dirty(vcpu->kvm, apic->vapic_addr >> PAGE_SHIFT);
 }
 
+static int batch_mmio(struct kvm_vcpu *vcpu)
+{
+   struct kvm_batch *batch = vcpu->kvm->arch.batch;
+   spinlock_t *lock = &vcpu->kvm->arch.batch_lock;
+   int next;
+
+   /* check if this MMIO can be delayed */
+
+   if (!kvm_is_delayed_mmio(vcpu->kvm,
+vcpu->mmio_phys_addr, vcpu->mmio_size))
+  

Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-2

2008-05-15 Thread Anthony Liguori
Avi Kivity wrote:
 Anthony Liguori wrote:
 FWIW, virtio-net is much better with my patches applied.

 The can_receive patches?

 Again, I'm not opposed to them in principle, I just think that if they 
 help that this points at a virtio deficiency.  Virtio should never 
 leave the rx queue empty.  Consider the case where the virtio queue 
 isn't tied to a socket buffer, but directly to hardware.

For RX performance:


right now
[  3]  0.0-10.0 sec  1016 MBytes   852 Mbits/sec

revert tap hack
[  3]  0.0-10.0 sec   564 MBytes   473 Mbits/sec

all patches applied
[  3]  0.0-10.0 sec  1.17 GBytes  1.01 Gbits/sec

drop lots of packets
[  3]  0.0-10.0 sec  1.05 GBytes   905 Mbits/sec


The last patch is not in my series but it basically makes the ring size 
512 and drops packets when we run out of descriptors.  That was to validate 
that we're not hiding a virtio deficiency.  The reason I want to buffer 
packets is that it avoids having to deal with tuning.   For 
vringfd/vmdq, we'll have to make sure to get the tuning right though.

Regards,

Anthony Liguori



[kvm-devel] [PATCH 0/2][RFC][v2] Batch writes to MMIO

2008-05-15 Thread Laurent Vivier

These two patches allow batching of writes to MMIO.

When the kernel has to send MMIO writes to userspace, it stores them
in memory until it has to hand control to userspace for another
reason. This avoids too many context switches for operations
that can wait.

These patches introduce an ioctl() to define which MMIO zones may be delayed.
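
The buffering can be pictured as a single-producer ring, sketched here in plain C under stated assumptions: RING_SLOTS, the struct layouts and the function names are illustrative and not taken from the patches, and locking between VCPUs is left out. The producer queues writes until the ring is full, and the consumer replays them in order when control next reaches it.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8            /* illustrative capacity, one slot stays free */

struct mmio_write {
    uint64_t phys_addr;
    uint32_t len;
    uint8_t data[8];
};

struct mmio_ring {
    uint32_t first;             /* next entry to consume */
    uint32_t last;              /* next free slot */
    struct mmio_write w[RING_SLOTS];
};

/* Producer: queue one write; false means "full, flush to the consumer now". */
static bool ring_push(struct mmio_ring *r, uint64_t addr,
                      const void *data, uint32_t len)
{
    uint32_t next = (r->last + 1) % RING_SLOTS;

    if (next == r->first || len > sizeof r->w[0].data)
        return false;
    r->w[r->last].phys_addr = addr;
    r->w[r->last].len = len;
    memcpy(r->w[r->last].data, data, len);
    r->last = next;
    return true;
}

/* Consumer: take the oldest queued write; false when the ring is empty. */
static bool ring_pop(struct mmio_ring *r, struct mmio_write *out)
{
    if (r->first == r->last)
        return false;
    *out = r->w[r->first];
    r->first = (r->first + 1) % RING_SLOTS;
    return true;
}
```

Keeping one slot free lets first == last mean "empty" without a separate counter, at the cost of one unused entry.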

I ran some benchmarks with iperf and e1000:

average on 10 runs

WITH            WITHOUT
PATCH           PATCH

257.2 MB/s      193.7 MB/s      33% faster

I've measured host_state_reload on WinXP boot:

WITH            WITHOUT
PATCH           PATCH

561397          739708          24% less

I've measured host_state_reload on a VGA text scroll:

WITH            WITHOUT
PATCH           PATCH

3976242         13779849        70% less...
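
The percentages follow from a simple relative change, sketched here with a made-up helper name and reading the third pair of figures as 3976242 and 13779849.

```c
/* Relative change of `with` versus `without`, in percent.
 * A negative result means a reduction. */
static double percent_change(double with, double without)
{
    return (with - without) / without * 100.0;
}
```

For the three measurements this gives roughly +32.8, -24.1 and -71.1, in line with the rounded figures quoted.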

[PATCH 1/2] kvm: Batch writes to MMIO
- kernel part

[PATCH 2/2] kvm-userspace: Batch writes to MMIO
- userspace part

Signed-off-by: Laurent Vivier [EMAIL PROTECTED]



[kvm-devel] [PATCH 2/2][RFC][v2] kvm-userspace: Batch writes to MMIO

2008-05-15 Thread Laurent Vivier
This patch is the userspace part of the batch writes to MMIO patch.

It defines delayed MMIO zones using kvm_set_mmio() (for VGA and e1000).
It empties the ring buffer and processes the MMIO accesses.

Signed-off-by: Laurent Vivier [EMAIL PROTECTED]
---
 libkvm/libkvm-x86.c  |   18 ++
 libkvm/libkvm.c  |   13 +
 libkvm/libkvm.h  |2 ++
 qemu/hw/cirrus_vga.c |2 ++
 qemu/hw/e1000.c  |8 
 qemu/hw/vga.c|4 
 qemu/qemu-kvm.c  |6 ++
 qemu/qemu-kvm.h  |2 ++
 8 files changed, 55 insertions(+), 0 deletions(-)

diff --git a/libkvm/libkvm-x86.c b/libkvm/libkvm-x86.c
index d46fdcc..911e079 100644
--- a/libkvm/libkvm-x86.c
+++ b/libkvm/libkvm-x86.c
@@ -391,6 +391,24 @@ int kvm_set_pit(kvm_context_t kvm, struct kvm_pit_state *s)
 
 #endif
 
+int kvm_set_mmio(kvm_context_t kvm,
+uint8_t is_delayed, uint64_t addr, uint32_t size)
+{
+   struct kvm_mmio_zone zone;
+   int r;
+
+   zone.is_delayed = is_delayed;
+   zone.addr = addr;
+   zone.size = size;
+
+   r = ioctl(kvm->vm_fd, KVM_SET_MMIO, &zone);
+   if (r == -1) {
+   r = -errno;
+   perror("kvm_set_mmio");
+   }
+   return r;
+}
+
 void kvm_show_code(kvm_context_t kvm, int vcpu)
 {
 #define SHOW_CODE_LEN 50
diff --git a/libkvm/libkvm.c b/libkvm/libkvm.c
index d1e95a4..b891630 100644
--- a/libkvm/libkvm.c
+++ b/libkvm/libkvm.c
@@ -861,6 +861,9 @@ int kvm_run(kvm_context_t kvm, int vcpu)
int r;
int fd = kvm->vcpu_fd[vcpu];
struct kvm_run *run = kvm->run[vcpu];
+#if defined(__x86_64__) || defined(__i386__)
+   struct kvm_batch *batch = (void *)run + 2 * PAGE_SIZE;
+#endif
 
 again:
if (!kvm-irqchip_in_kernel)
@@ -879,6 +882,16 @@ again:
 
post_kvm_run(kvm, vcpu);
 
+#if defined(__x86_64__) || defined(__i386__)
+   while (batch->first != batch->last) {
+   kvm->callbacks->mmio_write(kvm->opaque,
+  batch->mmio[batch->first].phys_addr,
+  &batch->mmio[batch->first].data[0],
+  batch->mmio[batch->first].len);
+   batch->first = (batch->first + 1) % KVM_MAX_BATCH;
+   }
+#endif
+
if (r == -1) {
r = handle_io_window(kvm);
goto more;
diff --git a/libkvm/libkvm.h b/libkvm/libkvm.h
index 31c0d59..1f453e1 100644
--- a/libkvm/libkvm.h
+++ b/libkvm/libkvm.h
@@ -448,6 +448,8 @@ int kvm_get_dirty_pages_range(kvm_context_t kvm, unsigned long phys_addr,
  unsigned long end_addr, void *buf, void*opaque,
  int (*cb)(unsigned long start, unsigned long len,
void*bitmap, void *opaque));
+int kvm_set_mmio(kvm_context_t kvm,
+uint8_t is_delayed, uint64_t addr, uint32_t size);
 
 /*!
  * \brief Create a memory alias
diff --git a/qemu/hw/cirrus_vga.c b/qemu/hw/cirrus_vga.c
index 2c4aeec..4ef8085 100644
--- a/qemu/hw/cirrus_vga.c
+++ b/qemu/hw/cirrus_vga.c
@@ -3291,6 +3291,8 @@ static void cirrus_init_common(CirrusVGAState * s, int device_id, int is_pci)
cirrus_vga_mem_write, s);
 cpu_register_physical_memory(isa_mem_base + 0x000a0000, 0x20000,
  vga_io_memory);
+if (kvm_enabled())
+qemu_kvm_set_mmio(1, isa_mem_base + 0x000a0000, 0x20000);
 
 s->sr[0x06] = 0x0f;
 if (device_id == CIRRUS_ID_CLGD5446) {
diff --git a/qemu/hw/e1000.c b/qemu/hw/e1000.c
index 0728539..d223631 100644
--- a/qemu/hw/e1000.c
+++ b/qemu/hw/e1000.c
@@ -26,6 +26,7 @@
 #include "hw.h"
 #include "pci.h"
 #include "net.h"
+#include "qemu-kvm.h"
 
 #include "e1000_hw.h"
 
@@ -938,6 +939,13 @@ e1000_mmio_map(PCIDevice *pci_dev, int region_num,
 
 d->mmio_base = addr;
 cpu_register_physical_memory(addr, PNPMMIO_SIZE, d->mmio_index);
+
+if (kvm_enabled()) {
+qemu_kvm_set_mmio(1, addr, PNPMMIO_SIZE);
+qemu_kvm_set_mmio(0, addr + E1000_TCTL, 4);
+qemu_kvm_set_mmio(0, addr + E1000_TDT, 4);
+qemu_kvm_set_mmio(0, addr + E1000_ICR, 4);
+}
 }
 
 static int
diff --git a/qemu/hw/vga.c b/qemu/hw/vga.c
index 3a49573..844c2a7 100644
--- a/qemu/hw/vga.c
+++ b/qemu/hw/vga.c
@@ -2257,6 +2257,8 @@ void vga_init(VGAState *s)
 vga_io_memory = cpu_register_io_memory(0, vga_mem_read, vga_mem_write, s);
 cpu_register_physical_memory(isa_mem_base + 0x000a0000, 0x20000,
  vga_io_memory);
+if (kvm_enabled())
+   qemu_kvm_set_mmio(1, isa_mem_base + 0x000a0000, 0x20000);
 }
 
 /* Memory mapped interface */
@@ -2332,6 +2334,8 @@ static void vga_mm_init(VGAState *s, target_phys_addr_t vram_base,
 cpu_register_physical_memory(ctrl_base, 0x100000, s_ioport_ctrl);
 s->bank_offset = 0;
 cpu_register_physical_memory(vram_base + 0x000a0000, 0x20000, vga_io_memory);
+if 

[kvm-devel] [PATCH 0/13] New shot at QEMUAccel

2008-05-15 Thread Glauber Costa
Hi guys,

This is a new version of the QEMUAccel work. To start with, I decided
to keep the name for now. We don't have enough functions that are not
cpu-related to justify splitting the structure into several. Plus, this is
one of the less confusing names we came up with.

The code I'm posting is tested with kqemu for both i386 and x86_64, and it
works. So, if you guys feel like it, I can say it's ready for inclusion
(which obviously does not mean bug-free). It is not complete, however.

There are still some pieces of kqemu code that do not work, especially the
interrupt code in cpu-exec.c, which relies on the tricky longjmp.

Comments are very welcome.





[kvm-devel] [PATCH 03/13] [PATCH] introduce QEMUAccel and fill it with interrupt specific driver

2008-05-15 Thread Glauber Costa
This patch introduces QEMUAccel, a placeholder for function pointers
that aims to help qemu abstract accelerators such as kqemu and
kvm (actually, the 'accelerator' name was proposed by Avi Kivity, since
he loves referring to kvm that way).

To begin with, the accelerator is given the opportunity to register a
cpu_interrupt function, to be called after the raw cpu_interrupt.
For the kqemu accelerator, this has the side effect of calling
kqemu_cpu_interrupt every time, which didn't use to happen. But looking
at the code, this seems safe to me.

This patch applies on raw qemu.
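
For readers who want the shape of the abstraction without reading the diff, here is a standalone sketch of the same pattern: a table of optional function pointers, one global "current" table, and a wrapper that does nothing when no accelerator (or no hook) is registered. CPUState is reduced to a two-field stand-in, and demo_accel and demo_cpu_interrupt are hypothetical; only the QEMUAccel layout and the wrapper names follow the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for qemu's CPUState, reduced to what the sketch needs. */
typedef struct CPUState {
    int interrupt_request;
    int kicks;                       /* times the accelerator was notified */
} CPUState;

typedef struct QEMUAccel {
    void (*cpu_interrupt)(CPUState *env);
} QEMUAccel;

static QEMUAccel *current_accel;     /* NULL means no accelerator */

static void register_qemu_accel(QEMUAccel *accel)
{
    assert(accel != NULL);
    current_accel = accel;
}

/* Safe to call unconditionally: a missing accelerator or hook is a no-op. */
static void accel_cpu_interrupt(CPUState *env)
{
    if (current_accel && current_accel->cpu_interrupt)
        current_accel->cpu_interrupt(env);
}

/* Generic code path: do the common work, then give the accelerator a turn. */
static void cpu_interrupt(CPUState *env, int mask)
{
    env->interrupt_request |= mask;
    accel_cpu_interrupt(env);
}

/* Hypothetical accelerator that only counts notifications. */
static void demo_cpu_interrupt(CPUState *env)
{
    env->kicks++;
}

static QEMUAccel demo_accel = {
    .cpu_interrupt = demo_cpu_interrupt,
};
```

The call sites no longer need #ifdef blocks: they call the wrapper, and the NULL checks decide at run time whether anything accelerator-specific happens.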
---
 block-raw-posix.c |5 -
 exec-all.h|   18 +-
 exec.c|2 ++
 kqemu.c   |   27 +--
 vl.c  |6 +-
 5 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/block-raw-posix.c b/block-raw-posix.c
index 6b0009e..61c23ba 100644
--- a/block-raw-posix.c
+++ b/block-raw-posix.c
@@ -250,11 +250,6 @@ static void aio_signal_handler(int signum)
 if (env) {
 /* stop the currently executing cpu because a timer occured */
 cpu_interrupt(env, CPU_INTERRUPT_EXIT);
-#ifdef USE_KQEMU
-if (env->kqemu_enabled) {
-kqemu_cpu_interrupt(env);
-}
-#endif
 }
 #endif
 }
diff --git a/exec-all.h b/exec-all.h
index 8c32858..7b2d97d 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -578,6 +578,23 @@ static inline target_ulong get_phys_addr_code(CPUState *env1, target_ulong addr)
 }
 #endif
 
+typedef struct QEMUAccel {
+void (*cpu_interrupt)(CPUState *env);
+} QEMUAccel;
+
+extern QEMUAccel *current_accel;
+
+static inline void register_qemu_accel(QEMUAccel *accel)
+{
+current_accel = accel;
+}
+
+static inline void accel_cpu_interrupt(CPUState *env)
+{
+if (current_accel && current_accel->cpu_interrupt)
+current_accel->cpu_interrupt(env);
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
@@ -587,7 +604,6 @@ void kqemu_flush_page(CPUState *env, target_ulong addr);
 void kqemu_flush(CPUState *env, int global);
 void kqemu_set_notdirty(CPUState *env, ram_addr_t ram_addr);
 void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr);
-void kqemu_cpu_interrupt(CPUState *env);
 void kqemu_record_dump(void);
 
 static inline int kqemu_is_ok(CPUState *env)
diff --git a/exec.c b/exec.c
index dfedfc3..73360d3 100644
--- a/exec.c
+++ b/exec.c
@@ -1256,6 +1256,8 @@ void cpu_interrupt(CPUState *env, int mask)
 tb_reset_jump_recursive(tb);
 resetlock(interrupt_lock);
 }
+
+accel_cpu_interrupt(env);
 }
 
 void cpu_reset_interrupt(CPUState *env, int mask)
diff --git a/kqemu.c b/kqemu.c
index 0e38d52..f875e0e 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -159,6 +159,8 @@ static void kqemu_update_cpuid(CPUState *env)
accelerated code */
 }
 
+QEMUAccel kqemu_accel;
+
 int kqemu_start(void)
 {
 struct kqemu_init init;
@@ -240,6 +242,7 @@ int kqemu_start(void)
 }
 nb_pages_to_flush = 0;
 nb_ram_pages_to_update = 0;
+register_qemu_accel(&kqemu_accel);
 return 0;
 }
 
@@ -249,6 +252,20 @@ void kqemu_init_env(CPUState *env)
 env->kqemu_enabled = kqemu_allowed;
 }
 
+void kqemu_cpu_interrupt(CPUState *env)
+{
+#if defined(_WIN32) && KQEMU_VERSION >= 0x010101
+/* cancelling the I/O request causes KQEMU to finish executing the
+   current block and successfully returning. */
+CancelIo(kqemu_fd);
+#endif
+}
+
+QEMUAccel kqemu_accel = {
+.cpu_interrupt = kqemu_cpu_interrupt,
+};
+
+
 void kqemu_flush_page(CPUState *env, target_ulong addr)
 {
 #if defined(DEBUG)
@@ -906,14 +923,4 @@ int kqemu_cpu_exec(CPUState *env)
 }
 return 0;
 }
-
-void kqemu_cpu_interrupt(CPUState *env)
-{
-#if defined(_WIN32) && KQEMU_VERSION >= 0x010101
-/* cancelling the I/O request causes KQEMU to finish executing the
-   current block and successfully returning. */
-CancelIo(kqemu_fd);
-#endif
-}
-
 #endif
diff --git a/vl.c b/vl.c
index 5999b37..26c1677 100644
--- a/vl.c
+++ b/vl.c
@@ -239,6 +239,7 @@ struct drive_opt {
 static CPUState *cur_cpu;
 static CPUState *next_cpu;
 static int event_pending = 1;
+QEMUAccel *current_accel;
 
 #define TFR(expr) do { if ((expr) != -1) break; } while (errno == EINTR)
 
@@ -1199,11 +1200,6 @@ static void host_alarm_handler(int host_signum)
 if (env) {
 /* stop the currently executing cpu because a timer occured */
 cpu_interrupt(env, CPU_INTERRUPT_EXIT);
-#ifdef USE_KQEMU
-if (env->kqemu_enabled) {
-kqemu_cpu_interrupt(env);
-}
-#endif
 }
 event_pending = 1;
 }
-- 
1.5.5



Re: [kvm-devel] pinning, tsc and apic

2008-05-15 Thread Anthony Liguori
Chris Wright wrote:
> * Anthony Liguori ([EMAIL PROTECTED]) wrote:
>> From a quick look, I suspect that the number of wildly off TSC
>> calibrations correspond to the VMs that are misbehaving.  I think this
>> may mean that we have to re-examine the tsc delta computation.
>>
>> 10_serial.log:time.c: Detected 1995.038 MHz processor.
>> 11_serial.log:time.c: Detected 2363.195 MHz processor.
>> 12_serial.log:time.c: Detected 2492.675 MHz processor.
>> 13_serial.log:time.c: Detected 1995.061 MHz processor.
>> 14_serial.log:time.c: Detected 1994.917 MHz processor.
>> 15_serial.log:time.c: Detected 4100.735 MHz processor.
>> 16_serial.log:time.c: Detected 2075.800 MHz processor.
>> 17_serial.log:time.c: Detected 2674.350 MHz processor.
>> 18_serial.log:time.c: Detected 1995.002 MHz processor.
>> 19_serial.log:time.c: Detected 1994.978 MHz processor.
>> 1_serial.log:time.c: Detected 4384.310 MHz processor.
>
> Is this with pinning?  We at least know we're losing small bits on
> migration.  From my measurements it's ~3000 (outliers are 10-20k).
>
> Also, what happens if you roll back to kvm-userspace 7f5c4d15ece5?
>
> I'm using this:
>
> diff -up arch/x86/kvm/svm.c~svm arch/x86/kvm/svm.c
> --- arch/x86/kvm/svm.c~svm  2008-04-16 19:49:44.0 -0700
> +++ arch/x86/kvm/svm.c  2008-05-14 23:44:18.0 -0700
> @@ -621,6 +621,13 @@ static void svm_free_vcpu(struct kvm_vcp
>      kmem_cache_free(kvm_vcpu_cache, svm);
>  }
>
> +static void svm_tsc_update(void *arg)
> +{
> +    struct vcpu_svm *svm = arg;
> +    rdtscll(svm->vcpu.arch.host_tsc);
> +
> +}
> +
>  static void svm_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>  {
>      struct vcpu_svm *svm = to_svm(vcpu);
> @@ -633,6 +640,9 @@ static void svm_vcpu_load(struct kvm_vcp
>       * Make sure that the guest sees a monotonically
>       * increasing TSC.
>       */
> +    if (vcpu->cpu != -1)
> +        smp_call_function_single(vcpu->cpu, svm_tsc_update,
> +                                 svm, 0, 1);

I like this approach because of its simplicity although the IPI is not 
wonderful.  I was also thinking of using cpu_clock() to take a timestamp 
on vcpu_put, then on vcpu_load, take another timestamp and use the 
cyc2ns conversion to try and estimate the elapsed tsc ticks on the new cpu.
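
The arithmetic behind that second idea can be sketched in plain C. This is only an illustration of the estimate, not kernel code: the timestamps and TSC values are passed in as arguments instead of being read with cpu_clock() or rdtscll(), and the names vcpu_tsc, ns_to_tsc_ticks, tsc_on_put and tsc_on_load are hypothetical. It assumes the guest sees host TSC plus tsc_offset and that the new cpu's TSC frequency is known in kHz.

```c
#include <assert.h>
#include <stdint.h>

/* Convert elapsed nanoseconds to TSC ticks for a TSC running at tsc_khz.
 * ticks = ns * khz / 10^6, split in two parts to avoid 64-bit overflow. */
static uint64_t ns_to_tsc_ticks(uint64_t ns, uint32_t tsc_khz)
{
    assert(tsc_khz != 0);
    return (ns / 1000000) * tsc_khz + (ns % 1000000) * tsc_khz / 1000000;
}

struct vcpu_tsc {
    uint64_t host_tsc_at_put;        /* host TSC when the vcpu was descheduled */
    uint64_t ns_at_put;              /* wall-clock style timestamp at that time */
    int64_t  tsc_offset;             /* guest TSC = host TSC + tsc_offset */
};

/* Record both clocks when the vcpu leaves a cpu. */
static void tsc_on_put(struct vcpu_tsc *v, uint64_t host_tsc, uint64_t now_ns)
{
    v->host_tsc_at_put = host_tsc;
    v->ns_at_put = now_ns;
}

/* On load, estimate where the old cpu's TSC would be by now and adjust the
 * offset so the guest value keeps advancing by the elapsed time. */
static void tsc_on_load(struct vcpu_tsc *v, uint64_t new_host_tsc,
                        uint64_t now_ns, uint32_t tsc_khz)
{
    uint64_t elapsed = ns_to_tsc_ticks(now_ns - v->ns_at_put, tsc_khz);
    uint64_t expected = v->host_tsc_at_put + elapsed;

    v->tsc_offset += (int64_t)(expected - new_host_tsc);
}
```

Compared with the IPI approach, the accuracy here depends on how good the nanosecond clock and the frequency value are, so small errors would still accumulate across migrations.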

Regards,

Anthony Liguori

>      rdtscll(tsc_this);
>      delta = vcpu->arch.host_tsc - tsc_this;
>      svm->vmcb->control.tsc_offset += delta;





[kvm-devel] [PATCH 02/13] [PATCH] split kqemu_init into two

2008-05-15 Thread Glauber Costa
We separate kqemu_init() into a part that depends on env
and another that does not. The latter can be initialized earlier.
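
The split can be pictured with the following reduced sketch. accel_start and accel_init_env are hypothetical stand-ins for kqemu_start() and kqemu_init_env(); opening the device is faked, and CPUState is a one-field stand-in. The fallback to "disabled" when the global part has not run is an assumption of this sketch, not something the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for qemu's CPUState. */
typedef struct CPUState {
    int accel_enabled;
} CPUState;

static int accel_fd = -1;            /* -1 until accel_start() succeeds */
static int accel_allowed = 1;        /* global policy knob */

/* Env-independent part: runs once, before any CPU has been created. */
static int accel_start(void)
{
    if (accel_fd != -1)
        return 0;                    /* already started */
    accel_fd = 3;                    /* pretend the device was opened */
    return 0;
}

/* Env-dependent part: runs once per CPU and only touches that CPU. */
static void accel_init_env(CPUState *env)
{
    assert(env != NULL);
    env->accel_enabled = (accel_fd != -1) ? accel_allowed : 0;
}
```

Because the global part no longer needs an env, it can be called from a one-time setup function, while the per-CPU part stays in the CPU creation path.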
---
 exec.c|3 +++
 kqemu.c   |   10 +++---
 target-i386/helper2.c |2 +-
 3 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index 5384460..dfedfc3 100644
--- a/exec.c
+++ b/exec.c
@@ -334,6 +334,9 @@ void exec_init(void)
 code_gen_ptr = code_gen_buffer;
 page_init();
 io_mem_init();
+#ifdef USE_KQEMU
+kqemu_start();
+#endif
 }
 
 void cpu_exec_init(CPUState *env)
diff --git a/kqemu.c b/kqemu.c
index 88592ee..0e38d52 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -159,7 +159,7 @@ static void kqemu_update_cpuid(CPUState *env)
accelerated code */
 }
 
-int kqemu_init(CPUState *env)
+int kqemu_start(void)
 {
 struct kqemu_init init;
 int ret, version;
@@ -238,13 +238,17 @@ int kqemu_init(CPUState *env)
 kqemu_fd = KQEMU_INVALID_FD;
 return -1;
 }
-kqemu_update_cpuid(env);
-env->kqemu_enabled = kqemu_allowed;
 nb_pages_to_flush = 0;
 nb_ram_pages_to_update = 0;
 return 0;
 }
 
+void kqemu_init_env(CPUState *env)
+{
+kqemu_update_cpuid(env);
+env->kqemu_enabled = kqemu_allowed;
+}
+
 void kqemu_flush_page(CPUState *env, target_ulong addr)
 {
 #if defined(DEBUG)
diff --git a/target-i386/helper2.c b/target-i386/helper2.c
index 6cf218f..1c0fcdb 100644
--- a/target-i386/helper2.c
+++ b/target-i386/helper2.c
@@ -113,7 +113,7 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
 }
 cpu_reset(env);
 #ifdef USE_KQEMU
-kqemu_init(env);
+kqemu_init_env(env);
 #endif
 return env;
 }
-- 
1.5.5




[kvm-devel] [PATCH 01/13] [PATCH] make cpu_exec_init symmetric

2008-05-15 Thread Glauber Costa
We move all the code that needs to be executed only on cpu0
out of cpu_exec_init() and into exec_init(). It is executed
before machine_init(), and only once. With this change,
cpu_exec_init() is completely symmetric.
---
 exec-all.h |1 +
 exec.c |   15 +--
 vl.c   |1 +
 3 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index d8c6c33..8c32858 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -82,6 +82,7 @@ int cpu_restore_state_copy(struct TranslationBlock *tb,
void *puc);
 void cpu_resume_from_signal(CPUState *env1, void *puc);
 void cpu_exec_init(CPUState *env);
+void exec_init(void);
 int page_unprotect(target_ulong address, unsigned long pc, void *puc);
 void tb_invalidate_phys_page_range(target_phys_addr_t start, target_phys_addr_t end,
int is_cpu_write_access);
diff --git a/exec.c b/exec.c
index 2fd0078..5384460 100644
--- a/exec.c
+++ b/exec.c
@@ -327,17 +327,20 @@ static void tlb_unprotect_code_phys(CPUState *env, ram_addr_t ram_addr,
 target_ulong vaddr);
 #endif
 
+/* Must be called once before any of attempts to call cpu_init */
+void exec_init(void)
+{
+cpu_gen_init();
+code_gen_ptr = code_gen_buffer;
+page_init();
+io_mem_init();
+}
+
 void cpu_exec_init(CPUState *env)
 {
 CPUState **penv;
 int cpu_index;
 
-if (!code_gen_ptr) {
-cpu_gen_init();
-code_gen_ptr = code_gen_buffer;
-page_init();
-io_mem_init();
-}
 env->next_cpu = NULL;
 penv = first_cpu;
 cpu_index = 0;
diff --git a/vl.c b/vl.c
index 67712f0..5999b37 100644
--- a/vl.c
+++ b/vl.c
@@ -8576,6 +8576,7 @@ int main(int argc, char **argv)
 }
 }
 
+exec_init();
 machine->init(ram_size, vga_ram_size, boot_devices, ds,
   kernel_filename, kernel_cmdline, initrd_filename, cpu_model);
 
-- 
1.5.5




[kvm-devel] [PATCH 04/13] [PATCH] init env made accel driver

2008-05-15 Thread Glauber Costa
---
 exec-all.h|8 +++-
 kqemu.c   |1 +
 target-i386/helper2.c |4 +---
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 7b2d97d..9e211dc 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -580,6 +580,7 @@ static inline target_ulong get_phys_addr_code(CPUState *env1, target_ulong addr)
 
 typedef struct QEMUAccel {
 void (*cpu_interrupt)(CPUState *env);
+void (*init_env)(CPUState *env);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -595,10 +596,15 @@ static inline void accel_cpu_interrupt(CPUState *env)
 current_accel->cpu_interrupt(env);
 }
 
+static inline void accel_init_env(CPUState *env)
+{
+if (current_accel && current_accel->init_env)
+current_accel->init_env(env);
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
-int kqemu_init(CPUState *env);
 int kqemu_cpu_exec(CPUState *env);
 void kqemu_flush_page(CPUState *env, target_ulong addr);
 void kqemu_flush(CPUState *env, int global);
diff --git a/kqemu.c b/kqemu.c
index f875e0e..e0422de 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -263,6 +263,7 @@ void kqemu_cpu_interrupt(CPUState *env)
 
 QEMUAccel kqemu_accel = {
 .cpu_interrupt = kqemu_cpu_interrupt,
+.init_env = kqemu_init_env,
 };
 
 
diff --git a/target-i386/helper2.c b/target-i386/helper2.c
index 1c0fcdb..b633814 100644
--- a/target-i386/helper2.c
+++ b/target-i386/helper2.c
@@ -112,9 +112,7 @@ CPUX86State *cpu_x86_init(const char *cpu_model)
 return NULL;
 }
 cpu_reset(env);
-#ifdef USE_KQEMU
-kqemu_init_env(env);
-#endif
+accel_init_env(env);
 return env;
 }
 
-- 
1.5.5




[kvm-devel] [PATCH 11/13] [PATCH] wrap modify_page through accel calls

2008-05-15 Thread Glauber Costa
---
 exec-all.h |8 +++-
 exec.c |   24 +---
 kqemu.c|   26 +++---
 3 files changed, 31 insertions(+), 27 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index ed96a22..04112e0 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -586,6 +586,7 @@ typedef struct QEMUAccel {
 int (*info)(CPUState *env, char *buf);
 int (*profile)(CPUState *env, char *buf);
 void (*set_notdirty)(ram_addr_t addr);
+void (*modify_page)(ram_addr_t addr, int dirty_flags);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -639,11 +640,16 @@ static inline void accel_set_notdirty(target_ulong addr)
 current_accel->set_notdirty(addr);
 }
 
+static inline void accel_modify_page(target_ulong addr, int dirty_flags)
+{
+if (current_accel && current_accel->modify_page)
+current_accel->modify_page(addr, dirty_flags);
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
 int kqemu_cpu_exec(CPUState *env);
-void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr);
 void kqemu_record_dump(void);
 
 static inline int kqemu_is_ok(CPUState *env)
diff --git a/exec.c b/exec.c
index 6d05f75..92f1552 100644
--- a/exec.c
+++ b/exec.c
@@ -2185,11 +2185,9 @@ static void notdirty_mem_writeb(void *opaque, target_phys_addr_t addr, uint32_t
 #endif
 }
 stb_p((uint8_t *)(long)addr, val);
-#ifdef USE_KQEMU
-if (cpu_single_env->kqemu_enabled &&
-(dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
-kqemu_modify_page(cpu_single_env, ram_addr);
-#endif
+
+accel_modify_page(ram_addr, dirty_flags);
+
 dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
 phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
 /* we remove the notdirty callback only if the code has been
@@ -2211,11 +2209,9 @@ static void notdirty_mem_writew(void *opaque, target_phys_addr_t addr, uint32_t
 #endif
 }
 stw_p((uint8_t *)(long)addr, val);
-#ifdef USE_KQEMU
-if (cpu_single_env->kqemu_enabled &&
-(dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
-kqemu_modify_page(cpu_single_env, ram_addr);
-#endif
+
+accel_modify_page(ram_addr, dirty_flags);
+
 dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
 phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
 /* we remove the notdirty callback only if the code has been
@@ -2237,11 +2233,9 @@ static void notdirty_mem_writel(void *opaque, target_phys_addr_t addr, uint32_t
 #endif
 }
 stl_p((uint8_t *)(long)addr, val);
-#ifdef USE_KQEMU
-if (cpu_single_env->kqemu_enabled &&
-(dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
-kqemu_modify_page(cpu_single_env, ram_addr);
-#endif
+
+accel_modify_page(ram_addr, dirty_flags);
+
 dirty_flags |= (0xff & ~CODE_DIRTY_FLAG);
 phys_ram_dirty[ram_addr >> TARGET_PAGE_BITS] = dirty_flags;
 /* we remove the notdirty callback only if the code has been
diff --git a/kqemu.c b/kqemu.c
index 44c1a55..7e24bb7 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -358,16 +358,6 @@ void kqemu_set_notdirty(ram_addr_t ram_addr)
 ram_pages_to_update[nb_ram_pages_to_update++] = ram_addr;
 }
 
-QEMUAccel kqemu_accel = {
-.cpu_interrupt = kqemu_cpu_interrupt,
-.init_env = kqemu_init_env,
-.flush_cache = kqemu_flush,
-.flush_page = kqemu_flush_page,
-.info = kqemu_info,
-.profile = kqemu_profile,
-.set_notdirty = kqemu_set_notdirty,
-};
-
 static void kqemu_reset_modified_ram_pages(void)
 {
 int i;
@@ -380,7 +370,7 @@ static void kqemu_reset_modified_ram_pages(void)
 nb_modified_ram_pages = 0;
 }
 
-void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr)
+void kqemu_modify_page(ram_addr_t ram_addr, int dirty_flags)
 {
 unsigned long page_index;
 int ret;
@@ -388,6 +378,8 @@ void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr)
 DWORD temp;
 #endif
 
+if ((dirty_flags & KQEMU_MODIFY_PAGE_MASK) != KQEMU_MODIFY_PAGE_MASK)
+return;
 page_index = ram_addr >> TARGET_PAGE_BITS;
 if (!modified_ram_pages_table[page_index]) {
 #if 0
@@ -411,6 +403,18 @@ void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr)
 }
 }
 
+QEMUAccel kqemu_accel = {
+.cpu_interrupt = kqemu_cpu_interrupt,
+.init_env = kqemu_init_env,
+.flush_cache = kqemu_flush,
+.flush_page = kqemu_flush_page,
+.info = kqemu_info,
+.profile = kqemu_profile,
+.set_notdirty = kqemu_set_notdirty,
+.modify_page = kqemu_modify_page,
+};
+
+
 struct fpstate {
 uint16_t fpuc;
 uint16_t dummy1;
-- 
1.5.5




[kvm-devel] [PATCH 07/13] [PATCH] separate accelerator part of info profiler

2008-05-15 Thread Glauber Costa
---
 exec-all.h |8 
 kqemu.c|   35 +++
 monitor.c  |   27 ++-
 3 files changed, 49 insertions(+), 21 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index f1bd7ae..689973d 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -584,6 +584,7 @@ typedef struct QEMUAccel {
 void (*flush_cache)(CPUState *env, int global);
 void (*flush_page)(CPUState *env, target_ulong addr);
 int (*info)(CPUState *env, char *buf);
+int (*profile)(CPUState *env, char *buf);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -624,6 +625,13 @@ static inline int accel_info(CPUState *env, char *buf)
 return 0;
 }
 
+static inline int accel_profile(CPUState *env, char *buf)
+{
+if (current_accel && current_accel->profile)
+   return current_accel->profile(env, buf);
+return 0;
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
diff --git a/kqemu.c b/kqemu.c
index 451d1d4..6d46dfb 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -51,6 +51,10 @@
 #include <fcntl.h>
 #include "kqemu.h"
 
+#ifdef CONFIG_PROFILER
+#include "qemu-timer.h" /* for ticks_per_sec */
+#endif
+
 /* compatibility stuff */
 #ifndef KQEMU_RET_SYSCALL
 #define KQEMU_RET_SYSCALL   0x0300 /* syscall insn */
@@ -307,12 +311,43 @@ int kqemu_info(CPUState *env, char *buf)
 return len;
 }
 
+int64_t kqemu_time;
+int64_t kqemu_exec_count;
+int64_t kqemu_ret_int_count;
+int64_t kqemu_ret_excp_count;
+int64_t kqemu_ret_intr_count;
+extern int64_t qemu_time;
+
+int kqemu_profile(CPUState *env, char *buf)
+{
+int len = 0;
+#ifdef CONFIG_PROFILER
+len = sprintf(buf, "kqemu time  %" PRId64 " (%0.3f %0.1f%%) count=%" PRId64
+ " int=%" PRId64 " excp=%" PRId64 " intr=%" PRId64 "\n",
+kqemu_time, kqemu_time / (double)ticks_per_sec,
+kqemu_time / qemu_time * 100.0,
+kqemu_exec_count,
+kqemu_ret_int_count,
+kqemu_ret_excp_count,
+kqemu_ret_intr_count);
+
+kqemu_time = 0;
+kqemu_exec_count = 0;
+kqemu_ret_int_count = 0;
+kqemu_ret_excp_count = 0;
+kqemu_ret_intr_count = 0;
+kqemu_record_dump();
+#endif
+return len;
+}
+
 QEMUAccel kqemu_accel = {
 .cpu_interrupt = kqemu_cpu_interrupt,
 .init_env = kqemu_init_env,
 .flush_cache = kqemu_flush,
 .flush_page = kqemu_flush_page,
 .info = kqemu_info,
+.profile = kqemu_profile,
 };
 
 
diff --git a/monitor.c b/monitor.c
index cb9faef..2ee5b0c 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1187,17 +1187,14 @@ static void do_info_accelerator(void)
 
 #ifdef CONFIG_PROFILER
 
-int64_t kqemu_time;
 int64_t qemu_time;
-int64_t kqemu_exec_count;
 int64_t dev_time;
-int64_t kqemu_ret_int_count;
-int64_t kqemu_ret_excp_count;
-int64_t kqemu_ret_intr_count;
-
 static void do_info_profile(void)
 {
 int64_t total;
+char buf[MAX_BUF];
+CPUState *env = mon_get_cpu();
+
 total = qemu_time;
 if (total == 0)
 total = 1;
@@ -1205,24 +1202,12 @@ static void do_info_profile(void)
 dev_time, dev_time / (double)ticks_per_sec);
 term_printf("qemu time   %" PRId64 " (%0.3f)\n",
 qemu_time, qemu_time / (double)ticks_per_sec);
-term_printf("kqemu time  %" PRId64 " (%0.3f %0.1f%%) count=%" PRId64 " int=%" PRId64 " excp=%" PRId64 " intr=%" PRId64 "\n",
-kqemu_time, kqemu_time / (double)ticks_per_sec,
-kqemu_time / (double)total * 100.0,
-kqemu_exec_count,
-kqemu_ret_int_count,
-kqemu_ret_excp_count,
-kqemu_ret_intr_count);
+if (accel_profile(env, buf))
+term_printf(buf);
 qemu_time = 0;
-kqemu_time = 0;
-kqemu_exec_count = 0;
 dev_time = 0;
-kqemu_ret_int_count = 0;
-kqemu_ret_excp_count = 0;
-kqemu_ret_intr_count = 0;
-#ifdef USE_KQEMU
-kqemu_record_dump();
-#endif
 }
+
 #else
 static void do_info_profile(void)
 {
-- 
1.5.5




[kvm-devel] [PATCH 08/13] [PATCH] move kqemu externs to kqemu.h

2008-05-15 Thread Glauber Costa
---
 cpu-all.h |5 -
 kqemu.h   |6 ++
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 7e77f76..5336a29 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -1053,14 +1053,9 @@ static inline int64_t profile_getclock(void)
 return cpu_get_real_ticks();
 }
 
-extern int64_t kqemu_time, kqemu_time_start;
 extern int64_t qemu_time, qemu_time_start;
 extern int64_t tlb_flush_time;
-extern int64_t kqemu_exec_count;
 extern int64_t dev_time;
-extern int64_t kqemu_ret_int_count;
-extern int64_t kqemu_ret_excp_count;
-extern int64_t kqemu_ret_intr_count;
 
 extern int64_t dyngen_tb_count1;
 extern int64_t dyngen_tb_count;
diff --git a/kqemu.h b/kqemu.h
index 7b43057..88156c1 100644
--- a/kqemu.h
+++ b/kqemu.h
@@ -26,6 +26,12 @@
 
 #define KQEMU_VERSION 0x010300
 
+extern int64_t kqemu_time, kqemu_time_start;
+extern int64_t kqemu_exec_count;
+extern int64_t kqemu_ret_int_count;
+extern int64_t kqemu_ret_excp_count;
+extern int64_t kqemu_ret_intr_count;
+
 struct kqemu_segment_cache {
 uint32_t selector;
 unsigned long base;
-- 
1.5.5




[kvm-devel] [PATCH 06/13] [PATCH] turn info kqemu into generic info accelerator

2008-05-15 Thread Glauber Costa
---
 exec-all.h |8 
 kqemu.c|   24 
 monitor.c  |   36 +---
 3 files changed, 45 insertions(+), 23 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index bfc6576..f1bd7ae 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -583,6 +583,7 @@ typedef struct QEMUAccel {
 void (*init_env)(CPUState *env);
 void (*flush_cache)(CPUState *env, int global);
 void (*flush_page)(CPUState *env, target_ulong addr);
+int (*info)(CPUState *env, char *buf);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -616,6 +617,13 @@ static inline void accel_flush_page(CPUState *env, target_ulong addr)
 current_accel->flush_page(env, addr);
 }
 
+static inline int accel_info(CPUState *env, char *buf)
+{
+if (current_accel && current_accel->info)
+return current_accel->info(env, buf);
+return 0;
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
diff --git a/kqemu.c b/kqemu.c
index 524c74d..451d1d4 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -284,11 +284,35 @@ void kqemu_flush(CPUState *env, int global)
 nb_pages_to_flush = KQEMU_FLUSH_ALL;
 }
 
+int kqemu_info(CPUState *env, char *buf)
+{
+int val, len;
+val = 0;
+val = env->kqemu_enabled;
+len = sprintf(buf, "kqemu support: ");
+buf += len;
+
+switch(val) {
+default:
+len += sprintf(buf, "present, but bogus value\n");
+break;
+case 1:
+len += sprintf(buf, "enabled for user code\n");
+break;
+case 2:
+len += sprintf(buf, "enabled for user and kernel code\n");
+break;
+}
+
+return len;
+}
+
 QEMUAccel kqemu_accel = {
 .cpu_interrupt = kqemu_cpu_interrupt,
 .init_env = kqemu_init_env,
 .flush_cache = kqemu_flush,
 .flush_page = kqemu_flush_page,
+.info = kqemu_info,
 };
 
 
diff --git a/monitor.c b/monitor.c
index 236b827..cb9faef 100644
--- a/monitor.c
+++ b/monitor.c
@@ -34,6 +34,7 @@
 #include "block.h"
 #include "audio/audio.h"
 #include "disas.h"
+#include "exec-all.h"
 #include <dirent.h>
 
 #ifdef CONFIG_PROFILER
@@ -1165,34 +1166,23 @@ static void mem_info(void)
 }
 #endif
 
-static void do_info_kqemu(void)
+#define MAX_BUF 1024
+static void do_info_accelerator(void)
 {
-#ifdef USE_KQEMU
+char buf[MAX_BUF];
 CPUState *env;
-int val;
-val = 0;
+
 env = mon_get_cpu();
+
 if (!env) {
 term_printf("No cpu initialized yet");
 return;
 }
-val = env->kqemu_enabled;
-term_printf("kqemu support: ");
-switch(val) {
-default:
-case 0:
-term_printf("disabled\n");
-break;
-case 1:
-term_printf("enabled for user code\n");
-break;
-case 2:
-term_printf("enabled for user and kernel code\n");
-break;
-}
-#else
-term_printf("kqemu support: not compiled\n");
-#endif
+
+if (accel_info(env, buf))
+term_printf(buf);
+else
+term_printf("No accelerator present\n");
 }
 
 #ifdef CONFIG_PROFILER
@@ -1422,8 +1412,8 @@ static term_cmd_t info_cmds[] = {
 #endif
 { "jit", "", do_info_jit,
   "", "show dynamic compiler info", },
-{ "kqemu", "", do_info_kqemu,
-  "", "show kqemu information", },
+{ "accelerator", "", do_info_accelerator,
+  "", "show accelerator information", },
 { "usb", "", usb_info,
   "", "show guest USB devices", },
 { "usbhost", "", usb_host_info,
-- 
1.5.5




[kvm-devel] [PATCH 10/13] [PATCH] set_notdirty goes through accel wrapper

2008-05-15 Thread Glauber Costa
---
 exec-all.h |8 +++-
 exec.c |   18 +++---
 kqemu.c|   23 +++
 3 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 689973d..ed96a22 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -585,6 +585,7 @@ typedef struct QEMUAccel {
 void (*flush_page)(CPUState *env, target_ulong addr);
 int (*info)(CPUState *env, char *buf);
 int (*profile)(CPUState *env, char *buf);
+void (*set_notdirty)(ram_addr_t addr);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -632,11 +633,16 @@ static inline int accel_profile(CPUState *env, char *buf)
 return 0;
 }
 
+static inline void accel_set_notdirty(target_ulong addr)
+{
+if (current_accel && current_accel->set_notdirty)
+current_accel->set_notdirty(addr);
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
 int kqemu_cpu_exec(CPUState *env);
-void kqemu_set_notdirty(CPUState *env, ram_addr_t ram_addr);
 void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr);
 void kqemu_record_dump(void);
 
diff --git a/exec.c b/exec.c
index 5b093a3..6d05f75 100644
--- a/exec.c
+++ b/exec.c
@@ -1531,18 +1531,14 @@ void cpu_physical_memory_reset_dirty(ram_addr_t start, ram_addr_t end,
     if (length == 0)
         return;
     len = length >> TARGET_PAGE_BITS;
-#ifdef USE_KQEMU
-    /* XXX: should not depend on cpu context */
-    env = first_cpu;
-    if (env->kqemu_enabled) {
-        ram_addr_t addr;
-        addr = start;
-        for(i = 0; i < len; i++) {
-            kqemu_set_notdirty(env, addr);
-            addr += TARGET_PAGE_SIZE;
-        }
+
+    ram_addr_t addr;
+    addr = start;
+    for(i = 0; i < len; i++) {
+        accel_set_notdirty(addr);
+        addr += TARGET_PAGE_SIZE;
     }
-#endif
+
     mask = ~dirty_flags;
     p = phys_ram_dirty + (start >> TARGET_PAGE_BITS);
     for(i = 0; i < len; i++)
diff --git a/kqemu.c b/kqemu.c
index 94366ec..44c1a55 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -342,18 +342,7 @@ int kqemu_profile(CPUState *env, char *buf)
 return len;
 }
 
-QEMUAccel kqemu_accel = {
-.cpu_interrupt = kqemu_cpu_interrupt,
-.init_env = kqemu_init_env,
-.flush_cache = kqemu_flush,
-.flush_page = kqemu_flush_page,
-.info = kqemu_info,
-.profile = kqemu_profile,
-};
-
-
-
-void kqemu_set_notdirty(CPUState *env, ram_addr_t ram_addr)
+void kqemu_set_notdirty(ram_addr_t ram_addr)
 {
 #ifdef DEBUG
 if (loglevel & CPU_LOG_INT) {
@@ -369,6 +358,16 @@ void kqemu_set_notdirty(CPUState *env, ram_addr_t ram_addr)
 ram_pages_to_update[nb_ram_pages_to_update++] = ram_addr;
 }
 
+QEMUAccel kqemu_accel = {
+.cpu_interrupt = kqemu_cpu_interrupt,
+.init_env = kqemu_init_env,
+.flush_cache = kqemu_flush,
+.flush_page = kqemu_flush_page,
+.info = kqemu_info,
+.profile = kqemu_profile,
+.set_notdirty = kqemu_set_notdirty,
+};
+
 static void kqemu_reset_modified_ram_pages(void)
 {
 int i;
-- 
1.5.5
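
For readers following the series, the wrapper pattern this patch extends can
be sketched in isolation. This is only an illustration under stated
assumptions, not the QEMU code: Accel, ram_addr_demo_t, demo_set_notdirty and
the *_demo helpers are hypothetical stand-ins for QEMUAccel, ram_addr_t,
kqemu_set_notdirty and the code in exec.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for ram_addr_t. */
typedef unsigned long ram_addr_demo_t;

/* Hypothetical stand-in for QEMUAccel: a table of optional hooks. */
typedef struct Accel {
    void (*set_notdirty)(ram_addr_demo_t addr);
} Accel;

/* NULL means no accelerator is active. */
static Accel *current_accel_demo = NULL;

/* Counts hook invocations so the effect can be observed. */
static int notdirty_calls = 0;

/* Hypothetical hook; the real one would queue the page for the accelerator. */
static void demo_set_notdirty(ram_addr_demo_t addr)
{
    (void)addr;
    notdirty_calls++;
}

/* Wrapper: a no-op unless an accelerator is active and provides the hook. */
static inline void accel_set_notdirty_demo(ram_addr_demo_t addr)
{
    if (current_accel_demo && current_accel_demo->set_notdirty)
        current_accel_demo->set_notdirty(addr);
}

/* Walk a page range one page at a time, calling the wrapper for each page. */
static void reset_range_demo(ram_addr_demo_t start, int npages,
                             ram_addr_demo_t page_size)
{
    ram_addr_demo_t addr = start;
    int i;

    for (i = 0; i < npages; i++) {
        accel_set_notdirty_demo(addr);
        addr += page_size;
    }
}
```

With no accelerator installed, reset_range_demo() never reaches the hook, so
the caller needs no #ifdef and no per-cpu "enabled" check. Once
current_accel_demo points at a table with the hook filled in, the hook runs
once per page.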




[kvm-devel] [PATCH 12/13] [PATCH] remove kqemu reference from hw/pc.c

2008-05-15 Thread Glauber Costa
Instead, route cpu_get_ticks through accel driver.
---
 exec-all.h |   11 +++
 hw/pc.c|   13 ++---
 kqemu.c|4 
 3 files changed, 17 insertions(+), 11 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 04112e0..f62ff38 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -587,6 +587,7 @@ typedef struct QEMUAccel {
 int (*profile)(CPUState *env, char *buf);
 void (*set_notdirty)(ram_addr_t addr);
 void (*modify_page)(ram_addr_t addr, int dirty_flags);
+uint64_t (*get_real_ticks)(void);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -646,6 +647,16 @@ static inline void accel_modify_page(target_ulong addr, int dirty_flags)
         current_accel->modify_page(addr, dirty_flags);
 }
 
+int64_t cpu_get_ticks(void);
+
+static inline uint64_t accel_get_real_ticks(void)
+{
+    if (current_accel && current_accel->get_real_ticks)
+        return current_accel->get_real_ticks();
+    return cpu_get_ticks();
+}
+
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
diff --git a/hw/pc.c b/hw/pc.c
index c92384c..43ff2f2 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -32,6 +32,7 @@
 #include "smbus.h"
 #include "boards.h"
 #include "console.h"
+#include "exec-all.h"
 
 /* output Bochs bios info messages */
 //#define DEBUG_BIOS
@@ -73,17 +74,7 @@ static void ioportF0_write(void *opaque, uint32_t addr, uint32_t data)
 /* TSC handling */
 uint64_t cpu_get_tsc(CPUX86State *env)
 {
-    /* Note: when using kqemu, it is more logical to return the host TSC
-       because kqemu does not trap the RDTSC instruction for
-       performance reasons */
-#if USE_KQEMU
-    if (env->kqemu_enabled) {
-        return cpu_get_real_ticks();
-    } else
-#endif
-    {
-        return cpu_get_ticks();
-    }
+    return accel_get_real_ticks();
 }
 
 /* SMM support */
diff --git a/kqemu.c b/kqemu.c
index 7e24bb7..fbd8b66 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -412,6 +412,10 @@ QEMUAccel kqemu_accel = {
 .profile = kqemu_profile,
 .set_notdirty = kqemu_set_notdirty,
 .modify_page = kqemu_modify_page,
+/* Note: when using kqemu, it is more logical to return the host TSC
+   because kqemu does not trap the RDTSC instruction for
+   performance reasons */
+.get_real_ticks = cpu_get_real_ticks,
 };
 
 
-- 
1.5.5
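
The fallback behaviour of accel_get_real_ticks() can be shown with a small
standalone sketch. Everything named here (TickAccel, tick_accel,
emulated_ticks, host_ticks, get_ticks_demo) is hypothetical and only stands in
for QEMUAccel, current_accel, cpu_get_ticks(), cpu_get_real_ticks() and the
wrapper; the constant return values are made up so the two paths can be told
apart.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMUAccel with a single optional hook. */
typedef struct TickAccel {
    uint64_t (*get_real_ticks)(void);
} TickAccel;

/* NULL means no accelerator is active. */
static TickAccel *tick_accel = NULL;

/* Stand-in for the emulated tick source (cpu_get_ticks() in the patch). */
static uint64_t emulated_ticks(void)
{
    return 100;
}

/* Stand-in for the host tick source (cpu_get_real_ticks() in the patch). */
static uint64_t host_ticks(void)
{
    return 200;
}

/* Use the accelerator's hook when present, else fall back to emulation. */
static inline uint64_t get_ticks_demo(void)
{
    if (tick_accel && tick_accel->get_real_ticks)
        return tick_accel->get_real_ticks();
    return emulated_ticks();
}
```

Note that the decision is made per accelerator, not per cpu: an accelerator
that leaves the hook NULL gets the emulated value, and one that fills it in
gets the host value for every caller.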




[kvm-devel] [PATCH 13/13] [PATCH] build list of available accelerators

2008-05-15 Thread Glauber Costa
Instead of hardcoding kqemu_start() in exec.c, which would require
such a hack for every available accelerator, the semantics of
register_qemu_accel() are changed slightly: it now only builds a list of
available accelerators. The last one registered is the first one tried.

This is a temporary solution, since we don't control the exact order in
which things are loaded by the constructor attributes. The final goal is
to have command-line switches and priority lists to determine that.

"info accelerator" is changed to accommodate this. It now prints a list
of available accelerators, and prints a detailed description only if one
of them is active.
---
 exec-all.h |   43 +--
 exec.c |4 +---
 kqemu.c|   11 +--
 monitor.c  |   18 --
 vl.c   |1 +
 5 files changed, 68 insertions(+), 9 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index f62ff38..eca5cdb 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -579,8 +579,10 @@ static inline target_ulong get_phys_addr_code(CPUState *env1, target_ulong addr)
 #endif
 
 typedef struct QEMUAccel {
+char *name;
 void (*cpu_interrupt)(CPUState *env);
 void (*init_env)(CPUState *env);
+int (*start)(void);
 void (*flush_cache)(CPUState *env, int global);
 void (*flush_page)(CPUState *env, target_ulong addr);
 int (*info)(CPUState *env, char *buf);
@@ -590,11 +592,33 @@ typedef struct QEMUAccel {
 uint64_t (*get_real_ticks)(void);
 } QEMUAccel;
 
+typedef struct QEMUCont {
+    QEMUAccel *acc;
+    int active;
+    struct QEMUCont *next;
+} QEMUCont;
+
 extern QEMUAccel *current_accel;
+extern QEMUCont *head;
+void *qemu_mallocz(size_t size);
+
+static inline int register_qemu_accel(QEMUAccel *accel)
+{
+    QEMUCont *new;
+
+    new = qemu_mallocz(sizeof(*head));
+
+    new->acc = accel;
+    new->active = 0;
+    new->next = head;
+    head = new;
+
+    return 0;
+}
 
-static inline void register_qemu_accel(QEMUAccel *accel)
+static inline QEMUCont *get_accel_head(void)
 {
-    current_accel = accel;
+    return head;
 }
 
 static inline void accel_cpu_interrupt(CPUState *env)
@@ -603,6 +627,21 @@ static inline void accel_cpu_interrupt(CPUState *env)
         current_accel->cpu_interrupt(env);
 }
 
+static inline void accel_start(void)
+{
+    /* The top accelerator in the list gets tried first, but if it fails,
+     * keep trying until one of them succeeds or we exhaust the list */
+    QEMUCont *tmp = head;
+    while (tmp) {
+        if (tmp->acc && tmp->acc->start && (!(tmp->acc->start())) ) {
+            tmp->active = 1;
+            current_accel = tmp->acc;
+            break;
+        }
+        tmp = tmp->next;
+    }
+}
+
 static inline void accel_init_env(CPUState *env)
 {
     if (current_accel && current_accel->init_env)
diff --git a/exec.c b/exec.c
index 92f1552..c885f7d 100644
--- a/exec.c
+++ b/exec.c
@@ -334,9 +334,7 @@ void exec_init(void)
 code_gen_ptr = code_gen_buffer;
 page_init();
 io_mem_init();
-#ifdef USE_KQEMU
-kqemu_start();
-#endif
+accel_start();
 }
 
 void cpu_exec_init(CPUState *env)
diff --git a/kqemu.c b/kqemu.c
index fbd8b66..996538d 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -163,7 +163,6 @@ static void kqemu_update_cpuid(CPUState *env)
accelerated code */
 }
 
-QEMUAccel kqemu_accel;
 extern int smp_cpus;
 
 int kqemu_start(void)
@@ -247,7 +246,6 @@ int kqemu_start(void)
 }
 nb_pages_to_flush = 0;
 nb_ram_pages_to_update = 0;
-register_qemu_accel(&kqemu_accel);
 return 0;
 }
 
@@ -404,8 +402,10 @@ void kqemu_modify_page(ram_addr_t ram_addr, int dirty_flags)
 }
 
 QEMUAccel kqemu_accel = {
+.name = "kqemu",
 .cpu_interrupt = kqemu_cpu_interrupt,
 .init_env = kqemu_init_env,
+.start = kqemu_start,
 .flush_cache = kqemu_flush,
 .flush_page = kqemu_flush_page,
 .info = kqemu_info,
@@ -418,6 +418,13 @@ QEMUAccel kqemu_accel = {
 .get_real_ticks = cpu_get_real_ticks,
 };
 
+static void __attribute__((constructor)) register_kqemu(void)
+{
+    if (register_qemu_accel(&kqemu_accel) < 0)
+        fprintf(logfile, "kqemu: could not register accelerator \n");
+}
+
+
 
 struct fpstate {
 uint16_t fpuc;
diff --git a/monitor.c b/monitor.c
index 2ee5b0c..49efa2d 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1166,6 +1166,18 @@ static void mem_info(void)
 }
 #endif
 
+static int do_accel_do_list(void)
+{
+    QEMUCont *tmp;
+    int active = 0;
+    for (tmp= get_accel_head(); tmp != NULL; tmp = tmp->next)
+    {
+        term_printf("%c %s\n", tmp->active ? '*' : ' ', tmp->acc->name);
+        active |= tmp->active;
+    }
+    return active;
+}
+
 #define MAX_BUF 1024
 static void do_info_accelerator(void)
 {
@@ -1179,8 +1191,10 @@ static void do_info_accelerator(void)
 return;
 }
 
-if (accel_info(env, buf))
-term_printf(buf);
+if (do_accel_do_list()) {
+if (accel_info(env, buf))
+term_printf(buf);
+}
 else
 term_printf(No 

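The "last one registered is the first tried" ordering described in the commit
message follows from pushing each accelerator onto the head of a singly linked
list and then walking it from the head. A minimal standalone sketch, assuming
hypothetical names throughout (DemoAccel, DemoCont, demo_register, demo_start,
start_ok, start_fail) and using calloc() with a NULL check where the patch
uses qemu_mallocz():

```c
#include <stdlib.h>

/* Hypothetical stand-in for QEMUAccel. */
typedef struct DemoAccel {
    const char *name;
    int (*start)(void);      /* returns 0 on success */
} DemoAccel;

/* Hypothetical stand-in for QEMUCont: one list node per accelerator. */
typedef struct DemoCont {
    DemoAccel *acc;
    int active;
    struct DemoCont *next;
} DemoCont;

static DemoCont *demo_head = NULL;
static DemoAccel *demo_current = NULL;

/* Push onto the head of the list, so later registrations are tried first. */
static int demo_register(DemoAccel *accel)
{
    DemoCont *node = calloc(1, sizeof(*node));

    if (!node)
        return -1;
    node->acc = accel;
    node->next = demo_head;
    demo_head = node;
    return 0;
}

/* Try each accelerator in list order; the first that starts becomes active. */
static void demo_start(void)
{
    DemoCont *tmp;

    for (tmp = demo_head; tmp != NULL; tmp = tmp->next) {
        if (tmp->acc && tmp->acc->start && tmp->acc->start() == 0) {
            tmp->active = 1;
            demo_current = tmp->acc;
            break;
        }
    }
}

/* Two hypothetical start routines, one succeeding and one failing. */
static int start_ok(void)
{
    return 0;
}

static int start_fail(void)
{
    return -1;
}
```

If an accelerator using start_ok is registered first and one using start_fail
second, demo_start() tries the second one, sees the failure, and falls through
to the first, which ends up active. Nodes whose start routine failed stay in
the list with active == 0, which is what lets a listing command show every
available accelerator and mark only the active one.
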
[kvm-devel] [PATCH 09/13] [PATCH] move disabling code to kqemu.c instead of vl.c

2008-05-15 Thread Glauber Costa
this is for the case in which we run more than one cpu
---
 kqemu.c |3 ++-
 vl.c|4 
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/kqemu.c b/kqemu.c
index 6d46dfb..94366ec 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -164,6 +164,7 @@ static void kqemu_update_cpuid(CPUState *env)
 }
 
 QEMUAccel kqemu_accel;
+extern int smp_cpus;
 
 int kqemu_start(void)
 {
@@ -173,7 +174,7 @@ int kqemu_start(void)
 DWORD temp;
 #endif
 
-if (!kqemu_allowed)
+if (!kqemu_allowed || smp_cpus > 1)
 return -1;
 
 #ifdef _WIN32
diff --git a/vl.c b/vl.c
index 26c1677..8104e33 100644
--- a/vl.c
+++ b/vl.c
@@ -8357,10 +8357,6 @@ int main(int argc, char **argv)
 exit(1);
 }
 
-#ifdef USE_KQEMU
-if (smp_cpus > 1)
-kqemu_allowed = 0;
-#endif
 linux_boot = (kernel_filename != NULL);
 net_boot = (boot_devices_bitmap >> ('n' - 'a')) & 0xF;
 
-- 
1.5.5




[kvm-devel] [PATCH 05/13] [PATCH] wrap cache flushing functions into accel drivers

2008-05-15 Thread Glauber Costa
---
 exec-all.h |   16 ++--
 exec.c |   12 ++--
 kqemu.c|   15 +--
 3 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/exec-all.h b/exec-all.h
index 9e211dc..bfc6576 100644
--- a/exec-all.h
+++ b/exec-all.h
@@ -581,6 +581,8 @@ static inline target_ulong get_phys_addr_code(CPUState *env1, target_ulong addr)
 typedef struct QEMUAccel {
 void (*cpu_interrupt)(CPUState *env);
 void (*init_env)(CPUState *env);
+void (*flush_cache)(CPUState *env, int global);
+void (*flush_page)(CPUState *env, target_ulong addr);
 } QEMUAccel;
 
 extern QEMUAccel *current_accel;
@@ -602,12 +604,22 @@ static inline void accel_init_env(CPUState *env)
         current_accel->init_env(env);
 }
 
+static inline void accel_flush_cache(CPUState *env, int global)
+{
+    if (current_accel && current_accel->flush_cache)
+        current_accel->flush_cache(env, global);
+}
+
+static inline void accel_flush_page(CPUState *env, target_ulong addr)
+{
+    if (current_accel && current_accel->flush_page)
+        current_accel->flush_page(env, addr);
+}
+
 #ifdef USE_KQEMU
 #define KQEMU_MODIFY_PAGE_MASK (0xff & ~(VGA_DIRTY_FLAG | CODE_DIRTY_FLAG))
 
 int kqemu_cpu_exec(CPUState *env);
-void kqemu_flush_page(CPUState *env, target_ulong addr);
-void kqemu_flush(CPUState *env, int global);
 void kqemu_set_notdirty(CPUState *env, ram_addr_t ram_addr);
 void kqemu_modify_page(CPUState *env, ram_addr_t ram_addr);
 void kqemu_record_dump(void);
diff --git a/exec.c b/exec.c
index 73360d3..5b093a3 100644
--- a/exec.c
+++ b/exec.c
@@ -1438,11 +1438,7 @@ void tlb_flush(CPUState *env, int flush_global)
 #if !defined(CONFIG_SOFTMMU)
     munmap((void *)MMAP_AREA_START, MMAP_AREA_END - MMAP_AREA_START);
 #endif
-#ifdef USE_KQEMU
-    if (env->kqemu_enabled) {
-        kqemu_flush(env, flush_global);
-    }
-#endif
+    accel_flush_cache(env, flush_global);
     tlb_flush_count++;
 }
 
@@ -1488,11 +1484,7 @@ void tlb_flush_page(CPUState *env, target_ulong addr)
     if (addr < MMAP_AREA_END)
         munmap((void *)addr, TARGET_PAGE_SIZE);
 #endif
-#ifdef USE_KQEMU
-    if (env->kqemu_enabled) {
-        kqemu_flush_page(env, addr);
-    }
-#endif
+    accel_flush_page(env, addr);
 }
 
 /* update the TLBs so that writes to code in the virtual page 'addr'
diff --git a/kqemu.c b/kqemu.c
index e0422de..524c74d 100644
--- a/kqemu.c
+++ b/kqemu.c
@@ -261,12 +261,6 @@ void kqemu_cpu_interrupt(CPUState *env)
 #endif
 }
 
-QEMUAccel kqemu_accel = {
-.cpu_interrupt = kqemu_cpu_interrupt,
-.init_env = kqemu_init_env,
-};
-
-
 void kqemu_flush_page(CPUState *env, target_ulong addr)
 {
 #if defined(DEBUG)
@@ -290,6 +284,15 @@ void kqemu_flush(CPUState *env, int global)
 nb_pages_to_flush = KQEMU_FLUSH_ALL;
 }
 
+QEMUAccel kqemu_accel = {
+.cpu_interrupt = kqemu_cpu_interrupt,
+.init_env = kqemu_init_env,
+.flush_cache = kqemu_flush,
+.flush_page = kqemu_flush_page,
+};
+
+
+
 void kqemu_set_notdirty(CPUState *env, ram_addr_t ram_addr)
 {
 #ifdef DEBUG
-- 
1.5.5




Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Add support for a configuration file

2008-05-15 Thread Ian Jackson
andrzej zaborowski writes (Re: [Qemu-devel] Re: [PATCH] Add support for a 
configuration file):
> What I'd love, though, but expect others will consider bloat, is that
> files are passed through cpp before interpreting.

cpp is a terrible preprocessor.  It mostly works for C source code
(although it has some serious deficiencies there - for example, that
you can't have macros that generate macros).  But for anything whose
lexical structure is not much like C it causes terrible trouble -
randomly inserting spaces, need for explicit token pasting (dependent
on the C operator set), etc.  It breaks even something as simple as
#-comments, which are obviously what we would want in our plain text
config file.

If you really want a preprocessor, use m4 (one that supports -p and
using a suitable m4_changequote).  m4 isn't particularly pretty but it
doesn't interfere with the core of the syntax.

Better, make the configuration defaults sufficiently good that it is
not necessary to reproduce lots of boilerplate to get things done.
Then macros aren't needed for human-written files - and of course
program-written config files don't need a macro system anyway.

Ian.



Re: [kvm-devel] [Qemu-devel] Re: [PATCH] Add support for a configuration file

2008-05-15 Thread Paul Brook
> Why not just bypass the whole config file idea and just use environment
> variables?

Absolutely not. Environment variables are a horrid way of configuring things.

Paul



Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-2

2008-05-15 Thread Avi Kivity
Anthony Liguori wrote:
 Avi Kivity wrote:
 Anthony Liguori wrote:
 FWIW, virtio-net is much better with my patches applied.

 The can_receive patches?

 Again, I'm not opposed to them in principle, I just think that if 
 they help that this points at a virtio deficiency.  Virtio should 
 never leave the rx queue empty.  Consider the case where the virtio 
 queue isn't tied to a socket buffer, but directly to hardware.

 For RX performance:


 right now
 [  3]  0.0-10.0 sec  1016 MBytes   852 Mbits/sec

 revert tap hack
 [  3]  0.0-10.0 sec   564 MBytes   473 Mbits/sec

 all patches applied
 [  3]  0.0-10.0 sec  1.17 GBytes  1.01 Gbits/sec

 drop lots of packets
 [  3]  0.0-10.0 sec  1.05 GBytes   905 Mbits/sec


 The last patch is not in my series but it basically makes the ring 
 size 512 and drops packets when we run out of descriptors.  That was 
 to validate that we're not hiding a virtio deficiency.  The reason I want 
 to buffer packets is that it avoids having to deal with tuning.   For 
 vringfd/vmdq, we'll have to make sure to get the tuning right though.

Okay; I'll apply the patches.  Hopefully we won't diverge too much from 
upstream qemu.


-- 
error compiling committee.c: too many arguments to function




Re: [kvm-devel] [ANNOUNCE] kvm-guest-drivers-windows-2

2008-05-15 Thread Anthony Liguori
Avi Kivity wrote:
 Anthony Liguori wrote:
 Avi Kivity wrote:
 Anthony Liguori wrote:
 FWIW, virtio-net is much better with my patches applied.

 The can_receive patches?

 Again, I'm not opposed to them in principle, I just think that if 
 they help that this points at a virtio deficiency.  Virtio should 
 never leave the rx queue empty.  Consider the case where the virtio 
 queue isn't tied to a socket buffer, but directly to hardware.

 For RX performance:


 right now
 [  3]  0.0-10.0 sec  1016 MBytes   852 Mbits/sec

 revert tap hack
 [  3]  0.0-10.0 sec   564 MBytes   473 Mbits/sec

 all patches applied
 [  3]  0.0-10.0 sec  1.17 GBytes  1.01 Gbits/sec

 drop lots of packets
 [  3]  0.0-10.0 sec  1.05 GBytes   905 Mbits/sec


 The last patch is not in my series but it basically makes the ring 
 size 512 and drops packets when we run out of descriptors.  That was 
 to validate that we're not hiding a virtio deficiency.  The reason I 
 want to buffer packets is that it avoids having to deal with 
 tuning.   For vringfd/vmdq, we'll have to make sure to get the tuning 
 right though.

 Okay; I'll apply the patches.  Hopefully we won't diverge too much 
 from upstream qemu.

I am going to push these upstream.  I need to finish the page_desc cache 
first b/c right now the version of virtio that could go into upstream 
QEMU has unacceptable performance for KVM.

Regards,

Anthony Liguori





[kvm-devel] virtio_net null pointer dereference

2008-05-15 Thread Bernd Schubert
Hello,

with 2.6.26-rc2 (git-something from the weekend) I get a NULL pointer
dereference:

(gdb) l *(start_xmit+0x48/0x12e)
0x80413752 is in start_xmit (drivers/net/virtio_net.c:282).
277
278 return vi->svq->vq_ops->add_buf(vi->svq, sg, num, 0, skb);
279 }
280
281 static int start_xmit(struct sk_buff *skb, struct net_device *dev)
282 {
283 struct virtnet_info *vi = netdev_priv(dev);
284
285 again:
286 /* Free up any pending old buffers before queueing new ones. */


[17180705.299138] Loglevel set to 9
[17180730.942144] BUG: unable to handle kernel NULL pointer dereference at
0008
[17180730.943115] IP: [8041379a] start_xmit+0x48/0x12e
[17180730.943115] PGD 11d54067 PUD 11d55067 PMD 0
[17180730.943115] Oops: 0002 [1] SMP
[17180730.943115] CPU 0
[17180730.943115] Modules linked in: rtc psmouse i2c_piix4 i2c_core
[17180730.943115] Pid: 2552, comm: iperf Not tainted 2.6.26-rc2 #12
[17180730.943115] RIP: 0010:[8041379a]  [8041379a]
start_xmit+0x48/0x12e
[17180730.943115] RSP: 0018:8100117939e8  EFLAGS: 00010202
[17180730.943115] RAX: 810011d5bcc0 RBX: 810011dc3880 RCX:
810011dc7000
[17180730.943115] RDX:  RSI: 8100117939fc RDI:
8100117bddc0
[17180730.943115] RBP: 810011793a28 R08: 8100117939a8 R09:
0002
[17180730.943115] R10: a43eb07b R11: 810011dc3318 R12:
810011dc3000
[17180730.943115] R13: 810011d5b940 R14: 810011dc3928 R15:
8100117939fc
[17180730.943115] FS:  40d89960(0063) GS:806c()
knlGS:
[17180730.943115] CS:  0010 DS:  ES:  CR0: 8005003b
[17180730.943115] CR2: 0008 CR3: 11deb000 CR4:
06e0
[17180730.943115] DR0:  DR1:  DR2:

[17180730.943115] DR3:  DR6: 0ff0 DR7:
0400
[17180730.943115] Process iperf (pid: 2552, threadinfo 810011792000,
task 8100117be280)
[17180730.943115] Stack:  810011793a38 810011dc3318 05f40246

[17180730.943115]  810011d5b940 810011d5b940 810011dc3000
810011dc3300
[17180730.943115]  810011793a58 80480778 
810011dc3000
[17180730.943115] Call Trace:
[17180730.943115]  [80480778] dev_hard_start_xmit+0x205/0x279
[17180730.943115]  [8048e0cb] __qdisc_run+0xcf/0x1d3
[17180730.943115]  [80482e43] dev_queue_xmit+0x15f/0x2c8
[17180730.943115]  [8049a61c] ip_finish_output+0x1ed/0x22f
[17180730.943115]  [8049a91c] ip_output+0x52/0x54
[17180730.943115]  [80499128] ip_local_out+0x20/0x24
[17180730.943115]  [8049ad2f] ip_queue_xmit+0x2a5/0x2fa
[17180730.943115]  [80265441] ? mark_held_locks+0x59/0x75
[17180730.943115]  [8029a714] ? kmem_cache_alloc_node+0x150/0x185
[17180730.943115]  [80265606] ? trace_hardirqs_on+0xff/0x12a
[17180730.943115]  [804aa8d7] tcp_transmit_skb+0x6b7/0x6ea
[17180730.943115]  [8029a76d] ? __kmalloc_node+0x24/0x29
[17180730.943115]  [804ac7fa] tcp_push_one+0xa7/0xc7
[17180730.943115]  [804a14c7] tcp_sendmsg+0x7d3/0xa5e
[17180730.943115]  [8025c036] ? hrtimer_start+0x118/0x13a
[17180730.943115]  [8025c036] ? hrtimer_start+0x118/0x13a
[17180730.943115]  [804749df] sock_aio_write+0xe2/0xf2
[17180730.943115]  [802a015c] do_sync_write+0xeb/0x132
[17180730.943115]  [802592f8] ? autoremove_wake_function+0x0/0x38
[17180730.943115]  [80224a11] ? native_sched_clock+0x68/0x8f
[17180730.943115]  [802a1655] ? fget_light+0xc0/0xe6
[17180730.943115]  [80224929] ? sched_clock+0x9/0xc
[17180730.943115]  [802a1655] ? fget_light+0xc0/0xe6
[17180730.943115]  [802a0907] vfs_write+0xc1/0x137
[17180730.943115]  [802a0e5d] sys_write+0x47/0x70
[17180730.943115]  [8021dd6a] system_call_after_swapgs+0x8a/0x8f
[17180730.943115]
[17180730.943115]
[17180730.943115] Code: 9e 40 03 00 00 4c 8d b3 a8 00 00 00 eb 3f 41 ff 4e
10 48 8b 17 48 8b 47 08 48 c7 07 00 00 00 00 48 c7 47 08 00 00 00 00 48 89
10 48 89 42 08 48 8b 53 18 8b 47 68 48 01 82 98 00 00 00 48 8b 43
[17180730.943115] RIP  [8041379a] start_xmit+0x48/0x12e
[17180730.943115]  RSP 8100117939e8
[17180730.943115] CR2: 0008
[17180731.066868] ---[ end trace deb46891ec66565a ]---
[17180731.070868] Kernel panic - not syncing: Aiee, killing interrupt
handler!


Thanks,
Bernd




Re: [kvm-devel] pinning, tsc and apic

2008-05-15 Thread Ryan Harper
* Chris Wright [EMAIL PROTECTED] [2008-05-15 02:01]:
 * Anthony Liguori ([EMAIL PROTECTED]) wrote:
   From a quick look, I suspect that the number of wildly off TSC 
  calibrations correspond to the VMs that are misbehaving.  I think this 
  may mean that we have to re-examine the tsc delta computation.
  
  10_serial.log:time.c: Detected 1995.038 MHz processor.
  11_serial.log:time.c: Detected 2363.195 MHz processor.
  12_serial.log:time.c: Detected 2492.675 MHz processor.
  13_serial.log:time.c: Detected 1995.061 MHz processor.
  14_serial.log:time.c: Detected 1994.917 MHz processor.
  15_serial.log:time.c: Detected 4100.735 MHz processor.
  16_serial.log:time.c: Detected 2075.800 MHz processor.
  17_serial.log:time.c: Detected 2674.350 MHz processor.
  18_serial.log:time.c: Detected 1995.002 MHz processor.
  19_serial.log:time.c: Detected 1994.978 MHz processor.
  1_serial.log:time.c: Detected 4384.310 MHz processor.
 
 Is this with pinning?  We at least know we're losing small bits on
 migration.  From my measurements it's ~3000 (outliers are 10-20k).
 
 Also, what happens if you roll back to kvm-userspace 7f5c4d15ece5?

I'll try that next.

 
 I'm using this:
 
On tip, using the patch, I still see hosed guests and tons of apic round
robin output, but the tsc calc seems to have stabilized:

/tmp/10_serial.log:time.c: Detected 1995.018 MHz processor.
/tmp/11_serial.log:time.c: Detected 1995.009 MHz processor.
/tmp/12_serial.log:time.c: Detected 1995.012 MHz processor.
/tmp/13_serial.log:time.c: Detected 1995.013 MHz processor.
/tmp/14_serial.log:time.c: Detected 1995.016 MHz processor.
/tmp/15_serial.log:time.c: Detected 1995.020 MHz processor.
/tmp/16_serial.log:time.c: Detected 1995.020 MHz processor.
/tmp/18_serial.log:time.c: Detected 1995.020 MHz processor.
/tmp/19_serial.log:time.c: Detected 1995.023 MHz processor.
/tmp/1_serial.log:time.c: Detected 1995.008 MHz processor.
/tmp/20_serial.log:time.c: Detected 1995.011 MHz processor.
/tmp/21_serial.log:time.c: Detected 1995.016 MHz processor.
/tmp/22_serial.log:time.c: Detected 1995.016 MHz processor.
/tmp/23_serial.log:time.c: Detected 1995.013 MHz processor.
/tmp/24_serial.log:time.c: Detected 1995.018 MHz processor.
/tmp/25_serial.log:time.c: Detected 1995.030 MHz processor.
/tmp/26_serial.log:time.c: Detected 1995.021 MHz processor.
/tmp/27_serial.log:time.c: Detected 1995.026 MHz processor.
/tmp/28_serial.log:time.c: Detected 1995.016 MHz processor.
/tmp/29_serial.log:time.c: Detected 1995.012 MHz processor.
/tmp/2_serial.log:time.c: Detected 1995.020 MHz processor.
/tmp/30_serial.log:time.c: Detected 1995.021 MHz processor.
/tmp/31_serial.log:time.c: Detected 1995.021 MHz processor.
/tmp/32_serial.log:time.c: Detected 1995.008 MHz processor.
/tmp/33_serial.log:time.c: Detected 1995.015 MHz processor.
/tmp/34_serial.log:time.c: Detected 1995.018 MHz processor.
/tmp/35_serial.log:time.c: Detected 1995.017 MHz processor.
/tmp/36_serial.log:time.c: Detected 1995.013 MHz processor.
/tmp/37_serial.log:time.c: Detected 1995.003 MHz processor.
/tmp/38_serial.log:time.c: Detected 1995.036 MHz processor.
/tmp/39_serial.log:time.c: Detected 1995.020 MHz processor.
/tmp/3_serial.log:time.c: Detected 1995.017 MHz processor.
/tmp/40_serial.log:time.c: Detected 1994.998 MHz processor.
/tmp/41_serial.log:time.c: Detected 1995.015 MHz processor.
/tmp/43_serial.log:time.c: Detected 1995.007 MHz processor.
/tmp/44_serial.log:time.c: Detected 1995.029 MHz processor.
/tmp/45_serial.log:time.c: Detected 1995.009 MHz processor.
/tmp/46_serial.log:time.c: Detected 1995.025 MHz processor.
/tmp/47_serial.log:time.c: Detected 1995.019 MHz processor.
/tmp/48_serial.log:time.c: Detected 1995.013 MHz processor.
/tmp/4_serial.log:time.c: Detected 1995.024 MHz processor.
/tmp/5_serial.log:time.c: Detected 1995.016 MHz processor.
/tmp/6_serial.log:time.c: Detected 1995.023 MHz processor.
/tmp/7_serial.log:time.c: Detected 1995.036 MHz processor.
/tmp/8_serial.log:time.c: Detected 1995.013 MHz processor.
/tmp/9_serial.log:time.c: Detected 1995.014 MHz processor.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
[EMAIL PROTECTED]
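
For reference, the offset adjustment under discussion can be modelled in a few
lines of plain C. This is a sketch, not the KVM code: DemoVcpu, guest_tsc and
demo_vcpu_migrate are hypothetical names, and it assumes the guest-visible TSC
is the host TSC plus a per-vcpu offset, with the offset adjusted on reload by
the same "saved host TSC minus current TSC" delta that appears in the context
lines of the svm.c patch quoted above.

```c
#include <stdint.h>

/* Hypothetical per-vcpu state, loosely modelled on the quoted svm.c code. */
typedef struct DemoVcpu {
    uint64_t saved_host_tsc;  /* host TSC sampled on the CPU being left */
    uint64_t tsc_offset;      /* added to the host TSC to form the guest TSC */
} DemoVcpu;

/* Guest-visible TSC for a given host TSC reading. */
static uint64_t guest_tsc(const DemoVcpu *v, uint64_t host_tsc_now)
{
    return host_tsc_now + v->tsc_offset;
}

/* On reload, fold the difference between the saved sample and the TSC of
 * the CPU we are now running on into the offset. Unsigned arithmetic wraps
 * modulo 2^64, so a "negative" delta works as intended. */
static void demo_vcpu_migrate(DemoVcpu *v, uint64_t tsc_this)
{
    uint64_t delta = v->saved_host_tsc - tsc_this;

    v->tsc_offset += delta;
}
```

In this model the guest TSC right after the reload equals the saved sample
plus the old offset, whatever the new CPU's counter reads. That keeps the
guest value from jumping when the two host counters disagree, but it also
means any cycles that passed between taking the sample and the reload are not
reflected in the guest TSC, so the freshness of the sample matters.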



Re: [kvm-devel] can't boot 2.6.26-rcX

2008-05-15 Thread Bernd Schubert
On Thursday 15 May 2008 12:54:46 Avi Kivity wrote:
 Bernd Schubert wrote:
  On Thursday 15 May 2008 09:36:41 Avi Kivity wrote:
  Bernd Schubert wrote:
  Hello,
 
  there is a problem booting 2.6.26-rcX (X=1,2). It stops booting at
 
  Calibrating delay using timer specific routine.. 4016.92 BogoMIPS
  (lpj=8033846)
 
  The kvm process then takes 100% of my host CPU.
 
  This is with kvm-67 on an AM64-X2-
 
  I'm not yet familiar with kvm and debugging. Will a sysrq+t trace of
  the host show something useful? Or will only full git-bisect help?
 
  Do you have CONFIG_KVM_GUEST or CONFIG_KVM_CLOCK in your config?  If so,
  this may be a paravirt problem.  Try turning them off and let us know.
 
  Thanks, I had both options enabled. Disabling these makes the

 Can you check which one causes the trouble?  Most likely it's the clock.

It is CONFIG_KVM_GUEST; with CONFIG_KVM_CLOCK=y it boots fine!


Thanks,
Bernd

-- 
Bernd Schubert
Q-Leap Networks GmbH



Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-15 Thread Christoph Lameter
On Thu, 15 May 2008, Nick Piggin wrote:

 Oh, I get that confused because of the mixed up naming conventions
 there: unmap_page_range should actually be called zap_page_range. But
 at any rate, yes we can easily zap pagetables without holding mmap_sem.

How is that synchronized with code that walks the same pagetable? These 
walks may not hold mmap_sem either. I would expect that one could only 
remove a portion of the pagetable where we have some sort of guarantee 
that no accesses occur. So the removal of the vma prior ensures that?





[kvm-devel] Crash with new guest drivers

2008-05-15 Thread Michael Lilie (mlilie)
Running iperf with 100 connections crashes with the new virtio driver.
The same setup works with e1000.

BSOD data:
 
DRIVER_IRQL_NOT_LESS_OR_EQUAL
 
*** STOP: 0x00D1 (0x001C, 0x0002, 0x, 0xF86FFE03)
***kvmnet.sys - Address F86FFE03 base at F86FF000, DateStamp 4827fdde
 

remote iperf: iperf -c 2.43.181.131 -p 18332 -P 100 -t 10
guest iperf: iperf -s -p 18332

kvm-68, kvm-guest-drivers-windows-2
 
Guest: Win2003 enterprise, SP1
 
Host: 2.6.23 x86_64

Qemu cmdline:  -pidfile /var/run/vb1.pid -daemonize -boot c
-m 512 -name 'VB-05-1 w2k3' -cdrom /local/local1/vbs/w2k3.iso
-net nic,vlan=0,macaddr=00:16:3E:D0:6C:45,model=virtio
-net tap,vlan=0,ifname=vblade10,script=/tmp/vblade10
-drive index=0,if=ide,file=/vbspace/vb1-disk0
-cpu qemu64 -vnc :1 -S -monitor pty /var/run/vb1.qemu
 
 



Re: [kvm-devel] Protected mode transitions and big real mode... still an issue

2008-05-15 Thread Mohammed Gamal
On Wed, May 14, 2008 at 10:29 AM, Guillaume Thouvenin
[EMAIL PROTECTED] wrote:
 On Tue, 6 May 2008 20:05:39 +0300
 Mohammed Gamal [EMAIL PROTECTED] wrote:


WinXP fails with the patch applied too. Ubuntu 7.10 live CD and
FreeDOS don't boot but complain about instruction mov 0x11,sreg not
being emulated.

 Mohammed, can you try the patch at the end of this mail? Here it's
 working with FreeDOS now (I added the emulation of 0x90 that is an xchg
 instruction). I can also boot winXP Professional X64 edition. I still
 have a weird issue with Ubuntu 7.10 that crashes sometimes with the
 error:

 kvm_run: failed entry, reason 5
 kvm_run returned -8

 It's a little bit strange because this error appears very often with
 the wmii window manager but never with XFCE. And with wmii, it only
 occurs when I move the mouse above the Qemu/KVM window. If I wait 30s
 until the automatic boot it works...

 So to give a summary, on my box:

  openSUSE 10.3  - OK
  WinXP Pro X64  - OK
  FreeDOS        - OK
  Ubuntu 7.10    - NOK

 Regards,
 Guillaume



Hi Guillaume,
I haven't applied the patch you just sent yet. I'm still using the
previous patch you sent me (attached in case anyone wants to have a
look). I'm seeing the same problem with the Ubuntu 7.10 Live CD under
GNOME.
Regarding WinXP, I'm using 32-bit WinXP Pro, and it crashes with this error:

unhandled vm exit: 0x21 vcpu_id 0
rax 0011 rbx 14fc rcx  rdx 534d
rsi 1d68 rdi 0008164f rsp 14fa rbp 1522
r8   r9   r10  r11
r12  r13  r14  r15
rip 0269 rflags 00010006
cs 2000 (0002/ p 1 dpl 0 db 0 s 1 type b l 0 g 0 avl 0)
ds 22f3 (00022f30/ p 1 dpl 3 db 0 s 1 type 3 l 0 g 0 avl 0)
es  (/ p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
ss 22f3 (00022f30/ p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
fs 0030 (0300/ p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
gs  (/ p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
tr  (/ p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
ldt  (/ p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
gdt 17000/3ff
idt 17400/7ff
cr0 11 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
Aborted

and dmesg outputs this:
emulation failed (vmentry failure) rip 269 68 6d 02 cb

The output is the same on every run.
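
Reading the dump: cr0 is 0x11, so the PE bit is already set, while ds and ss still carry a base equal to selector << 4, which is what real mode leaves behind. That would be consistent with a guest that has just switched to protected mode without reloading its segment registers, and exit reason 0x21 (33) is, as far as I can tell from the Intel manual, a VM-entry failure due to invalid guest state. A small sketch of those two checks in C; the helper names are made up for illustration and are not KVM functions:

```c
#include <assert.h>
#include <stdint.h>

#define CR0_PE (UINT32_C(1) << 0)   /* Protection Enable bit */

/* True if CR0 says the CPU is in protected mode. */
int cr0_protected_mode(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}

/* True if a segment base is what real mode would give: selector * 16. */
int real_mode_style_base(uint16_t selector, uint32_t base)
{
    return base == ((uint32_t)selector << 4);
}

/* Check the values from the dump above. */
void check_dump(void)
{
    assert(cr0_protected_mode(0x11));               /* cr0 11 */
    assert(real_mode_style_base(0x22f3, 0x22f30));  /* ds and ss */
}
```

If I decode the dmesg bytes correctly, 68 6d 02 cb is a 16-bit push of 0x026d followed by retf, i.e. a far return, which is a common way to reload cs right after setting PE.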

I'll give this patch (and Marcello's) a try and report on what happens.


real_mode_support_20080605.patch
Description: application/mbox


Re: [kvm-devel] Crash with new guest drivers

2008-05-15 Thread Dor Laor

On Thu, 2008-05-15 at 11:11 -0700, Michael Lilie (mlilie) wrote:
 Running iperf with 100 connections crashes with the new virtio driver.
 The same setup works with e1000.
 
 BSOD data:
  
 DRIVER_IRQL_NOT_LESS_OR_EQUAL

That's not good. I just tested the old driver with an XP guest and it was
OK. The locking change probably triggered it.
We'll send a fix next week.
Regards,
Dor

  
 *** STOP: 0x00D1 (0x001C, 0x0002, 0x, 0xF86FFE03)
 ***kvmnet.sys - Address F86FFE03 base at F86FF000, DateStamp 4827fdde
  
 
 remote iperf: iperf -c 2.43.181.131 -p 18332 -P 100 -t 10
 guest iperf: iperf -s -p 18332
 
 kvm-68, kvm-guest-drivers-windows-2
  
 Guest: Win2003 enterprise, SP1
  
 Host: 2.6.23 x86_64
 
 Qemu cmdline:  -pidfile /var/run/vb1.pid -daemonize -boot c
 -m 512 -name 'VB-05-1 w2k3' -cdrom /local/local1/vbs/w2k3.iso
 -net nic,vlan=0,macaddr=00:16:3E:D0:6C:45,model=virtio
 -net tap,vlan=0,ifname=vblade10,script=/tmp/vblade10
 -drive index=0,if=ide,file=/vbspace/vb1-disk0
 -cpu qemu64 -vnc :1 -S -monitor pty /var/run/vb1.qemu
  
 
 


[kvm-devel] New bed linen collections

2008-05-15 Thread Подарок
New bed linen collections are available at www.posmagazin.ru
A wide selection for every taste, color, and budget.
Delivery within Moscow, shipping throughout Russia!

Visit www.posmagazin.ru



Re: [kvm-devel] [PATCH 08 of 11] anon-vma-rwsem

2008-05-15 Thread Nick Piggin
On Thu, May 15, 2008 at 10:33:57AM -0700, Christoph Lameter wrote:
 On Thu, 15 May 2008, Nick Piggin wrote:
 
  Oh, I get that confused because of the mixed up naming conventions
  there: unmap_page_range should actually be called zap_page_range. But
  at any rate, yes we can easily zap pagetables without holding mmap_sem.
 
 How is that synchronized with code that walks the same pagetable? These 
 walks may not hold mmap_sem either. I would expect that one could only 
 remove a portion of the pagetable where we have some sort of guarantee 
 that no accesses occur. So the removal of the vma prior ensures that?
 
I don't really understand the question. If you remove the pte and invalidate
the TLBs on the remote image's process (the one importing the page), then it
can of course try to refault the page in because its vma is still there. But
you catch that refault in your driver, which can prevent the page from
being faulted back in.
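
To make the ordering concrete, here is a toy user-space model of that sequence in C. It is only a sketch of the idea, not kernel code: the struct, its fields, and both function names are invented for illustration, and real code would go through the pagetable and TLB flush primitives rather than flipping booleans.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one imported page as seen by the remote process. */
struct imported_page {
    bool pte_present;     /* remote pte currently maps the page */
    bool tlb_cached;      /* remote TLB may still hold the translation */
    bool export_revoked;  /* driver-side flag: owner took the page back */
};

/* Exporter side: mark the page revoked, zap the pte, flush the TLB. */
void revoke_page(struct imported_page *p)
{
    assert(p);
    p->export_revoked = true;
    p->pte_present = false;
    p->tlb_cached = false;
}

/* Driver fault handler. The vma still exists, so the refault lands here.
 * Returns true if the page was mapped back in, false if it was refused. */
bool driver_fault(struct imported_page *p)
{
    assert(p);
    if (p->export_revoked)
        return false;     /* refuse: the page must not come back */
    p->pte_present = true;
    return true;
}
```

After revoke_page() a call to driver_fault() returns false and the pte stays clear, which is the "catch that refault in your driver" step; no mmap_sem appears anywhere in the model because the driver's own flag does the gating.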
