Re: nested KVM slower than QEMU with gnumach guest kernel

2014-12-14 Thread Samuel Thibault
Hello,

Just FTR, it seems that the overhead is due to gnumach sometimes using
the PIC quite a lot.  That used not to be much of a concern under plain
kvm, but kvm-on-kvm makes it far too expensive.  I've fixed gnumach to
be a lot more reasonable about it, and the performance issues went away.
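For illustration, here is a minimal sketch of the kind of pattern involved,
assuming the cost comes from per-interrupt writes to the 8259 mask registers
(the helper and function names are hypothetical, not gnumach's actual code):
every outb to the PIC is an IO_INSTRUCTION exit, and under nested KVM each one
costs a full L2 -> L1 -> L0 round trip, so caching the mask and skipping
redundant writes removes most of them.

#include <stdint.h>

/* Hypothetical port I/O helper (the usual x86 outb). */
static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ __volatile__("outb %0, %1" : : "a"(val), "Nd"(port));
}

#define PIC_MASTER_IMR 0x21    /* 8259A interrupt mask registers */
#define PIC_SLAVE_IMR  0xa1

static uint16_t cached_mask = 0xffff;

/* Naive version: two IO_INSTRUCTION exits on every call. */
static void pic_set_mask_naive(uint16_t mask)
{
    outb(PIC_MASTER_IMR, mask & 0xff);
    outb(PIC_SLAVE_IMR, mask >> 8);
}

/* Cached version: only touch the PIC when the mask actually changes. */
static void pic_set_mask_cached(uint16_t mask)
{
    if ((mask & 0xff) != (cached_mask & 0xff))
        outb(PIC_MASTER_IMR, mask & 0xff);
    if ((mask >> 8) != (cached_mask >> 8))
        outb(PIC_SLAVE_IMR, mask >> 8);
    cached_mask = mask;
}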

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-23 Thread Samuel Thibault
Jan Kiszka, on Mon 17 Nov 2014 10:04:37 +0100, wrote:
> On 2014-11-17 10:03, Samuel Thibault wrote:
>> Gleb Natapov, on Mon 17 Nov 2014 10:58:45 +0200, wrote:
>>> Do you know how gnumach timekeeping works? Does it have a timer that
>>> fires every 1ms?
>>> Which clock device is it using?
>>
>> It uses the PIT every 10ms, in square mode
>> (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
>
> Wow... how retro. That feature might be unsupported

(BTW, I also tried the more common rate-generator (divide-by-N) mode,
0x34, with the same result.)

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-23 Thread Samuel Thibault
Jan Kiszka, on Mon 17 Nov 2014 07:28:23 +0100, wrote:
> I suppose this is a SMP host and guest? Does reducing CPUs to 1 change
> the picture?

Oddly enough, putting my host into UniProcessor mode makes L1 real-mode
emulation awfully slow.  The same happens when binding kvm to a single
hardware thread like this:

hwloc-bind pu:0 kvm ...

but not when binding kvm to the two threads of the same core like this:

hwloc-bind core:0 kvm ...
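For reference, a quick way to double-check which PUs hwloc puts under core 0
(i.e. what the pu:0 and core:0 bindings above actually cover) is its C API.
This is just a sketch using the standard hwloc calls, assuming the PUs are
direct children of the core object, which is the usual layout:

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topo;

    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    /* First core of the machine, i.e. what "hwloc-bind core:0" targets. */
    hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, 0);
    if (core) {
        printf("core:0 contains %u PU(s):", core->arity);
        for (unsigned i = 0; i < core->arity; i++)
            printf(" PU#%u", core->children[i]->os_index);
        printf("\n");
    }

    hwloc_topology_destroy(topo);
    return 0;
}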

...

>> Here is a sample of trace-cmd output dump: the same kind of pattern
>> repeats over and over, with EXTERNAL_INTERRUPT happening mostly
>> every other microsecond:
>>
>>  qemu-system-x86-9752  [003]  4106.187755: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
>>  qemu-system-x86-9752  [003]  4106.187756: kvm_entry: vcpu 0
>>  qemu-system-x86-9752  [003]  4106.187757: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
>>  qemu-system-x86-9752  [003]  4106.187758: kvm_entry: vcpu 0
>>  qemu-system-x86-9752  [003]  4106.187759: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
>>  qemu-system-x86-9752  [003]  4106.187760: kvm_entry: vcpu 0

Switching to UniProcessor mode did drop them, but the slowness is still
there, so they are probably not actually the source of the issue.  I'm
now wondering whether they weren't simply coming from the tracing engine
itself: I see some irq_work_queue calls from kernel/trace/ring_buffer.c
and kernel/events/core.c.

> You may want to turn on more trace events, if not all, to possibly see
> what Linux does then.

With the EXTERNAL_INTERRUPTs mostly gone, I now get this over and over:

qemu-system-x86-2138  [000]   247.558705: kvm_exit: reason MSR_READ rip 0x81050a82 info 0 0
        native_read_msr_safe
qemu-system-x86-2138  [000]   247.558705: kvm_msr: msr_read 1d9 = 0x0
qemu-system-x86-2138  [000]   247.558705: rcu_utilization: Start context switch
qemu-system-x86-2138  [000]   247.558706: rcu_utilization: End context switch
qemu-system-x86-2138  [000]   247.558706: kvm_entry: vcpu 0
qemu-system-x86-2138  [000]   247.558706: kvm_exit: reason VMRESUME rip 0xa03058ae info 0 0
        vmx_vcpu_run
qemu-system-x86-2138  [000]   247.558711: kvm_mmu_get_page: [FAILED TO PARSE] mmu_valid_gen=0x26 gfn=248173 role=114692 root_count=0 unsync=0 created=0
qemu-system-x86-2138  [000]   247.558712: rcu_utilization: Start context switch
qemu-system-x86-2138  [000]   247.558712: rcu_utilization: End context switch
qemu-system-x86-2138  [000]   247.558712: kvm_entry: vcpu 0
qemu-system-x86-2138  [000]   247.558712: kvm_exit: reason IO_INSTRUCTION rip 0xc0109769 info a10040 0
        gnumach accesses the PIC
qemu-system-x86-2138  [000]   247.558713: kvm_nested_vmexit: rip: 0xc0109769 reason: IO_INSTRUCTION ext_inf1: 0x00a10040 ext_inf2: 0x ext_int: 0x ext_int_err: 0x
qemu-system-x86-2138  [000]   247.558713: kvm_nested_vmexit_inject: reason: IO_INSTRUCTION ext_inf1: 0x00a10040 ext_inf2: 0x ext_int: 0x ext_int_err: 0x
qemu-system-x86-2138  [000]   247.558718: kvm_mmu_get_page: [FAILED TO PARSE] mmu_valid_gen=0x26 gfn=0 role=122884 root_count=0 unsync=0 created=0
qemu-system-x86-2138  [000]   247.558718: rcu_utilization: Start context switch
qemu-system-x86-2138  [000]   247.558719: rcu_utilization: End context switch
qemu-system-x86-2138  [000]   247.558719: kvm_entry: vcpu 0
qemu-system-x86-2138  [000]   247.558719: kvm_exit: reason VMREAD rip 0xa0305956 info 0 0
        vmx_vcpu_run
qemu-system-x86-2138  [000]   247.558720: rcu_utilization: Start context switch
qemu-system-x86-2138  [000]   247.558720: rcu_utilization: End context switch
qemu-system-x86-2138  [000]   247.558720: kvm_entry: vcpu 0
qemu-system-x86-2138  [000]   247.558721: kvm_exit: reason VMREAD rip 0xa030596f info 0 0
        vmx_vcpu_run
qemu-system-x86-2138  [000]   247.558721: rcu_utilization: Start context switch
qemu-system-x86-2138  [000]   247.558721: rcu_utilization: End context switch
qemu-system-x86-2138  [000]   247.558721: kvm_entry: vcpu 0
qemu-system-x86-2138  [000]   247.558722: kvm_exit: reason VMREAD rip 0xa02fb333 info 0 0
        vmx_read_l1_tsc
qemu-system-x86-2138  [000]
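As an aside, the IO_INSTRUCTION exit qualification above (info 0x00a10040) can
be decoded with the standard VMX layout from the Intel SDM; a minimal sketch,
where only the sample value is taken from the trace:

#include <stdint.h>
#include <stdio.h>

/* Decode a VMX I/O-instruction exit qualification (Intel SDM vol. 3,
 * "Exit Qualification for I/O Instructions"). */
static void decode_io_qual(uint64_t q)
{
    unsigned size   = (unsigned)(q & 0x7) + 1;   /* access size in bytes */
    int      in     = (q >> 3) & 1;              /* 1 = IN, 0 = OUT */
    int      string = (q >> 4) & 1;              /* INS/OUTS */
    unsigned port   = (q >> 16) & 0xffff;        /* port number */

    printf("%s%s port 0x%x, %u byte(s)\n",
           in ? "in" : "out", string ? "s" : "", port, size);
}

int main(void)
{
    /* From the trace above: decodes to "out port 0xa1, 1 byte(s)",
     * i.e. a write to the slave 8259 PIC's mask register. */
    decode_io_qual(0x00a10040);
    return 0;
}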

Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-17 Thread Samuel Thibault
Jan Kiszka, on Mon 17 Nov 2014 07:28:23 +0100, wrote:
>> AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR.  I
>> however don't see any of them, neither in L0's /proc/interrupts, nor in
>> L1's /proc/interrupts...
>
> I suppose this is a SMP host and guest?

L0 is a hyperthreaded quad-core, but L1 has only 1 VCPU.  In the trace,
L1 apparently always happened to be scheduled on the same L0 CPU:
trace-cmd tells me that CPUs 0-2 and 4-7 are empty.

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-17 Thread Samuel Thibault
Gleb Natapov, on Mon 17 Nov 2014 10:58:45 +0200, wrote:
> Do you know how gnumach timekeeping works? Does it have a timer that
> fires every 1ms?
> Which clock device is it using?

It uses the PIT every 10ms, in square mode
(PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
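For concreteness, this is roughly what such PIT programming looks like (a
sketch of the standard 8254 sequence, not gnumach's actual code; the outb
helper is hypothetical): command byte 0x36 selects channel 0, lobyte/hibyte
access and mode 3 (square wave), and a divisor of 1193182 / 100 ~ 11932 gives
the 10ms tick.

#include <stdint.h>

/* Hypothetical port I/O helper. */
static inline void outb(uint16_t port, uint8_t val)
{
    __asm__ __volatile__("outb %0, %1" : : "a"(val), "Nd"(port));
}

#define PIT_CMD  0x43        /* 8254 mode/command register */
#define PIT_CH0  0x40        /* channel 0 data port */
#define PIT_HZ   1193182     /* PIT input clock, ~1.193182 MHz */

/* Program channel 0 for a periodic interrupt at the given rate.
 * 0x36 = channel 0 | lobyte/hibyte access | mode 3 (square wave),
 * i.e. the PIT_C0|PIT_SQUAREMODE|PIT_READMODE value mentioned above. */
static void pit_set_rate(unsigned hz)
{
    uint16_t divisor = PIT_HZ / hz;   /* 100 Hz -> 11932 -> 10ms ticks */

    outb(PIT_CMD, 0x36);
    outb(PIT_CH0, divisor & 0xff);    /* low byte first */
    outb(PIT_CH0, divisor >> 8);      /* then high byte */
}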

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-17 Thread Samuel Thibault
Jan Kiszka, on Mon 17 Nov 2014 10:04:37 +0100, wrote:
> On 2014-11-17 10:03, Samuel Thibault wrote:
>> Gleb Natapov, on Mon 17 Nov 2014 10:58:45 +0200, wrote:
>>> Do you know how gnumach timekeeping works? Does it have a timer that
>>> fires every 1ms?
>>> Which clock device is it using?
>>
>> It uses the PIT every 10ms, in square mode
>> (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
>
> Wow... how retro. That feature might be unsupported - does user space
> irqchip work better?

I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm,
with the same bad performance and external_interrupt in the trace.

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-17 Thread Samuel Thibault
Gleb Natapov, on Mon 17 Nov 2014 11:21:22 +0200, wrote:
> On Mon, Nov 17, 2014 at 10:10:25AM +0100, Samuel Thibault wrote:
>> Jan Kiszka, on Mon 17 Nov 2014 10:04:37 +0100, wrote:
>>> On 2014-11-17 10:03, Samuel Thibault wrote:
>>>> Gleb Natapov, on Mon 17 Nov 2014 10:58:45 +0200, wrote:
>>>>> Do you know how gnumach timekeeping works? Does it have a timer that
>>>>> fires every 1ms?
>>>>> Which clock device is it using?
>>>>
>>>> It uses the PIT every 10ms, in square mode
>>>> (PIT_C0|PIT_SQUAREMODE|PIT_READMODE = 0x36).
>>>
>>> Wow... how retro. That feature might be unsupported - does user space
>>> irqchip work better?
>>
>> I had indeed tried giving -machine kernel_irqchip=off to the L2 kvm,
>> with the same bad performance and external_interrupt in the trace.
>
> They will always be in the trace, but do you see them every ms or every
> 10ms with user space irqchip?

The external interrupts come every 1 *microsecond*, not every
millisecond, with irqchip=off or not.

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-17 Thread Samuel Thibault
Also, I have made gnumach display a timer counter: it does get PIT
interrupts every 10ms as expected, not more often.

Samuel


Re: nested KVM slower than QEMU with gnumach guest kernel

2014-11-16 Thread Samuel Thibault
Hello,

Jan Kiszka, on Wed 12 Nov 2014 00:42:52 +0100, wrote:
> On 2014-11-11 19:55, Samuel Thibault wrote:
>> jenkins.debian.net is running inside a KVM VM, and it runs nested
>> KVM guests for its installation attempts.  This goes fine with Linux
>> kernels, but it is extremely slow with gnumach kernels.
>
> You can try to catch a trace (ftrace) on the physical host.
>
> I suspect the setup forces a lot of instruction emulation, either on L0
> or L1. And that is slower than QEMU if KVM does not optimize like QEMU
> does.

Here is a sample of trace-cmd output dump: the same kind of pattern
repeats over and over, with EXTERNAL_INTERRUPT happening mostly
every other microsecond:

 qemu-system-x86-9752  [003]  4106.187755: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
 qemu-system-x86-9752  [003]  4106.187756: kvm_entry: vcpu 0
 qemu-system-x86-9752  [003]  4106.187757: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
 qemu-system-x86-9752  [003]  4106.187758: kvm_entry: vcpu 0
 qemu-system-x86-9752  [003]  4106.187759: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xa02848b1 info 0 80f6
 qemu-system-x86-9752  [003]  4106.187760: kvm_entry: vcpu 0

The various functions being interrupted are vmx_vcpu_run
(0xa02848b1 and 0xa0284972), handle_io
(0xa027ee62), vmx_get_cpl (0xa027a7de),
load_vmc12_host_state (0xa027ea31), native_read_tscp
(0x81050a84), native_write_msr_safe (0x81050aa6),
vmx_decache_cr0_guest_bits (0xa027a384),
vmx_handle_external_intr (0xa027a54d).

AIUI, the external interrupt is 0xf6, i.e. Linux' IRQ_WORK_VECTOR.  I
however don't see any of them, neither in L0's /proc/interrupts, nor in
L1's /proc/interrupts...
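For reference, the vector can be read straight out of the low byte of the exit
interruption-information field (the second info value on the kvm_exit lines;
the archive appears to have shortened it).  A minimal sketch of the decode,
with IRQ_WORK_VECTOR hard-coded to its x86 Linux value:

#include <stdint.h>
#include <stdio.h>

#define IRQ_WORK_VECTOR 0xf6   /* Linux x86 irq_work IPI vector */

int main(void)
{
    uint32_t intr_info = 0x80f6;            /* value as shown in the trace */
    uint8_t  vector    = intr_info & 0xff;  /* SDM: bits 7:0 hold the vector */

    printf("vector 0x%02x%s\n", vector,
           vector == IRQ_WORK_VECTOR ? " (IRQ_WORK_VECTOR)" : "");
    return 0;
}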

Samuel


[Attachment: trace.bz2 (binary data)]


nested KVM slower than QEMU with gnumach guest kernel

2014-11-11 Thread Samuel Thibault
Hello,

jenkins.debian.net is running inside a KVM VM, and it runs nested
KVM guests for its installation attempts.  This goes fine with Linux
kernels, but it is extremely slow with gnumach kernels.  I have
reproduced the issue on my laptop with a Linux 3.17 host kernel, a
3.16 L1-guest kernel, and an i7-2720QM CPU, with similar results; it's
actually even slower than letting qemu emulate the CPU...  For these
tests I'm using the following image:

http://people.debian.org/~sthibault/tmp/netinst.iso

The reference test here boils down to running qemu -cdrom netinst.iso -m
512, choosing the "Automated install" choice, and waiting for the
"Loading additional components" step to complete.  (Yes, the boot menu
gets mangled ATM; there's apparently currently a bug between qemu and grub.)

My host is A, my level1-KVM-guest is B.

KVM:
A$ qemu -enable-kvm -cdrom netinst.iso -m 512M
takes ~1 minute.

QEMU:
A$ qemu -cdrom netinst.iso -m 512M
takes ~7 minutes.

KVM-in-KVM:
B$ qemu -enable-kvm -cdrom netinst.iso -m 512M
takes ~10 minutes, when it doesn't get completely stuck, which is
actually quite often...

QEMU-in-KVM:
B$ qemu -cdrom netinst.iso -m 512M
takes ~7 minutes.

I don't see such a horrible slowdown with a Linux image.  Is there
something particular that could explain such a difference?  What tools
or counters could I use to investigate which area of KVM is getting
slow?

Samuel


Re: [PATCH 0/4] Really lazy fpu

2010-06-16 Thread Samuel Thibault
Ingo Molnar, on Wed 16 Jun 2010 10:39:41 +0200, wrote:
> in the long run most processes will be using the FPU due to SIMD
> instructions.

I believe glibc already uses SIMD instructions for e.g. memcpy and
friends, i.e. basically all applications...

Samuel


Re: [Qemu-devel] [RFC] allow multi-core guests: introduce cores= option to -cpu

2009-07-03 Thread Samuel Thibault
Andre Przywara, on Fri 03 Jul 2009 16:41:56 +0200, wrote:
> -smp 16 -cpu host,cores=8

That means 8 cores with 2 threads each, thus 16 threads? OK, that could
later be generalized into, for instance,

-smp 16 -cpu host,nodes=2,sockets=2,cores=2

to define 2 NUMA nodes of 2 sockets of 2 cores, each core thus having 2
threads.

Samuel


Re: [Qemu-devel] [RFC] allow multi-core guests: introduce cores= option to -cpu

2009-07-03 Thread Samuel Thibault
Andre Przywara, on Sat 04 Jul 2009 01:28:43 +0200, wrote:
> Maybe one could describe cores, threads, sockets and nodes in -smp and
> declare the memory topology only in -numa.

Mmm, I'd rather just describe both in a -topology option.

Samuel


Re: [Qemu-devel] Re: Question about KVM and PC speaker

2009-05-04 Thread Samuel Thibault
Jan Kiszka, on Mon 04 May 2009 22:29:39 +0200, wrote:
>> When I boot the VM from the Lenny CD, there is no audible signal tone.
>
> Hmm, I successfully tested with '-soundhw pcspk' + my patches or
> -no-kvm-pit. There is probably a different, unrelated issue with your
> setup.

Remember that the BIOS support for beeps is probably still missing.
Simon, you should also test beeps from an installed Linux guest.

Samuel


Re: [libvirt] RE: [Qemu-devel] [ANNOUNCE] virt-mem tools version 0.2.8 released

2008-08-07 Thread Samuel Thibault
Alexey Eremenko, on Thu 07 Aug 2008 15:55:49 +0300, wrote:
> The only problem: virt-mem doesn't compile.
>
> checking for ocamldoc... ocamldoc
> checking for ocamlfind... ./configure: line 5121: WARNING:: command not found
> no
> configure: error: OCaml findlib is required
>
> And I have installed OCaml.

OCaml findlib is not part of the main Caml distribution; look for a
package named something like ocaml-findlib.

Samuel


Re: [Qemu-devel] [RFC][PATCH] Add HPET emulation to qemu (v2)

2008-08-02 Thread Samuel Thibault
Beth Kon, on Sat 02 Aug 2008 06:05:14 -0500, wrote:
> I was trying to reproduce the wakeup every 10ms that
> Samuel Thibault mentioned, thinking the HPET would improve it.
> But for an idle guest in both cases (with and without HPET), the
> number of wakeups per second was relatively low (28).

I was referring to vl.c's "timeout = 10;", which makes the select() call
use a timeout of 10ms.  That said, "/* If all cpus are halted then wait
until the next IRQ */", so maybe that's why you get fewer wakeups per
second.  I'm still surprised, because the call to qemu_mod_timer in
pit_irq_timer_update should set up at least a 100Hz timer with Linux
guests (when they don't have HPET available).
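To illustrate the point, here is a minimal sketch of such a 10ms-timeout poll
loop (not QEMU's actual main loop): even with nothing to do, a loop like this
wakes the process up about 100 times per second, which is exactly the idle
power leak being discussed.

#include <stddef.h>
#include <sys/select.h>

/* Simplified event loop: select() with a fixed 10ms timeout, so the
 * process wakes up ~100 times per second even when fully idle. */
static void main_loop_sketch(void)
{
    for (;;) {
        struct timeval tv;
        tv.tv_sec  = 0;
        tv.tv_usec = 10 * 1000;          /* 10ms, like vl.c's timeout = 10 */

        /* No fds are watched here; a real loop would pass its fd sets. */
        select(0, NULL, NULL, NULL, &tv);

        /* ... run timers, poll devices, etc. ... */
    }
}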

Samuel


Re: [Qemu-devel] [RFC][PATCH] Add HPET emulation to qemu (v2)

2008-08-02 Thread Samuel Thibault
Anthony Liguori, on Sat 02 Aug 2008 09:46:30 -0500, wrote:
> Samuel Thibault wrote:
>> Beth Kon, on Sat 02 Aug 2008 06:05:14 -0500, wrote:
>>> I was trying to reproduce the wakeup every 10ms that
>>> Samuel Thibault mentioned, thinking the HPET would improve it.
>>> But for an idle guest in both cases (with and without HPET), the
>>> number of wakeups per second was relatively low (28).
>>
>> I was referring to vl.c's "timeout = 10;", which makes the select() call
>> use a timeout of 10ms.  That said, "/* If all cpus are halted then wait
>> until the next IRQ */", so maybe that's why you get fewer wakeups per
>> second.  I'm still surprised, because the call to qemu_mod_timer in
>> pit_irq_timer_update should set up at least a 100Hz timer with Linux
>> guests (when they don't have HPET available).
>
> The patch disables that when hpet is active.

That's what I would expect, indeed, but he is reporting that _without_
HPET he already gets few wakeups per second.

Samuel


Re: [Qemu-devel] [RFC][PATCH] Add HPET emulation to qemu

2008-07-10 Thread Samuel Thibault
Cool!
Does qemu now manage to no longer wake up every 10ms?  If not, please
try to make sure it does, as that would finally fix that power leak :)

Samuel