Andrew Theurer wrote:
I wanted to share some performance data for KVM and Xen.  I thought it
would be interesting to compare the two using a more complex scenario,
like heterogeneous server consolidation.

The Workload:
The workload simulates a consolidation of servers onto a
single host.  There are 3 server types: web, imap, and app (j2ee).  In
addition, there are other "helper" servers which are also consolidated:
a db server, which helps out with the app server, and an nfs server,
which helps out with the web server (a portion of the docroot is nfs
mounted).  There is also one other server that is simply idle.  All 6
servers make up one set.  The first 3 server types are sent requests,
which in turn may send requests to the db and nfs helper servers.  The
request rate is throttled to produce a fixed amount of work.  In order
to increase utilization on the host, more sets of these servers are
used.  The clients that send the requests also have a response time
requirement, which is monitored.  The results below all passed the
response time requirements.


What's the typical I/O load (disk and network bandwidth) while the tests are running?

The host hardware:
A 2 socket, 8 core Nehalem with SMT and EPT enabled, lots of disks, 4 x
1 Gb Ethernet

CPU time measurements with SMT can vary wildly if the system is not fully loaded. If the scheduler happens to put two busy threads on the same core, each of them gets less work done than it would if they were scheduled on different cores.
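For what it's worth, sysfs shows which logical cpus are siblings, so it's easy to check whether two busy vcpu threads ended up on the same core. A quick, untested sketch (just walks the topology files, nothing kvm-specific):

/* print which logical cpus share a physical core (SMT siblings) */
#include <stdio.h>

int main(void)
{
    char path[128], buf[64];
    int cpu;

    for (cpu = 0; cpu < 1024; cpu++) {
        FILE *f;

        snprintf(path, sizeof(path),
                 "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
                 cpu);
        f = fopen(path, "r");
        if (!f)
            break;          /* no more cpus */
        if (fgets(buf, sizeof(buf), f))
            printf("cpu%d siblings: %s", cpu, buf);  /* buf already ends in '\n' */
        fclose(f);
    }
    return 0;
}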


Test Results:
The throughput is equal in these tests, as the clients throttle the work
(this is assuming you don't run out of a resource on the host).  What's
telling is the CPU used to do the same amount of work:

Xen:  52.85%
KVM:  66.93%

So, KVM requires 66.93/52.85 = 26.6% more CPU to do the same amount of
work. Here's the breakdown:

total    user    nice  system     irq softirq   guest
66.90    7.20    0.00   12.94    0.35    3.39   43.02

Comparing guest time to all other busy time (66.90 - 43.02 = 23.88), that's
a 23.88/43.02 = 55% overhead for virtualization.  I certainly don't expect
it to be 0, but
55% seems a bit high.  So, what's the reason for this overhead?  At the
bottom is oprofile output of top functions for KVM.  Some observations:

1) I'm seeing about 2.3% in scheduler functions [that I recognize].
Does that seem a bit excessive?

Yes, it is. If there is a lot of I/O, this might be due to the thread pool used for I/O.

2) cpu_physical_memory_rw due to not using preadv/pwritev?

I think both virtio-net and virtio-blk use memcpy().
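The cost difference is basically an extra copy. A hand-written sketch of the two paths (not the actual qemu code; guest_pages here is just a stand-in for the guest's scatter/gather list): with a bounce buffer you pread() into a temporary buffer and memcpy() into the guest pages, while preadv() lets the kernel fill the scattered guest pages directly:

#define _GNU_SOURCE
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* bounce buffer: one big pread() plus one memcpy() per guest page */
ssize_t read_bounce(int fd, off_t off, void **guest_pages, int nr_pages, size_t pg)
{
    char *bounce = malloc((size_t)nr_pages * pg);
    ssize_t n = pread(fd, bounce, (size_t)nr_pages * pg, off);

    /* sketch assumes the full read succeeded */
    for (int i = 0; n > 0 && i < nr_pages; i++)
        memcpy(guest_pages[i], bounce + i * pg, pg);
    free(bounce);
    return n;
}

/* vectored: the kernel writes into the guest pages directly, no extra copy */
ssize_t read_vectored(int fd, off_t off, void **guest_pages, int nr_pages, size_t pg)
{
    struct iovec iov[nr_pages];

    for (int i = 0; i < nr_pages; i++) {
        iov[i].iov_base = guest_pages[i];
        iov[i].iov_len = pg;
    }
    return preadv(fd, iov, nr_pages, off);
}

The write side is symmetric with pwritev().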

3) vmx_[save|load]_host_state: I take it this is from guest switches?

These are called when you context-switch from a guest, and, much more frequently, when you enter qemu.

We have 180,000 context switches a second.  Is this more than expected?


Way more.  Across 16 logical cpus, this is >10,000 cs/sec/cpu.
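If you want to double-check the rate independently of whatever tool reported it, the system-wide counter is the "ctxt" line in /proc/stat. A quick sketch that samples it over 10 seconds:

/* sample the "ctxt" counter in /proc/stat twice to get context switches/sec */
#include <stdio.h>
#include <unistd.h>

static unsigned long long read_ctxt(void)
{
    char line[256];
    unsigned long long v = 0;
    FILE *f = fopen("/proc/stat", "r");

    while (f && fgets(line, sizeof(line), f))
        if (sscanf(line, "ctxt %llu", &v) == 1)
            break;
    if (f)
        fclose(f);
    return v;
}

int main(void)
{
    unsigned long long before = read_ctxt();

    sleep(10);
    printf("%llu cs/sec\n", (read_ctxt() - before) / 10);
    return 0;
}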

I wonder if schedstats can show why we context switch (need to let
someone else run, yielded, waiting on io, etc).


Yes, there is a scheduler tracer, though I have no idea how to operate it.
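Short of the tracer, /proc/<pid>/status already splits the count into voluntary_ctxt_switches (the task blocked or slept) and nonvoluntary_ctxt_switches (it was preempted), which should at least tell you whether the vcpu threads are going to sleep or being kicked off the cpu. A quick sketch that just prints those two lines for a given pid:

/* print voluntary vs. involuntary context switch counts for a pid */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    char path[64], line[256];
    FILE *f;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    snprintf(path, sizeof(path), "/proc/%s/status", argv[1]);
    f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }
    while (fgets(line, sizeof(line), f))
        if (strstr(line, "ctxt_switches"))   /* matches both counters */
            fputs(line, stdout);
    fclose(f);
    return 0;
}

Run it against each qemu thread (or use /proc/<pid>/task/<tid>/status for per-thread numbers).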

Do you have kvm_stat logs?
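kvm_stat itself just samples the counters under /sys/kernel/debug/kvm (exits, io_exits, mmio_exits, halt_exits, and so on). If the script isn't handy, a sketch like this (assumes debugfs is mounted at /sys/kernel/debug) dumps the same numbers; run it twice over an interval and diff:

/* dump the kvm counters that kvm_stat reads from debugfs */
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    const char *base = "/sys/kernel/debug/kvm";
    DIR *d = opendir(base);
    struct dirent *e;
    char path[512], buf[64];

    if (!d) {
        perror(base);
        return 1;
    }
    while ((e = readdir(d)) != NULL) {
        FILE *f;

        if (e->d_name[0] == '.')
            continue;
        snprintf(path, sizeof(path), "%s/%s", base, e->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;
        if (fgets(buf, sizeof(buf), f))
            printf("%-20s %s", e->d_name, buf);
        fclose(f);
    }
    closedir(d);
    return 0;
}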

--
error compiling committee.c: too many arguments to function
