Hi all
we are evaluating the purchase of infiniband hardware together with
Nehalem-based servers, for KVM use.
What performance can we roughly expect with virtio-net over infiniband
hardware?
Communication would be primarily for these 2 purposes:
1- VM to a HN: primarily NFSv4 imports from
Shirley Ma wrote:
> This patch is generated against 2.6 git tree. I didn't break up this
> patch since it has one functionality. Please review it.
>
> Thanks
> Shirley
>
> Signed-off-by: Shirley Ma
> --
>
> +void virtio_free_pages(void *buf)
> +{
> + struct page *page = (struct page
This patch is generated against 2.6 git tree. I didn't break up this
patch since it has one functionality. Please review it.
Thanks
Shirley
Signed-off-by: Shirley Ma
--
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b9e002f..6fb788b 100644
--- a/drivers/net/virtio_ne
Guest virtio_net receives packets from its pre-allocated vring
buffers, then it delivers these packets to upper layer protocols
as skb buffs. So it's not necessary to pre-allocate an skb for each
mergeable buffer and then free it when it's no longer needed.
This patch has deferred skb allocation when receiving
On 11/20/09 09:59, Alexander Graf wrote:
>
> On 20.11.2009, at 02:54, Jeremy Fitzhardinge wrote:
>
>> On 11/20/09 07:58, Alexander Graf wrote:
>>>
>>> On 19.11.2009 at 23:55, Jeremy Fitzhardinge wrote:
>>>
On 11/18/09 20:56, Alexander Graf wrote:
> Currently we use pv-ops to tell linux n
On 20.11.2009, at 02:54, Jeremy Fitzhardinge wrote:
On 11/20/09 07:58, Alexander Graf wrote:
On 19.11.2009 at 23:55, Jeremy Fitzhardinge wrote:
On 11/18/09 20:56, Alexander Graf wrote:
Currently we use pv-ops to tell linux not to do anything on
io_delay.
While the basic idea is good I
On 11/20/09 07:58, Alexander Graf wrote:
>
> On 19.11.2009 at 23:55, Jeremy Fitzhardinge wrote:
>
>> On 11/18/09 20:56, Alexander Graf wrote:
>>> Currently we use pv-ops to tell linux not to do anything on io_delay.
>>>
>>> While the basic idea is good IMHO, I don't see why we would need pv-ops
>
On 19.11.2009 at 23:55, Jeremy Fitzhardinge wrote:
On 11/18/09 20:56, Alexander Graf wrote:
Currently we use pv-ops to tell linux not to do anything on io_delay.
While the basic idea is good IMHO, I don't see why we would need pv-ops
for that. The io delay function already has a switch th
On 11/18/09 20:56, Alexander Graf wrote:
> Currently we use pv-ops to tell linux not to do anything on io_delay.
>
> While the basic idea is good IMHO, I don't see why we would need pv-ops
> for that. The io delay function already has a switch that can do nothing
> if you're so inclined.
>
> So her
Hi!
First time here. I am an old user of vmware-server who recently moved to
KVM. I am excited about KVM and I don't want to go back to vmware
server for no reason. However as a developer I sometimes need to work
with windows desktops and vmware workstation does a good job on this.
I know kernel VT ca
The Buildbot has detected a new failure of default_i386_out_of_tree on qemu-kvm.
Full details are available at:
http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_out_of_tree/builds/99
Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/
Buildslave for this Build: b1_qemu_kvm_2
Buil
The Buildbot has detected a new failure of default_x86_64_out_of_tree on
qemu-kvm.
Full details are available at:
http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_out_of_tree/builds/101
Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/
Buildslave for this Build: b1_qemu_kvm_1
The Buildbot has detected a new failure of default_i386_debian_5_0 on qemu-kvm.
Full details are available at:
http://buildbot.b1-systems.de/qemu-kvm/builders/default_i386_debian_5_0/builds/162
Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/
Buildslave for this Build: b1_qemu_kvm_2
Build
The Buildbot has detected a new failure of default_x86_64_debian_5_0 on
qemu-kvm.
Full details are available at:
http://buildbot.b1-systems.de/qemu-kvm/builders/default_x86_64_debian_5_0/builds/160
Buildbot URL: http://buildbot.b1-systems.de/qemu-kvm/
Buildslave for this Build: b1_qemu_kvm_1
> It's actually less readable. I know 11 is between 10 and 13, but is
> NP_VECTOR between TS_VECTOR and GP_VECTOR?
>
> This is better as a switch, or even:
>
> u8 exception_class[] = {
>[PF_VECTOR] EXPT_PF,
>
> etc.
OK what about this then:
From: Eddie Dong
Move Double-Fault generation lo
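The table-driven lookup suggested in the quoted review can be sketched roughly as follows. This is only an illustration, not the code from the patch: EXPT_PF appears in the quote, while EXPT_BENIGN, EXPT_CONTRIBUTORY, exception_class_of() and is_double_fault() are hypothetical names. The classification assumes the usual x86 rule that a double fault results from a contributory exception following a contributory one, or from a contributory exception or page fault following a page fault.

```c
#include <stdint.h>

/* x86 exception vector numbers (architectural values). */
enum {
    DE_VECTOR = 0,
    TS_VECTOR = 10,
    NP_VECTOR = 11,
    SS_VECTOR = 12,
    GP_VECTOR = 13,
    PF_VECTOR = 14,
};

/* Hypothetical class names; only EXPT_PF is taken from the thread. */
enum { EXPT_BENIGN = 0, EXPT_CONTRIBUTORY = 1, EXPT_PF = 2 };

/* Vectors not listed here default to 0, i.e. EXPT_BENIGN. */
static const uint8_t exception_class[32] = {
    [DE_VECTOR] = EXPT_CONTRIBUTORY,
    [TS_VECTOR] = EXPT_CONTRIBUTORY,
    [NP_VECTOR] = EXPT_CONTRIBUTORY,
    [SS_VECTOR] = EXPT_CONTRIBUTORY,
    [GP_VECTOR] = EXPT_CONTRIBUTORY,
    [PF_VECTOR] = EXPT_PF,
};

static int exception_class_of(unsigned int vector)
{
    /* Anything outside the exception range is treated as benign. */
    return vector < 32 ? exception_class[vector] : EXPT_BENIGN;
}

/* Returns 1 if delivering `second` while handling `first` yields #DF. */
static int is_double_fault(unsigned int first, unsigned int second)
{
    int c1 = exception_class_of(first);
    int c2 = exception_class_of(second);

    if (c1 == EXPT_CONTRIBUTORY && c2 == EXPT_CONTRIBUTORY)
        return 1;
    if (c1 == EXPT_PF && c2 != EXPT_BENIGN)
        return 1;
    return 0;
}
```

With a table the reader does not need to know whether NP_VECTOR lies numerically between TS_VECTOR and GP_VECTOR; each vector's class is stated explicitly.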
Avi Kivity wrote:
> On 11/17/2009 03:21 PM, Avi Kivity wrote:
>> qemu-kvm's switch to seabios uncovered a regression with cdrom
>> handling. Vista x64 no longer recognizes the cdrom, while pc-bios
>> still works. Installing works, but that uses int 13, not the native
>> driver. Haven't investiga
On 11/17/2009 03:21 PM, Avi Kivity wrote:
qemu-kvm's switch to seabios uncovered a regression with cdrom
handling. Vista x64 no longer recognizes the cdrom, while pc-bios
still works. Installing works, but that uses int 13, not the native
driver. Haven't investigated further yet.
Command l
Jeremy Fitzhardinge wrote:
> On 11/18/09 08:13, Alexander Graf wrote:
>
>> Currently when using paravirt ops it's an all-or-nothing option. We can
>> either
>> use pv-ops for CPU, MMU, timing, etc. or not at all.
>>
>> Now there are some use cases where we don't need the full feature set, but
On 11/18/09 08:13, Alexander Graf wrote:
> Currently when using paravirt ops it's an all-or-nothing option. We can either
> use pv-ops for CPU, MMU, timing, etc. or not at all.
>
> Now there are some use cases where we don't need the full feature set, but
> only
> a small chunk of it. KVM is a pre
Kevin Wolf wrote:
> Hi Jan,
>
> On 19.11.2009 13:19, Jan Kiszka wrote:
>> (gdb) print ((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first
>> $5 = (struct QCowL2Meta *) 0xcb3568
>> (gdb) print *((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first
>> $6 = {offset = 7417176064, n_start = 0,
Hi Jan,
On 19.11.2009 13:19, Jan Kiszka wrote:
> (gdb) print ((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first
> $5 = (struct QCowL2Meta *) 0xcb3568
> (gdb) print *((BDRVQcowState *)bs->opaque)->cluster_allocs.lh_first
> $6 = {offset = 7417176064, n_start = 0, nb_available = 16, nb_cluste
On 11/19/2009 03:39 PM, Kevin O'Connor wrote:
On Thu, Nov 19, 2009 at 03:10:20PM +0200, Avi Kivity wrote:
Trying to debug the cdrom issue, I see
Compiling whole program out/ccode32.o
src/util.c: In function ‘__end_thread’:
src/util.c:183: internal compiler error: in simplify_subreg, at
s
From: Zachary Amsden
If cpufreq can't determine the CPU khz, or cpufreq is not compiled in,
we should fall back to the measured TSC khz.
Signed-off-by: Zachary Amsden
Signed-off-by: Marcelo Tosatti
---
arch/x86/kvm/x86.c | 16
1 files changed, 12 insertions(+), 4 deletions(-
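The fallback described in this commit message amounts to something like the sketch below. It is not the patch itself: effective_cpu_khz() is a hypothetical helper, and it assumes that a cpufreq result of 0 stands for "frequency unknown or cpufreq not available".

```c
/*
 * Hypothetical helper: prefer the frequency reported by cpufreq and
 * fall back to the measured TSC frequency when cpufreq has no answer.
 * A value of 0 is assumed to mean "unknown".
 */
static unsigned long effective_cpu_khz(unsigned long cpufreq_khz,
                                       unsigned long measured_tsc_khz)
{
    return cpufreq_khz ? cpufreq_khz : measured_tsc_khz;
}
```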
From: Mark Langsdorf
New AMD processors (Family 0x10 models 8+) support the Pause
Filter Feature. This feature creates a new field in the VMCB
called Pause Filter Count. If Pause Filter Count is greater
than 0 and intercepting PAUSEs is enabled, the processor will
increment an internal counter
From: Zhai, Edwin
Introduce kvm_vcpu_on_spin, to be used by VMX/SVM to yield processing
once the cpu detects pause-based looping.
Signed-off-by: "Zhai, Edwin"
Signed-off-by: Marcelo Tosatti
---
include/linux/kvm_host.h |1 +
virt/kvm/kvm_main.c | 15 +++
2 files changed
From: Jan Kiszka
This (broken) check dates back to the days when this code was shared
across architectures. x86 has IOMEM, so drop it.
Signed-off-by: Jan Kiszka
Signed-off-by: Marcelo Tosatti
---
arch/x86/kvm/x86.c |2 --
1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/arch
From: Jan Kiszka
Push the NMI-related singlestep variable into vcpu_svm. It's dealing
with an AMD-specific deficit, nothing generic for x86.
Acked-by: Gleb Natapov
Signed-off-by: Jan Kiszka
arch/x86/include/asm/kvm_host.h |1 -
arch/x86/kvm/svm.c | 12 +++-
2 files
From: Zhai, Edwin
New NHM processors will support Pause-Loop Exiting by adding 2 VM-execution
control fields:
PLE_Gap    - upper bound on the amount of time between two successive
executions of PAUSE in a loop.
PLE_Window - upper bound on the amount of time a guest is allowed to exec
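How the two fields interact can be modelled in a few lines. This is a simplified software model for illustration only, not hardware or KVM code: struct ple_state and ple_on_pause() are hypothetical names, time is an abstract tick count, and the exact comparison rules are an assumption based on the description above.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one virtual CPU. */
struct ple_state {
    uint64_t first_pause;  /* time of the first PAUSE in the current loop */
    uint64_t last_pause;   /* time of the most recent PAUSE */
    int      seen;         /* non-zero once any PAUSE has been observed */
};

/*
 * Called for every PAUSE executed at time `now`.
 * Returns 1 if this PAUSE would cause a pause-loop exit, 0 otherwise.
 */
static int ple_on_pause(struct ple_state *s, uint64_t now,
                        uint64_t ple_gap, uint64_t ple_window)
{
    int fire = 0;

    if (!s->seen || now - s->last_pause > ple_gap) {
        /* Too long since the previous PAUSE: treat as a new loop. */
        s->first_pause = now;
        s->seen = 1;
    } else if (now - s->first_pause > ple_window) {
        /* Same loop, and it has been spinning longer than allowed. */
        fire = 1;
    }
    s->last_pause = now;
    return fire;
}
```

In this model a guest that executes PAUSE every 5 ticks with ple_gap = 10 and ple_window = 100 triggers an exit at tick 105, while a guest pausing only every 50 ticks never does, because each PAUSE starts a new loop.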
From: Marcelo Tosatti
There's no kvm_run argument anymore.
Signed-off-by: Marcelo Tosatti
---
arch/x86/kvm/vmx.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a4580d6..364263a 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x
From: Ed Swierk
Support for Xen PV-on-HVM guests can be implemented almost entirely in
userspace, except for handling one annoying MSR that maps a Xen
hypercall blob into guest address space.
A generic mechanism to delegate MSR writes to userspace seems overkill
and risks encouraging similar MSR
From: Eduardo Habkost
svm_vcpu_reset() was not properly resetting the contents of the guest-visible
cr0 register, causing the following issue:
https://bugzilla.redhat.com/show_bug.cgi?id=525699
Without resetting cr0 properly, the vcpu was running the SIPI bootstrap routine
with paging enabled, m
From: Eduardo Habkost
This should have no effect, it is just to make the code clearer.
Signed-off-by: Eduardo Habkost
Signed-off-by: Avi Kivity
---
arch/x86/kvm/vmx.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 364263
From: Glauber Costa
When we migrate a kvm guest that uses pvclock between two hosts, we may
suffer a large skew. This is because there can be significant differences
between the monotonic clock of the hosts involved. When a new host with
a much larger monotonic time starts running the guest, the
From: Arnd Bergmann
With big endian userspace, we can't quite figure out if a pointer
is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.
This is what happens with dirty logging. To get the pointer interpreted
correctly, we thus need Arnd's patch to implement a compat layer for
th
From: Marcelo Tosatti
find_first_zero_bit works with bit numbers, not bytes.
Fixes
https://sourceforge.net/tracker/?func=detail&aid=2847560&group_id=180599&atid=893831
Reported-by: "Xu, Jiajun"
Cc: sta...@kernel.org
Signed-off-by: Marcelo Tosatti
---
virt/kvm/irq_comm.c |7 +++
1 fi
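The bits-versus-bytes mix-up named in this commit message is easy to reproduce in user space. The sketch below uses a hypothetical stand-in, demo_find_first_zero_bit(), rather than the kernel function; like the kernel helper, it takes its size argument in bits and returns that size when no zero bit is found.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in: scan the first `nbits` bits of `map`
 * and return the index of the first clear bit, or `nbits` if none is clear.
 */
static size_t demo_find_first_zero_bit(const uint32_t *map, size_t nbits)
{
    for (size_t i = 0; i < nbits; i++) {
        if (!(map[i / 32] & (UINT32_C(1) << (i % 32))))
            return i;
    }
    return nbits;
}

/* Convenience macro for the correct size argument of an array bitmap. */
#define DEMO_BITMAP_BITS(map) (sizeof(map) * 8)
```

For a bitmap `uint32_t map[2] = { 0x3ff, 0 }` (bits 0 to 9 set), passing `sizeof(map)` searches only the first 8 bits and reports "no free bit", whereas passing `DEMO_BITMAP_BITS(map)` finds the free bit at index 10.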
From: Jan Kiszka
Commit 705c5323 opened the doors of hell by unconditionally injecting
single-step flags as long as guest_debug signaled this. This doesn't
work when the guest branches into some interrupt or exception handler
and triggers a vmexit with flag reloading.
Fix it by saving cs:rip whe
From: Marcelo Tosatti
GUEST_CR3 is updated via kvm_set_cr3 whenever CR3 is modified from
outside guest context. Similarly pdptrs are updated via load_pdptrs.
Let kvm_set_cr3 perform the update, removing it from the vcpu_run
fast path.
Signed-off-by: Marcelo Tosatti
Acked-by: Sheng Ya
Instead of reloading syscall MSRs on every preemption, use the new shared
msr infrastructure to reload them at the last possible minute (just before
exit to userspace).
Improves vcpu/idle/vcpu switches by about 2000 cycles (when EFER needs to be
reloaded as well).
[jan: fix slot index missing ind
Currently MSR_KERNEL_GS_BASE is saved and restored as part of the
guest/host msr reloading. Since we wish to lazy-restore all the other
msrs, save and reload MSR_KERNEL_GS_BASE explicitly instead of using
the common code.
Signed-off-by: Avi Kivity
---
arch/x86/kvm/vmx.c | 39 +
From: Marcelo Tosatti
Otherwise kvm might attempt to dereference a NULL pointer.
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
---
virt/kvm/irq_comm.c |5 -
1 files changed, 4 insertions(+), 1 deletions(-)
diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 0d454d
From: Eduardo Habkost
The svm_set_cr0() call will initialize save->cr0 properly even when npt is
enabled, clearing the NW and CD bits as expected, so we don't need to
initialize it manually for npt_enabled anymore.
Signed-off-by: Eduardo Habkost
Signed-off-by: Avi Kivity
---
arch/x86/kvm/svm.
From: Jan Kiszka
Obviously, people tend to extend this header at the bottom - more or
less blindly. Ensure that deprecated stuff gets its own corner again by
moving things to the top. Also add some comments and reindent IOCTLs to
make them more readable and reduce the risk of number collisions.
On Thu, Nov 19, 2009 at 03:10:20PM +0200, Avi Kivity wrote:
> Trying to debug the cdrom issue, I see
>
> Compiling whole program out/ccode32.o
> src/util.c: In function ‘__end_thread’:
> src/util.c:183: internal compiler error: in simplify_subreg, at
> simplify-rtx.c:5055
>
> (with F12's gcc (G
The various syscall-related MSRs are fairly expensive to switch. Currently
we switch them on every vcpu preemption, which is far too often:
- if we're switching to a kernel thread (idle task, threaded interrupt,
kernel-mode virtio server (vhost-net), for example) and back, then
there's no nee
From: Marcelo Tosatti
Large page translations are always synchronized (either in level 3
or level 2), so it's not necessary to properly deal with them
in the invlpg handler.
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
---
arch/x86/kvm/paging_tmpl.h |1 -
1 files changed, 0 ins
From: Jan Kiszka
Decouple KVM_GUESTDBG_INJECT_DB and KVM_GUESTDBG_INJECT_BP from
KVM_GUESTDBG_ENABLE, they are actually orthogonal. While at it,
avoid triggering the WARN_ON in kvm_queue_exception if there is already
an exception pending and reject such invalid requests.
Signed-off-by: Jan K
This variable is used to communicate between a caller and a callee; switch
to a function argument instead.
Signed-off-by: Avi Kivity
---
arch/x86/kvm/vmx.c | 10 +++---
1 files changed, 3 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index a5f3f3e..c9c
From: Marcelo Tosatti
Otherwise kvm might attempt to dereference a NULL pointer.
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
---
arch/x86/kvm/x86.c |6 ++
1 files changed, 6 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 97f6f95.
From: Marcelo Tosatti
Otherwise kvm will leak memory on multiple KVM_CREATE_IRQCHIP.
Also serialize multiple accesses with kvm->lock.
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity
---
arch/x86/kvm/irq.h |6 +-
arch/x86/kvm/x86.c | 30 ++
2 file
These happen when we trap an exception when another exception is being
delivered; we only expect these with MCEs and page faults. If something
unexpected happens, things probably went south and we're better off reporting
an internal error and freezing.
Signed-off-by: Avi Kivity
---
arch/x86/kvm
From: Hollis Blanchard
The old BUILD_BUG_ON implementation didn't work with __builtin_constant_p().
Fixing that revealed this test had been inverted for a long time without
anybody noticing...
Signed-off-by: Hollis Blanchard
Signed-off-by: Avi Kivity
---
arch/powerpc/kvm/timing.h |2 +-
1
From: Gleb Natapov
Probably introduced by a bad merge.
Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity
---
arch/x86/kvm/x86.c |5 -
1 files changed, 0 insertions(+), 5 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 58c5cdd..dbddcc2 100644
--- a/arch/x86
Usually userspace will freeze the guest so we can inspect it, but some
internal state is not available. Add extra data to internal error
reporting so we can expose it to the debugger. Extra data is specific
to the suberror.
Signed-off-by: Avi Kivity
---
arch/x86/kvm/mmu.c |1 +
arch/x86/k
From: Jan Kiszka
This new IOCTL exports all yet user-invisible states related to
exceptions, interrupts, and NMIs. Together with appropriate user space
changes, this fixes sporadic problems of vmsave/restore, live migration
and system reset.
[avi: future-proof abi by adding a flags field]
Signe
From: Joerg Roedel
This patch adds a tracepoint for the event that the guest
executed the SKINIT instruction. This information is
important because SKINIT is an SVM extension not yet
implemented by nested SVM and we may need this information
for debugging hypervisors that do not yet run on neste
From: Joerg Roedel
This patch adds a special tracepoint for the event that a
nested #vmexit is injected because kvm wants to inject an
interrupt into the guest.
Signed-off-by: Joerg Roedel
Signed-off-by: Marcelo Tosatti
---
arch/x86/kvm/svm.c |2 +-
arch/x86/kvm/trace.h | 18 +
Highlights:
- improved kernel context switching speed
- better interoperation with other users of virtualization extensions
- improved irq scaling
- nested svm improvements and tracing
- improved cpufreq integration
- spin loop detection on newer hardware
Notes:
- kvm/ppc64 support will be merged
From: Joerg Roedel
With all important information now delivered through
tracepoints we can safely remove the nsvm_printk debugging
code for nested svm.
Signed-off-by: Joerg Roedel
Signed-off-by: Marcelo Tosatti
---
arch/x86/kvm/svm.c | 34 --
1 files changed
From: Joerg Roedel
This patch adds a tracepoint for the event that the guest
executed the INVLPGA instruction.
Signed-off-by: Joerg Roedel
Signed-off-by: Marcelo Tosatti
---
arch/x86/kvm/svm.c |3 +++
arch/x86/kvm/trace.h | 23 +++
arch/x86/kvm/x86.c |1 +
3
Trying to debug the cdrom issue, I see
Compiling whole program out/ccode32.o
src/util.c: In function ‘__end_thread’:
src/util.c:183: internal compiler error: in simplify_subreg, at
simplify-rtx.c:5055
(with F12's gcc (GCC) 4.4.2 20091027 (Red Hat 4.4.2-7))
The issue seems to be with the pos
Hi,
I just managed to push a qemu-kvm process (git rev. b496fe3431) into an
endless loop in qcow2_alloc_cluster_offset, namely over
QLIST_FOREACH(old_alloc, &s->cluster_allocs, next_in_flight):
(gdb) bt
#0 0x0048614b in qcow2_alloc_cluster_offset (bs=0xc4e1d0,
offset=7417184256, n_start
On 11/19/2009 05:24 AM, Kevin O'Connor wrote:
On Wed, Nov 18, 2009 at 12:19:20AM -0500, Kevin O'Connor wrote:
On Tue, Nov 17, 2009 at 03:21:31PM +0200, Avi Kivity wrote:
qemu-kvm's switch to seabios uncovered a regression with cdrom handling.
Vista x64 no longer recognizes the cdrom,