Re: Question on stopping KVM start at boot

2010-03-13 Thread satimis

Hi Dustin,


- snip -


Where shall I add -b option?  Thanks


modprobe -b says respect the blacklists.  See:
 * http://manpages.ubuntu.com/manpages/lucid/en/man8/modprobe.8.html

   -b --use-blacklist
          This option causes modprobe to apply the blacklist commands in
          the configuration files (if any) to module names as well. It is
          usually used by udev(7).

So you would change the lines that say "if modprobe ..." to
"if modprobe -b ...".
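
To illustrate the edit described above, here is a minimal sketch that applies it to the text of a script. The function name and the sample script fragment are hypothetical and are not taken from any real init script; only the -b / --use-blacklist option itself comes from the man page quoted above.

```python
import re

# Matches "if modprobe" at the start of a line, unless -b or
# --use-blacklist already follows it.
_IF_MODPROBE = re.compile(
    r"^([ \t]*if[ \t]+modprobe)(?=[ \t])(?![ \t]+(?:-b|--use-blacklist)\b)",
    re.MULTILINE,
)


def add_blacklist_flag(script_text: str) -> str:
    """Return script_text with 'if modprobe' rewritten to 'if modprobe -b'."""
    return _IF_MODPROBE.sub(r"\1 -b", script_text)


# Hypothetical script fragment, for illustration only.
example = (
    "if modprobe kvm_intel; then\n"
    "    echo loaded\n"
    "fi\n"
    "if modprobe -b kvm_amd; then\n"
    "    echo loaded\n"
    "fi\n"
)

if __name__ == "__main__":
    print(add_blacklist_flag(example))
```

Lines that already pass -b are left alone, so running the rewrite twice gives the same result as running it once.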



Your advice works for me.  Thanks


B.R.
Stephen




--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] x86/kvm: Show guest system/user cputime in cpustat

2010-03-13 Thread Avi Kivity

On 03/12/2010 10:53 AM, Qing He wrote:



When Qing (CCed) was working on nested VMX in the past, he found that PV
vmread/vmwrite indeed works well (it writes to the virtual vmcs, so vmwrite
can also benefit). However, compared to older machines (where one of our
internal patches shows an improvement of more than 5%), NHM gets less benefit
due to its reduced vmexit cost.

 

One of the hurdles to PVizing vmread/vmwrite is that the memory layout of
the physical vmcs remains unknown. Of course it can use the custom vmcs
layout used by nested virtualization, but that looks a little weird, since
different nested virtualization implementations may create different
custom layouts.
   


Note we must use a custom layout and cannot depend on the physical 
layout, due to live migration.  The layout becomes an ABI.



I once used another approach to partially accelerate vmread/vmwrite in the
nested virtualization case, which also gives a good performance gain (around
7% on pre-Nehalem; on top of this, PV vmread/vmwrite gained another 7%).
That is to make a shortcut to handle EXIT_REASON_VM{READ,WRITE} without
even turning on the IF.
   


Interesting.  That means our exit path is inefficient; it seems to imply 
half the time is spent outside the hardware vmexit path.


A quick profile (on non-Nehalem) shows many atomics and calls into the 
lapic, as well as update_cr8_intercept which is sometimes unnecessary; 
these could easily be optimized.


Definitely optimizing the non-paravirt path is preferred to adding more 
paravirtualization.


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: Shadow page table questions

2010-03-13 Thread Avi Kivity

On 03/11/2010 06:14 PM, Marek Olszewski wrote:
It doesn't, and there are often multiple shadow pages per guest page,
distinguished by their sp->role field.

Oh, great!  Does this mean that there is already a mechanism for
synchronizing all shadow pages shadowing the same guest page when that
guest page changes?


Yes, kvm_mmu_pte_write().

--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: how to tweak kernel to get the best out of kvm?

2010-03-13 Thread Avi Kivity

On 03/11/2010 03:24 PM, Harald Dunkel wrote:

Hi Avi,

I forgot to include some important syslog lines from the
host system. See attachment.

On 03/10/10 14:15, Avi Kivity wrote:
   

You have tons of iowait time, indicating an I/O bottleneck.

 

Is this disk IO or network IO?


disk.


The rsync session puts a high load on both, but I do not see how a
high load on disk or block I/O could make the virtual hosts
unresponsive, as shown by the host's syslog.

   


qcow2 is still not fully asynchronous, so sometimes when it waits, a 
vcpu waits as well.



Here the problem is likely the host filesystem and/or I/O scheduler.

The optimal layout is placing guest disks in LVM volumes, and accessing
them with -drive file=...,cache=none.  However, file-based access should
also work.

 

I will try LVM tomorrow, when the test with reiserfs is completed.

   


If the slowdown is indeed due to I/O, LVM (with cache=none) should
eliminate it completely.
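
As a rough sketch of the layout suggested above, the following builds the -drive argument for a guest disk that lives on an LVM volume. The volume path /dev/vg0/guest1 and the helper names are made up for illustration; only the file= and cache= suboptions come from this thread.

```python
def drive_option(path: str, cache: str = "none") -> str:
    """Build the value passed to -drive for one guest disk.

    QEMU treats "," as a suboption separator, so a literal comma in
    the path is doubled.
    """
    return "file=%s,cache=%s" % (path.replace(",", ",,"), cache)


def qemu_drive_args(disk_path: str) -> list:
    """Return the argv fragment for one disk, ready to append to a command line."""
    return ["-drive", drive_option(disk_path)]


if __name__ == "__main__":
    # /dev/vg0/guest1 is a hypothetical logical volume name.
    print(" ".join(qemu_drive_args("/dev/vg0/guest1")))
```

Keeping the arguments as a list rather than one string avoids shell quoting issues if the list is later handed to subprocess.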


--
Do not meddle in the internals of kernels, for they are subtle and quick to 
panic.



Re: raw disks no longer work in latest kvm (kvm-88 was fine)

2010-03-13 Thread Antoine Martin

On 03/08/2010 02:35 AM, Avi Kivity wrote:

On 03/07/2010 09:25 PM, Antoine Martin wrote:

On 03/08/2010 02:17 AM, Avi Kivity wrote:

On 03/07/2010 09:13 PM, Antoine Martin wrote:

What version of glibc do you have installed?


Latest stable:
sys-devel/gcc-4.3.4
sys-libs/glibc-2.10.1-r1



$ git show glibc-2.10~108 | head
commit e109c6124fe121618e42ba882e2a0af6e97b8efc
Author: Ulrich Drepper <drep...@redhat.com>
Date:   Fri Apr 3 19:57:16 2009 +

    * misc/Makefile (routines): Add preadv, preadv64, pwritev, pwritev64.
    * misc/Versions: Export preadv, preadv64, pwritev, pwritev64 for
    GLIBC_2.10.
    * misc/sys/uio.h: Declare preadv, preadv64, pwritev, pwritev64.
    * sysdeps/unix/sysv/linux/kernel-features.h: Add entries for preadv

You might get away with rebuilding glibc against the 2.6.33 headers.

The latest kernel headers available in gentoo (and they're masked 
unstable):

sys-kernel/linux-headers-2.6.32

So I think I will just keep using Christoph's patch until .33 hits 
portage.

Unless there's any reason not to? I would rather keep my system clean.
I can try it though, if that helps you clear things up?


preadv/pwritev was actually introduced in 2.6.30.  Perhaps you last
built glibc before that?  If so, a rebuild may be all that's necessary.



To be certain, I've rebuilt qemu-kvm against:
linux-headers-2.6.33 + glibc-2.10.1-r1 (both freshly built)
And still no go!
I'm still having to use the patch which disables preadv unconditionally...

Antoine


Re: [PATCH 0/5] Fix some mmu/emulator atomicity issues (v2)

2010-03-13 Thread Avi Kivity

On 03/10/2010 04:50 PM, Avi Kivity wrote:

Currently when we emulate a locked operation into a shadowed guest page
table, we perform a write rather than a true atomic.  This is indicated
by the "emulating exchange as write" message that shows up in dmesg.

In addition, the pte prefetch operation during invlpg suffered from a
race.  This was fixed by removing the operation.

This patchset fixes both issues and reinstates pte prefetch on invlpg.

v2:
- fix truncated description for patch 1
- add new patch 4, which fixes a bug in patch 5
   


No comments, but looks like last week's maintainer neglected to merge this.

--
error compiling committee.c: too many arguments to function



Re: Make QEmu HPET disabled by default for KVM?

2010-03-13 Thread Avi Kivity

On 03/11/2010 09:08 PM, Marcelo Tosatti wrote:





I have kept --no-hpet in my setup for
months...
   

Any details about the problems?  HPET is important to some guests.
 

As Gleb mentioned in the other thread, reinjection will introduce
another set of problems.

Ideally all these timer-related problems should be fixed by correlating
timer interrupts and time source reads.
   


This still needs reinjection (or slewing of the timer frequency).  
Correlation doesn't fix drift.
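
The drift argument can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the guest is assumed to keep time purely by counting periodic timer ticks, the 10 ms period is arbitrary, and the function and parameter names are invented for the example. It does not model how KVM or QEMU actually implement reinjection.

```python
def guest_drift_ms(deliverable, period_ms=10, reinject=False):
    """Return how far the guest clock lags real time, in milliseconds.

    deliverable: one bool per timer period; False means the tick could
    not be delivered to the guest in that period.
    reinject: if True, missed ticks are remembered and delivered together
    with the next tick that does get through.
    """
    backlog = 0   # ticks the host still owes the guest
    counted = 0   # ticks the guest has actually seen
    for ok in deliverable:
        if ok:
            counted += 1
            if reinject:
                counted += backlog
                backlog = 0
        elif reinject:
            backlog += 1
    real_ms = len(deliverable) * period_ms
    return real_ms - counted * period_ms


if __name__ == "__main__":
    pattern = [True, False, False, True, True, False, True, True]
    print("no reinjection:", guest_drift_ms(pattern), "ms behind")
    print("reinjection:   ", guest_drift_ms(pattern, reinject=True), "ms behind")
```

In this model every dropped tick is permanently lost time unless it is delivered later, which is why some form of reinjection (or an equivalent adjustment of the tick rate) is needed to keep the count correct.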



Since one already has to use special timer parameters (-rtc-td-hack,
-no-kvm-pit-reinjection), using -no-hpet for problematic Linux
guests seems fine?
   


Depends on how common the problematic ones are.  If they're common, 
better to have a generic fix.


--
error compiling committee.c: too many arguments to function



Re: Make QEmu HPET disabled by default for KVM?

2010-03-13 Thread Gleb Natapov

On Sun, Mar 14, 2010 at 09:05:50AM +0200, Avi Kivity wrote:
> On 03/11/2010 09:08 PM, Marcelo Tosatti wrote:
> >>> I have kept --no-hpet in my setup for
> >>> months...
> >> Any details about the problems?  HPET is important to some guests.
> > As Gleb mentioned in the other thread, reinjection will introduce
> > another set of problems.
> >
> > Ideally all these timer-related problems should be fixed by correlating
> > timer interrupts and time source reads.
>
> This still needs reinjection (or slewing of the timer frequency).
> Correlation doesn't fix drift.
>
But only when all time sources are synchronised and correlated with
interrupts can we slew the timer frequency without the guest noticing
(and only if the guest disables NTP).

> > Since one already has to use special timer parameters (-rtc-td-hack,
> > -no-kvm-pit-reinjection), using -no-hpet for problematic Linux
> > guests seems fine?
>
> Depends on how common the problematic ones are.  If they're common,
> better to have a generic fix.
>
> -- 
> error compiling committee.c: too many arguments to function

--
Gleb.


Re: raw disks no longer work in latest kvm (kvm-88 was fine)

2010-03-13 Thread Avi Kivity

On 03/13/2010 11:51 AM, Antoine Martin wrote:
preadv/pwritev was actually introduced in 2.6.30.  Perhaps you last
built glibc before that?  If so, a rebuild may be all that's necessary.




To be certain, I've rebuilt qemu-kvm against:
linux-headers-2.6.33 + glibc-2.10.1-r1 (both freshly built)
And still no go!
I'm still having to use the patch which disables preadv 
unconditionally...


What does strace show?  Is the kernel's preadv called?

Maybe you have a glibc that has broken emulated preadv and no kernel 
preadv support.
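
One way to narrow this down is a small standalone probe that performs a single preadv call on a temporary file and checks the data that comes back. The sketch below assumes Python 3.7 or later on Linux, where os.preadv is available and goes through the C library; the function name is made up. Running it under strace, as suggested above, would show whether the kernel's preadv is actually invoked.

```python
import os
import tempfile


def preadv_works() -> bool:
    """Read 8 bytes at offset 2 into two 4-byte buffers via preadv."""
    if not hasattr(os, "preadv"):
        # This Python build or platform does not expose preadv at all.
        return False
    with tempfile.TemporaryFile() as f:
        f.write(b"0123456789")
        f.flush()
        bufs = [bytearray(4), bytearray(4)]
        try:
            n = os.preadv(f.fileno(), bufs, 2)
        except OSError:
            return False
        return n == 8 and bytes(bufs[0]) == b"2345" and bytes(bufs[1]) == b"6789"


if __name__ == "__main__":
    print("preadv ok" if preadv_works() else "preadv missing or broken")
```

A wrong byte count or wrong buffer contents would point at a broken emulation in the C library rather than at qemu-kvm itself.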


--
error compiling committee.c: too many arguments to function



Re: [patch 1/3] target-i386: print EFER in cpu_dump_state

2010-03-13 Thread Avi Kivity

On 03/11/2010 08:53 PM, Marcelo Tosatti wrote:

On Thu, Mar 11, 2010 at 10:35:21AM +0200, Avi Kivity wrote:
   

On 03/09/2010 03:53 AM, Marcelo Tosatti wrote:
 

Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>

Index: qemu-kvm-uq/target-i386/helper.c
===================================================================
--- qemu-kvm-uq.orig/target-i386/helper.c
+++ qemu-kvm-uq/target-i386/helper.c
@@ -1176,6 +1176,7 @@ void cpu_dump_state(CPUState *env, FILE
     cpu_x86_dump_seg_cache(env, f, cpu_fprintf, "TR", &env->tr);
 
 #ifdef TARGET_X86_64
+    cpu_fprintf(f, "EFER=%016" PRIx64 "\n", env->efer);
     if (env->hflags & HF_LMA_MASK) {
         cpu_fprintf(f, "GDT= %016" PRIx64 " %08x\n",
                     env->gdt.base, env->gdt.limit);

   

Better to do this for i386 too, no?
 

On systems that support IA-32e mode, the extended feature enable
register (IA32_EFER) is available. This model-specific register controls
activation of IA-32e mode and other IA-32e mode operations.

Can it be useful for i386 too?
   


That's on Intel.  AMDs had EFER before 64-bit support (for syscall 
support, and nx), IIRC.
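
To make the point about EFER concrete, here is a small sketch that decodes the architecturally defined bits relevant to this discussion: SCE (bit 0, syscall enable), LME (bit 8), LMA (bit 10) and NXE (bit 11). The helper names and sample values are illustrative only; the output format mirrors the "EFER=%016" PRIx64 line from the patch.

```python
# Bit positions of the EFER flags discussed in this thread.
EFER_BITS = {"SCE": 0, "LME": 8, "LMA": 10, "NXE": 11}


def decode_efer(value: int) -> list:
    """Return the names of the known EFER flags set in value."""
    return [name for name, bit in EFER_BITS.items() if (value >> bit) & 1]


def format_efer(value: int) -> str:
    """Format EFER as 16 hex digits, like the dump line in the patch."""
    return "EFER=%016x" % value


if __name__ == "__main__":
    # Sample values: long mode active with syscall and nx, then a
    # 32-bit guest that uses only syscall and nx.
    for sample in (0xD01, 0x801):
        print(format_efer(sample), decode_efer(sample))
```

The second sample has SCE and NXE set without LME or LMA, which is the case where printing EFER is informative for a 32-bit guest as well.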


--
error compiling committee.c: too many arguments to function
