Re: [PATCH] powerpc/kvm/cma: Fix panic introduced by signed shift operation

2014-09-03 Thread Paolo Bonzini
Il 02/09/2014 18:13, Laurent Dufour ha scritto:
 fc95ca7284bc54953165cba76c3228bd2cdb9591 introduces a memset in
 kvmppc_alloc_hpt since the general CMA doesn't clear the memory it
 allocates.
 
 However, the size argument passed to memset is computed from a signed value
 and its sign bit is extended by the cast the compiler is doing. This leads
 to an extremely large size value when dealing with order values >= 31, and
 almost all the memory following the allocated space is cleared. As a
 consequence, the system panics and may even fail to spawn the kdump
 kernel.
 
 This fix makes use of an unsigned value for the memset's size argument to
 avoid sign extension. Along with this fix, another shift operation which may
 also yield a sign-extended value is fixed.
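
Concretely, the hazard looks like this (a minimal standalone illustration of the
sign-extension issue on a 64-bit build; this is not code from the patch):

    #include <stdio.h>
    #include <stddef.h>

    int main(void)
    {
        int order = 31;
        /* 1 << 31 overflows int (gcc yields INT_MIN); converting the
         * negative int to size_t then sign-extends to a huge value */
        size_t bad  = (size_t)(1 << order);
        /* 1ul << 31 stays an unsigned long, so no sign extension */
        size_t good = (size_t)(1ul << order);

        printf("1   << 31 as size_t: %zx\n", bad);   /* ffffffff80000000 */
        printf("1ul << 31 as size_t: %zx\n", good);  /* 80000000 */
        return 0;
    }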
 
 Cc: Alexey Kardashevskiy a...@ozlabs.ru
 Cc: Paul Mackerras pau...@samba.org
 Cc: Alexander Graf ag...@suse.de
 Cc: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 Cc: Joonsoo Kim iamjoonsoo@lge.com
 Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
 ---
  arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
 b/arch/powerpc/kvm/book3s_64_mmu_hv.c
 index 72c20bb16d26..79294c4c5015 100644
 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
 +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
 @@ -62,10 +62,10 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
   }
  
  kvm->arch.hpt_cma_alloc = 0;
 - page = kvm_alloc_hpt(1 << (order - PAGE_SHIFT));
 + page = kvm_alloc_hpt(1ul << (order - PAGE_SHIFT));
  if (page) {
  hpt = (unsigned long)pfn_to_kaddr(page_to_pfn(page));
 - memset((void *)hpt, 0, (1 << order));
 + memset((void *)hpt, 0, (1ul << order));
  kvm->arch.hpt_cma_alloc = 1;
   }
  
 

Thanks, applied to kvm/master.

Paolo


Re: [PATCH v2 1/2] KVM: PPC: e500mc: Add support for single threaded vcpus on e6500 core

2014-09-03 Thread Alexander Graf


On 01.09.14 11:01, Mihai Caraman wrote:
 ePAPR represents hardware threads as cpu node properties in the device tree.
 So with existing QEMU, hardware threads are simply exposed as vcpus with
 one hardware thread.
 
 The e6500 core shares TLBs between hardware threads. Without a tlb write
 conditional instruction, the Linux kernel uses per-core mechanisms to
 protect against duplicate TLB entries.
 
 The guest is unable to detect real sibling threads, so it can't use the
 TLB protection mechanism. An alternative solution is to use the hypervisor
 to allocate different lpids to the guest's vcpus that run simultaneously on
 real sibling threads. On systems with two threads per core this patch halves
 the size of the lpid pool that the allocator sees and uses two lpids per VM.
 Even numbers are used to speed up vcpu lpid computation with consecutive lpids
 per VM: vm1 will use lpids 2 and 3, vm2 lpids 4 and 5, and so on.
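
As a rough sketch of the even/odd numbering described above (illustrative only;
the helper names here are hypothetical and do not come from the patch):

    #define THREADS_PER_CORE 2

    /* Each VM owns an even base lpid: vm1 -> 2, vm2 -> 4, ... so the
     * allocator effectively sees half the lpid pool on two-thread cores. */
    static int vm_alloc_base_lpid(void)
    {
        return alloc_lpid_index() * THREADS_PER_CORE;  /* alloc_lpid_index() is assumed */
    }

    /* A vcpu running on hardware thread 1 of the core uses base + 1. */
    static int vcpu_lpid(int vm_base_lpid, int hw_thread_id)
    {
        return vm_base_lpid | (hw_thread_id & (THREADS_PER_CORE - 1));
    }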
 
 Signed-off-by: Mihai Caraman mihai.cara...@freescale.com

Thanks, applied both to kvm-ppc-queue.


Alex


[Bug 82211] Cannot boot Xen under KVM with X2APIC enabled

2014-09-03 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=82211

--- Comment #9 from Paolo Bonzini bonz...@gnu.org ---
Nope, your binary works with kvm/queue for me:

/sys/module/kvm_intel/parameters/emulate_invalid_guest_state:Y
/sys/module/kvm_intel/parameters/enable_apicv:N
/sys/module/kvm_intel/parameters/enable_shadow_vmcs:N
/sys/module/kvm_intel/parameters/ept:Y
/sys/module/kvm_intel/parameters/eptad:N
/sys/module/kvm_intel/parameters/fasteoi:Y
/sys/module/kvm_intel/parameters/flexpriority:Y
/sys/module/kvm_intel/parameters/nested:Y
/sys/module/kvm_intel/parameters/ple_gap:128
/sys/module/kvm_intel/parameters/ple_window:4096
/sys/module/kvm_intel/parameters/ple_window_grow:2
/sys/module/kvm_intel/parameters/ple_window_max:1073741823
/sys/module/kvm_intel/parameters/ple_window_shrink:0
/sys/module/kvm_intel/parameters/unrestricted_guest:Y
/sys/module/kvm_intel/parameters/vmm_exclusive:Y
/sys/module/kvm_intel/parameters/vpid:Y

I unzipped it, and invoked QEMU with

qemu-kvm -kernel ./xen -initrd /boot/vmlinuz-2.6.18-348.el5xen -cpu kvm64

(Any initrd will do).



Re: [PATCH v4 00/14] ivshmem: update documentation, add client/server tools

2014-09-03 Thread David Marchand

Hello Eric,

On 09/02/2014 10:31 PM, Eric Blake wrote:

On 09/02/2014 09:25 AM, David Marchand wrote:

Here is a patchset containing an update to the ivshmem specs documentation and
importing the ivshmem server and client tools.
These tools have been written from scratch and are not related to what is
available in the nahanni repository.
I put them in the contrib/ directory as qemu-doc.texi already said the
server was supposed to be there.

Changes since v3:
- first patch is untouched
- just restored Claudio's Reviewed-by in the second patch
- following patches 3-8 take into account Stefan's comments
- patches 9-12 take into account Gonglei's comments
- patch 13 adjusts ivshmem-server default values
- last patch introduces a change in the ivshmem client-server protocol to
   check a protocol version at connect time


Rather than introducing new files with bugs, followed by patches to
clean them up, why not just introduce the new files correctly in the first
place?  I think you are better off squashing a lot of the cleanup
patches into patch 1.


Actually, I mentioned this in a previous email but did not get any comment.
So, I preferred to send the split patches to ease review (from my
point of view).


Once the code looks fine enough, I intend to keep only three patches:
- one for the initial import of ivshmem-client / server
- one for the documentation update
- one last with the protocol change

Is it okay this way?


--
David Marchand


Re: kvm-unit-test failures

2014-09-03 Thread Paolo Bonzini
Il 02/09/2014 21:57, Chris J Arges ha scritto:
  Can you please trace the test using trace-cmd
  (http://www.linux-kvm.org/page/Tracing) and send the output?
  
  Paolo
  
 Paolo,
 
 I have posted the trace data here:
 http://people.canonical.com/~arges/kvm/trace.dat.xz

Can you try running the test again (no need to get a new trace) with
clocksource=hpet on the kernel command line?

Paolo


Re: kvm-unit-test failures

2014-09-03 Thread Paolo Bonzini
Il 02/09/2014 21:57, Chris J Arges ha scritto:
 Seconds get from host: 1409687073
 Seconds get from kvmclock: 1409333034
 Offset:-354039
 offset too large!
 Check the stability of raw cycle ...
 Worst warp -354462672821748
 Total vcpus: 2
 Test  loops: 1000
 Total warps:  1
 Total stalls: 0
 Worst warp:   -354462672821748
 Raw cycle is not stable
 Monotonic cycle test:
 Worst warp -354455286691490

Looks like one CPU is not being initialized correctly:

- The next correction in the trace is 18445647546048704244,
  and (next-2^64) / -354039 is about 3.1*10^9.  This is a pretty
  plausible value of the TSC frequency.  As a comparison, on my machine
  I have next=18446366988261784997 and an uptime of 29:12 hours, and
  the two match nicely with the CPU clock:

  -(18446366988261784997-2^64) / (29.2 * 3600 * 10^9) = 3.587

  $ grep -m1 model.name /proc/cpuinfo
  model name: Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz

- The offset in seconds * 10^9 is pretty close to the warp in nanoseconds.
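
As a quick standalone check of the arithmetic above (the constants are the ones
quoted in this thread; this is just a sanity-check snippet, not part of the test):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t next = 18445647546048704244ULL;  /* "next" correction from the trace */
        int64_t offset_sec = -354039;             /* wallclock offset reported by the test */

        /* (next - 2^64) is simply the u64 reinterpreted as a signed value */
        double hz = (double)(int64_t)next / (double)offset_sec;
        printf("implied TSC frequency: %.2f GHz\n", hz / 1e9);  /* ~3.10 GHz */
        return 0;
    }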

Can you: 1) try this patch 2) gather a new trace 3) include uptime and
cpuinfo in your report?  All this without clocksource=hpet of course.

Thanks,

Paolo

diff --git a/x86/kvmclock_test.c b/x86/kvmclock_test.c
index 52a43fb..f68881c 100644
--- a/x86/kvmclock_test.c
+++ b/x86/kvmclock_test.c
@@ -7,6 +7,9 @@
 #define DEFAULT_TEST_LOOPS 1L
 #define DEFAULT_THRESHOLD  5L
 
+long threshold = DEFAULT_THRESHOLD;
+int nerr = 0;
+
 struct test_info {
 struct spinlock lock;
 long loops;   /* test loops */
@@ -20,8 +23,9 @@ struct test_info {
 
 struct test_info ti[4];
 
-static int wallclock_test(long sec, long threshold)
+static void wallclock_test(void *p_sec)
 {
+   long sec = *(long *)p_sec;
 long ksec, offset;
 struct timespec ts;
 
@@ -36,10 +40,8 @@ static int wallclock_test(long sec, long threshold)
 
 if (offset > threshold || offset < -threshold) {
 printf("offset too large!\n");
-return 1;
+nerr++;
 }
-
-return 0;
 }
 
 static void kvm_clock_test(void *data)
@@ -116,10 +118,9 @@ static int cycle_test(int ncpus, long loops, int check, 
struct test_info *ti)
 int main(int ac, char **av)
 {
 int ncpus;
-int nerr = 0, i;
+int i;
 long loops = DEFAULT_TEST_LOOPS;
 long sec = 0;
-long threshold = DEFAULT_THRESHOLD;
 
 if (ac > 1)
 loops = atol(av[1]);
@@ -137,7 +138,8 @@ int main(int ac, char **av)
 on_cpu(i, kvm_clock_init, (void *)0);
 
 if (ac > 2)
 -nerr += wallclock_test(sec, threshold);
 +   for (i = 0; i < ncpus; ++i)
 +   on_cpu(i, wallclock_test, &sec);
 
 printf("Check the stability of raw cycle ...\n");
 pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT



Re: [PATCH v4 4/6] kvm, mem-hotplug: Reload L1' apic access page on migration in vcpu_enter_guest().

2014-09-03 Thread Gleb Natapov
On Wed, Sep 03, 2014 at 09:42:30AM +0800, tangchen wrote:
 Hi Gleb,
 
 On 09/03/2014 12:00 AM, Gleb Natapov wrote:
 ..
 +static void vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
 +{
 +/*
 + * apic access page could be migrated. When the page is being migrated,
 + * GUP will wait till the migrate entry is replaced with the new pte
 + * entry pointing to the new page.
 + */
 +vcpu->kvm->arch.apic_access_page = gfn_to_page(vcpu->kvm,
 +APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
 +kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm,
 +page_to_phys(vcpu->kvm->arch.apic_access_page));
 I am a little bit worried that here all vcpus write to 
 vcpu->kvm->arch.apic_access_page
 without any locking. It is probably benign since pointer write is atomic on 
 x86. Paolo?
 
 Do we even need apic_access_page? Why not call
   gfn_to_page(APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT)
   put_page()
 on the rare occasions we need to know its address?
 
 Isn't it a necessary item defined in the hardware spec?
 
vcpu->kvm->arch.apic_access_page? No. This is an internal kvm data structure.

 I didn't read the Intel spec deeply, but according to the code, the page's
 address is written into the vmcs. And that made me think that we cannot
 remove it.
 
We cannot remove writing the apic page address into the vmcs, but that is not done by
assigning to vcpu->kvm->arch.apic_access_page; it is done by the vmwrite in
set_apic_access_page_addr().
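
The on-demand lookup suggested above would look roughly like this (a sketch only,
not code from any posted patch; it assumes the usual vcpu context around the call):

    /* Fetch the apic access page when its address is actually needed,
     * write it into the VMCS through the existing hook, and drop the
     * reference instead of caching the struct page in kvm->arch. */
    struct page *page = gfn_to_page(vcpu->kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
    if (!is_error_page(page)) {
        kvm_x86_ops->set_apic_access_page_addr(vcpu->kvm, page_to_phys(page));
        put_page(page);
    }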

--
Gleb.


Re: [PATCH v4 5/6] kvm, mem-hotplug: Reload L1's apic access page on migration when L2 is running.

2014-09-03 Thread Gleb Natapov
On Wed, Aug 27, 2014 at 06:17:40PM +0800, Tang Chen wrote:
 This patch only handles the situation where the L1 and L2 vm share one apic
 access page.
 
 When the L1 vm is running, if the shared apic access page is migrated,
 mmu_notifier will request all vcpus to exit to L0, and reload the apic access
 page physical address for all the vcpus' vmcs (which is done by patch 5/6).
 And when it enters the L2 vm, L2's vmcs will be updated in prepare_vmcs02()
 called by nested_vm_run(). So we need to do nothing.
 
 When the L2 vm is running, if the shared apic access page is migrated,
 mmu_notifier will request all vcpus to exit to L0, and reload the apic access
 page physical address for all L2 vmcs. And this patch requests an apic access
 page reload in the L2->L1 vmexit.
 
 Signed-off-by: Tang Chen tangc...@cn.fujitsu.com
 ---
  arch/x86/include/asm/kvm_host.h |  1 +
  arch/x86/kvm/svm.c  |  6 ++
  arch/x86/kvm/vmx.c  | 32 
  arch/x86/kvm/x86.c  |  3 +++
  virt/kvm/kvm_main.c |  1 +
  5 files changed, 43 insertions(+)
 
 diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
 index 514183e..13fbb62 100644
 --- a/arch/x86/include/asm/kvm_host.h
 +++ b/arch/x86/include/asm/kvm_host.h
 @@ -740,6 +740,7 @@ struct kvm_x86_ops {
   void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
   void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
   void (*set_apic_access_page_addr)(struct kvm *kvm, hpa_t hpa);
 + void (*set_nested_apic_page_migrated)(struct kvm_vcpu *vcpu, bool set);
   void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
   void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
   int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
 index f2eacc4..da88646 100644
 --- a/arch/x86/kvm/svm.c
 +++ b/arch/x86/kvm/svm.c
 @@ -3624,6 +3624,11 @@ static void svm_set_apic_access_page_addr(struct kvm 
 *kvm, hpa_t hpa)
   return;
  }
  
 +static void svm_set_nested_apic_page_migrated(struct kvm_vcpu *vcpu, bool 
 set)
 +{
 + return;
 +}
 +
  static int svm_vm_has_apicv(struct kvm *kvm)
  {
   return 0;
 @@ -4379,6 +4384,7 @@ static struct kvm_x86_ops svm_x86_ops = {
   .update_cr8_intercept = update_cr8_intercept,
   .set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
   .set_apic_access_page_addr = svm_set_apic_access_page_addr,
 + .set_nested_apic_page_migrated = svm_set_nested_apic_page_migrated,
   .vm_has_apicv = svm_vm_has_apicv,
   .load_eoi_exitmap = svm_load_eoi_exitmap,
   .hwapic_isr_update = svm_hwapic_isr_update,
 diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
 index da6d55d..9035fd1 100644
 --- a/arch/x86/kvm/vmx.c
 +++ b/arch/x86/kvm/vmx.c
 @@ -379,6 +379,16 @@ struct nested_vmx {
* we must keep them pinned while L2 runs.
*/
   struct page *apic_access_page;
 + /*
 +  * L1's apic access page can be migrated. When L1 and L2 are sharing
 +  * the apic access page, after the page is migrated when L2 is running,
 +  * we have to reload it to L1 vmcs before we enter L1.
 +  *
 +  * When the shared apic access page is migrated in L1 mode, we don't
 +  * need to do anything else because we reload apic access page each
 +  * time when entering L2 in prepare_vmcs02().
 +  */
 + bool apic_access_page_migrated;
   u64 msr_ia32_feature_control;
  
   struct hrtimer preemption_timer;
 @@ -7098,6 +7108,12 @@ static void vmx_set_apic_access_page_addr(struct kvm 
 *kvm, hpa_t hpa)
   vmcs_write64(APIC_ACCESS_ADDR, hpa);
  }
  
 +static void vmx_set_nested_apic_page_migrated(struct kvm_vcpu *vcpu, bool 
 set)
 +{
 + struct vcpu_vmx *vmx = to_vmx(vcpu);
 + vmx->nested.apic_access_page_migrated = set;
 +}
 +
  static void vmx_hwapic_isr_update(struct kvm *kvm, int isr)
  {
   u16 status;
 @@ -8796,6 +8812,21 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, 
 u32 exit_reason,
   }
  
   /*
 +  * When shared (L1 & L2) apic access page is migrated during L2 is
 +  * running, mmu_notifier will force to reload the page's hpa for L2
 +  * vmcs. Need to reload it for L1 before entering L1.
 +  */
 + if (vmx->nested.apic_access_page_migrated) {
 + /*
 +  * Do not call kvm_reload_apic_access_page() because we are now
 +  * in L2. We should not call make_all_cpus_request() to exit to
 +  * L0, otherwise we will reload for L2 vmcs again.
 +  */
 + kvm_reload_apic_access_page(vcpu->kvm);
 + vmx->nested.apic_access_page_migrated = false;
 + }
I would just call kvm_reload_apic_access_page() unconditionally, and only optimize it
further if it proves to be a performance problem. Vmexit emulation is
pretty heavy, so I doubt one more vmwrite will be noticeable.

--

Re: kvm-unit-test failures

2014-09-03 Thread Chris J Arges


On 09/03/2014 09:47 AM, Paolo Bonzini wrote:
 Il 02/09/2014 21:57, Chris J Arges ha scritto:
 Can you please trace the test using trace-cmd
 (http://www.linux-kvm.org/page/Tracing) and send the output?

 Paolo

 Paolo,

 I have posted the trace data here:
 http://people.canonical.com/~arges/kvm/trace.dat.xz
 
 Can you try running the test again (no need to get a new trace) with
 clocksource=hpet on the kernel command line?
 
 Paolo
 
 ./x86-run x86/kvmclock_test.flat -smp 2 --append 1000 `date +%s`
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
1000 1409757645
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44d4c0
kvm-clock: cpu 0, msr 0x:44d4c0
Wallclock test, threshold 5
Seconds get from host: 1409757645
Seconds get from kvmclock: 1409757645
Offset:0
Check the stability of raw cycle ...
Total vcpus: 2
Test  loops: 1000
Total warps:  0
Total stalls: 0
Worst warp:   0
Raw cycle is stable
Monotonic cycle test:
Total vcpus: 2
Test  loops: 1000
Total warps:  0
Total stalls: 0
Worst warp:   0
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 1000
TSC cycles:  1106490299
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 1000
TSC cycles:  3463433372
Return value from qemu: 1

Ok this passes, I'll now try the patch without the cmdline option.
--chris


Re: [PATCH v4 00/14] ivshmem: update documentation, add client/server tools

2014-09-03 Thread Eric Blake
On 09/03/2014 07:01 AM, David Marchand wrote:

 Rather than introducing new files with bugs, followed by patches to
 clean them up, why not just introduce the new files correctly in the first
 place?  I think you are better off squashing a lot of the cleanup
 patches into patch 1.
 
 Actually, I mentioned this in a previous email but did not get any comment.
 So, I preferred to send the split patches to ease review (from my
 point of view).

It does not save reviewer time to have a known buggy patch with later
cleanups.  I'd rather see your best effort at a bug-free patch to begin
with, than to spend my time pointing out bugs only to find out you
already fixed them later in the series.

 
 Once code looks fine enough, I intend to keep only three patches :
 - one for the initial import of ivshmem-client / server
 - one for the documentation update
 - one last with the protocol change

If that is your plan for the final series, then that is the same plan
you should be using for reviews.  You want the reviewers to see your
proposed final product, not your intermediate state of how you got there.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org





Re: kvm-unit-test failures

2014-09-03 Thread Chris J Arges


On 09/03/2014 09:59 AM, Paolo Bonzini wrote:
 Il 02/09/2014 21:57, Chris J Arges ha scritto:
 Seconds get from host: 1409687073
 Seconds get from kvmclock: 1409333034
 Offset:-354039
 offset too large!
 Check the stability of raw cycle ...
 Worst warp -354462672821748
 Total vcpus: 2
 Test  loops: 1000
 Total warps:  1
 Total stalls: 0
 Worst warp:   -354462672821748
 Raw cycle is not stable
 Monotonic cycle test:
 Worst warp -354455286691490
 
 Looks like one CPU is not being initialized correctly:
 
 - The next correction in the trace is 18445647546048704244,
   and (next-2^64) / -354039 is about 3.1*10^9.  This is a pretty
   plausible value of the TSC frequency.  As a comparison, on my machine
   I have next=18446366988261784997 and an uptime of 29:12 hours, and
   the two match nicely with the CPU clock:
 
   -(18446366988261784997-2^64) / (29.2 * 3600 * 10^9) = 3.587
 
   $ grep -m1 model.name /proc/cpuinfo
   model name  : Intel(R) Xeon(R) CPU E5-1620 0 @ 3.60GHz
 
 - The offset in seconds * 10^9 is pretty close to the warp in nanoseconds.
 
 Can you: 1) try this patch 2) gather a new trace 3) include uptime and
 cpuinfo in your report?  All this without clocksource=hpet of course.
 
 Thanks,
 
 Paolo
 
 diff --git a/x86/kvmclock_test.c b/x86/kvmclock_test.c
 index 52a43fb..f68881c 100644
 --- a/x86/kvmclock_test.c
 +++ b/x86/kvmclock_test.c
 @@ -7,6 +7,9 @@
  #define DEFAULT_TEST_LOOPS 1L
  #define DEFAULT_THRESHOLD  5L
  
 +long threshold = DEFAULT_THRESHOLD;
 +int nerr = 0;
 +
  struct test_info {
  struct spinlock lock;
  long loops;   /* test loops */
 @@ -20,8 +23,9 @@ struct test_info {
  
  struct test_info ti[4];
  
 -static int wallclock_test(long sec, long threshold)
 +static void wallclock_test(void *p_sec)
  {
 + long sec = *(long *)p_sec;
  long ksec, offset;
  struct timespec ts;
  
 @@ -36,10 +40,8 @@ static int wallclock_test(long sec, long threshold)
  
  if (offset > threshold || offset < -threshold) {
  printf("offset too large!\n");
 -return 1;
 +nerr++;
  }
 -
 -return 0;
  }
  
  static void kvm_clock_test(void *data)
 @@ -116,10 +118,9 @@ static int cycle_test(int ncpus, long loops, int check, 
 struct test_info *ti)
  int main(int ac, char **av)
  {
  int ncpus;
 -int nerr = 0, i;
 +int i;
  long loops = DEFAULT_TEST_LOOPS;
  long sec = 0;
 -long threshold = DEFAULT_THRESHOLD;
  
  if (ac > 1)
  loops = atol(av[1]);
 @@ -137,7 +138,8 @@ int main(int ac, char **av)
  on_cpu(i, kvm_clock_init, (void *)0);
  
  if (ac > 2)
  -nerr += wallclock_test(sec, threshold);
  + for (i = 0; i < ncpus; ++i)
  + on_cpu(i, wallclock_test, &sec);
  
  printf("Check the stability of raw cycle ...\n");
  pvclock_set_flags(PVCLOCK_TSC_STABLE_BIT
 

Here are the results of the trace as you requested:
http://people.canonical.com/~arges/kvm/trace-2.dat.xz

$ uptime
 16:18:31 up 53 min,  1 user,  load average: 1.16, 0.39, 0.17

$ grep -m1 model.name /proc/cpuinfo
model name  : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz

Here is the output of the command:
qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
1000 1409761075
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44e520
kvm-clock: cpu 0, msr 0x:44e520
Wallclock test, threshold 5
Seconds get from host: 1409761075
Seconds get from kvmclock: 1409757927
Offset:-3148
offset too large!
Wallclock test, threshold 5
Seconds get from host: 1409761075
Seconds get from kvmclock: 1409757927
Offset:-3148
offset too large!
Check the stability of raw cycle ...
Worst warp -3147762665310
Total vcpus: 2
Test  loops: 1000
Total warps:  1
Total stalls: 0
Worst warp:   -3147762665310
Raw cycle is not stable
Monotonic cycle test:
Worst warp -3142929472775
Total vcpus: 2
Test  loops: 1000
Total warps:  1
Total stalls: 0
Worst warp:   -3142929472775
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 1000
TSC cycles:  1242044050
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 1000
TSC cycles:  1242665486
Return value from qemu: 3

--chris


Re: kvm-unit-test failures

2014-09-03 Thread Paolo Bonzini
Il 03/09/2014 18:23, Chris J Arges ha scritto:
 $ uptime
  16:18:31 up 53 min,  1 user,  load average: 1.16, 0.39, 0.17
 
 $ grep -m1 model.name /proc/cpuinfo
 model name  : Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz
 
 Here is the output of the command:
 qemu-system-x86_64 -enable-kvm -device pc-testdev -device
 isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
 -device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
 1000 1409761075
 enabling apic
 enabling apic
 kvm-clock: cpu 0, msr 0x:44e520
 kvm-clock: cpu 0, msr 0x:44e520
 Wallclock test, threshold 5
 Seconds get from host: 1409761075
 Seconds get from kvmclock: 1409757927
 Offset:-3148
 offset too large!
 Wallclock test, threshold 5
 Seconds get from host: 1409761075
 Seconds get from kvmclock: 1409757927
 Offset:-3148
 offset too large!
 Check the stability of raw cycle ...
 Worst warp -3147762665310

I'm not sure about the reason for the warp, but indeed the offset and
uptime match (I'll check them against the trace tomorrow) so it's just
that the VM's TSC base is not taken into account correctly.

Can you gather another trace with the problematic patch reverted?

Paolo


Re: kvm-unit-test failures

2014-09-03 Thread Chris J Arges
snip
 I'm not sure about the reason for the warp, but indeed the offset and
 uptime match (I'll check them against the trace tomorrow) so it's just
 that the VM's TSC base is not taken into account correctly.
 
 Can you gather another trace with the problematic patch reverted?
 
 Paolo
 

Here is the third trace running with 0d3da0d2 reverted from the latest
kvm queue branch 11cc9ea3:

http://people.canonical.com/~arges/kvm/trace-3.dat.xz

$ uptime
 18:25:13 up 5 min,  1 user,  load average: 0.21, 0.74, 0.44

qemu-system-x86_64 -enable-kvm -device pc-testdev -device
isa-debug-exit,iobase=0xf4,iosize=0x4 -display none -serial stdio
-device pci-testdev -kernel x86/kvmclock_test.flat -smp 2 --append
1000 1409768537
enabling apic
enabling apic
kvm-clock: cpu 0, msr 0x:44e520
kvm-clock: cpu 0, msr 0x:44e520
Wallclock test, threshold 5
Seconds get from host: 1409768537
Seconds get from kvmclock: 1409768538
Offset:1
Wallclock test, threshold 5
Seconds get from host: 1409768537
Seconds get from kvmclock: 1409768538
Offset:1
Check the stability of raw cycle ...
Total vcpus: 2
Test  loops: 1000
Total warps:  0
Total stalls: 0
Worst warp:   0
Raw cycle is stable
Monotonic cycle test:
Total vcpus: 2
Test  loops: 1000
Total warps:  0
Total stalls: 0
Worst warp:   0
Measure the performance of raw cycle ...
Total vcpus: 2
Test  loops: 1000
TSC cycles:  1241970306
Measure the performance of adjusted cycle ...
Total vcpus: 2
Test  loops: 1000
TSC cycles:  3266701026
Return value from qemu: 1


Re: [PATCH] KVM: x86: fix TSC matching

2014-09-03 Thread Marcelo Tosatti
On Tue, Aug 26, 2014 at 12:08:32PM +0300, Pekka Enberg wrote:
 On Sun, Aug 17, 2014 at 11:54 AM, Paolo Bonzini pbonz...@redhat.com wrote:
  Il 15/08/2014 18:54, Marcelo Tosatti ha scritto:
 
  Ping on integration.
 
  It's been in kvm/next for a while, and is now in Linus's tree:
 
 Does this make sense for -stable too?
 
 - Pekka

Yes.



RE: [PATCH v2] KVM: x86: keep eoi exit bitmap accurate before loading it.

2014-09-03 Thread Wang, Wei W
 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
 Behalf Of Wang, Wei W
 Sent: Friday, August 29, 2014 8:59 AM
 To: Paolo Bonzini; Zhang, Yang Z; kvm@vger.kernel.org
 Cc: alex.william...@redhat.com
 Subject: RE: [PATCH v2] KVM: x86: keep eoi exit bitmap accurate before
 loading it.
 
 I think we can think about it for another couple of days and see if any corner
 case is not covered.
 
 Wei

Hi Paolo, we have not found any corner case. Do you still have any concerns? If 
not, I can email out the two patches with your suggested commit messages. 
If anyone finds a bug in the future, we can also get back and solve it.

Wei
 
  -Original Message-
  From: Paolo Bonzini [mailto:pbonz...@redhat.com]
  Sent: Thursday, August 28, 2014 7:01 PM
  To: Wang, Wei W; Zhang, Yang Z; kvm@vger.kernel.org
  Cc: alex.william...@redhat.com
  Subject: Re: [PATCH v2] KVM: x86: keep eoi exit bitmap accurate before
  loading it.
 
  Il 28/08/2014 12:14, Wang, Wei W ha scritto:
   We will do some more tests on it to make sure there are no problems.
 
  No, I don't think there are any easily-detected practical problems
  with the patch.  But I'm not sure I understand all the theoretical
  problems and whether possible races are benign.  These would be really
  hard to debug, unless you get a bug that is 100% reproducible.
 
  Paolo


Re: [Qemu-devel] [question] virtio-blk performance degradation happened with virtio-serial

2014-09-03 Thread Zhang Haoyu
  Hi, all
  
   I started a VM with virtio-serial (default number of ports: 31), and found 
   that virtio-blk performance degradation happened, about 25%; this 
   problem can be reproduced 100% of the time.
   without virtio-serial:
   4k-read-random 1186 IOPS
   with virtio-serial:
   4k-read-random 871 IOPS
   
   But if I use the max_ports=2 option to limit the max number of virtio-serial 
   ports, then the IO performance degradation is not so serious, about 5%.
   
   And ide performance degradation does not happen with virtio-serial.
 
  Pretty sure it's related to MSI vectors in use.  It's possible that
  the virtio-serial device takes up all the available vectors in the guest,
  leaving old-style irqs for the virtio-blk device.
 
  I don't think so.
  I use iometer to test the 64k-read (or write) sequence case. If I disable 
  virtio-serial dynamically via device manager -> virtio-serial => disable,
  then the performance improves by about 25% immediately; then I re-enable 
  virtio-serial via device manager -> virtio-serial => enable,
  and the degradation comes back again, very obviously.
  To add: although virtio-serial is enabled, I don't use it at all; the 
  degradation still happens.

Using the vectors= option as mentioned below, you can restrict the
number of MSI vectors the virtio-serial device gets.  You can then
confirm whether it's MSI that's related to these issues.

I use -device virtio-serial,vectors=4 instead of -device virtio-serial, but 
the degradation still happened, nothing changed.
with virtio-serial enabled:
64k-write-sequence: 4200 IOPS
with virtio-serial disabled:
64k-write-sequence: 5300 IOPS

How can I confirm whether it's MSI in Windows?

Thanks,
Zhang Haoyu

 So I think it has nothing to do with legacy interrupt mode, right?
 
 I am going to compare the perf top data on qemu and the perf kvm stat data 
 with virtio-serial disabled/enabled in the guest,
 and the difference in perf top data in the guest with virtio-serial 
 disabled/enabled.
 Any ideas?
 
 Thanks,
 Zhang Haoyu
 If you restrict the number of vectors the virtio-serial device gets
 (using the -device virtio-serial-pci,vectors= param), does that make
 things better for you?



Re: [Qemu-devel] [question] e1000 interrupt storm happened because of its corresponding ioapic->irr bit always set

2014-09-03 Thread Zhang Haoyu
 Hi Jason,
 I tested the patch below; it's okay, the e1000 interrupt storm disappeared.
 But I am going to make a small change to it, could you help review it?
 
 Currently, we call ioapic_service() immediately when we find the irq is 
 still active during eoi broadcast. But on real hardware, there's some delay 
 between the EOI write and the irq delivery (system bus latency?). So we need 
 to emulate this behavior. Otherwise, for a guest which hasn't registered a 
 proper irq handler, it would stay in the interrupt routine as this irq would 
 be re-injected immediately after the guest enables interrupts. This would 
 prevent the guest from moving forward and it may miss the chance to get a 
 proper irq handler registered (one example is a Windows guest resuming from 
 hibernation).
 
 As there's no way to distinguish the unhandled irq from newly raised ones, 
 this patch solves this problem by scheduling a delayed work item when the 
 count of irqs injected during eoi broadcast exceeds a threshold value. After 
 this patch, the guest can move a little further forward when there's no 
 suitable irq handler, in case it registers one very soon; and for a guest 
 with a bad-irq detection routine (such as note_interrupt() in Linux), the 
 bad irq will be recognized soon, as in the past.
 
 Signed-off-by: Jason Wang jasowang at redhat.com
 ---
  virt/kvm/ioapic.c |   47 +--
  virt/kvm/ioapic.h |2 ++
  2 files changed, 47 insertions(+), 2 deletions(-)
 
 diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
 index dcaf272..892253e 100644
 --- a/virt/kvm/ioapic.c
 +++ b/virt/kvm/ioapic.c
  @@ -221,6 +221,24 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
 return ret;
  }
 
 +static void kvm_ioapic_eoi_inject_work(struct work_struct *work)
 +{
 +   int i, ret;
 +   struct kvm_ioapic *ioapic = container_of(work, struct kvm_ioapic,
 +eoi_inject.work);
 +   spin_lock(&ioapic->lock);
 +   for (i = 0; i < IOAPIC_NUM_PINS; i++) {
 +   union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i];
 +
 +   if (ent->fields.trig_mode != IOAPIC_LEVEL_TRIG)
 +   continue;
 +
 +   if (ioapic->irr & (1 << i) && !ent->fields.remote_irr)
 +   ret = ioapic_service(ioapic, i);
 +   }
 +   spin_unlock(&ioapic->lock);
 +}
 +
  static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
  int trigger_mode)
  {
  @@ -249,8 +267,29 @@ static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
 
 ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
 ent->fields.remote_irr = 0;
 -   if (!ent->fields.mask && (ioapic->irr & (1 << i)))
 -   ioapic_service(ioapic, i);
 +   if (!ent->fields.mask && (ioapic->irr & (1 << i))) {
 +   ++ioapic->irq_eoi;
 -+   ++ioapic->irq_eoi;
 ++   ++ioapic->irq_eoi[i];
 +   if (ioapic->irq_eoi == 100) {
 -+   if (ioapic->irq_eoi == 100) {
 ++   if (ioapic->irq_eoi[i] == 100) {
 +   /*
 +* Real hardware does not deliver the irq so
 +* immediately during eoi broadcast, so we need
 +* to emulate this behavior. Otherwise, for
 +* guests who has not registered handler of a
 +* level irq, this irq would be injected
 +* immediately after guest enables interrupt
 +* (which happens usually at the end of the
 +* common interrupt routine). This would lead
 +* guest can't move forward and may miss the
 +* possibility to get proper irq handler
 +* registered. So we need to give some breath to
 +* guest. TODO: 1 is too long?
 +*/
 +   schedule_delayed_work(&ioapic->eoi_inject, 1);
 +   ioapic->irq_eoi = 0;
 -+   ioapic->irq_eoi = 0;
 ++   ioapic->irq_eoi[i] = 0;
 +   } else {
 +   ioapic_service(ioapic, i);
 +   }
 +   }
 ++   else {
 ++   ioapic->irq_eoi[i] = 0;
 ++   }
 }
  }
  }
 I think ioapic->irq_eoi is prone to reach 100, because during an eoi 
 broadcast it's possible that another interrupt's irr (not the current eoi's 
 corresponding interrupt) is set, so ioapic->irq_eoi will keep growing, and 
 before long ioapic->irq_eoi will reach 100.
 I want to add u32 irq_eoi[IOAPIC_NUM_PINS]; instead of u32 irq_eoi;.
 Any ideas?
 
 Zhang Haoyu

I'm a bit concerned about how this will affect realtime guests.
Worth adding a flag 

Re: [Qemu-devel] [question] e1000 interrupt storm happened because of its corresponding ioapic->irr bit always set

2014-09-03 Thread Jason Wang
On 09/04/2014 09:56 AM, Zhang Haoyu wrote:
 Hi Jason,
  I tested the patch below; it's okay, the e1000 interrupt storm disappeared.
  But I am going to make a small change to it, could you help review it?
  
   Currently, we call ioapic_service() immediately when we find the irq
   is still active during eoi broadcast. But on real hardware, there's some
   delay between the EOI write and the irq delivery (system bus latency?).
   So we need to emulate this behavior. Otherwise, for a guest which hasn't
   registered a proper irq handler, it would stay in the interrupt routine
   as this irq would be re-injected immediately after the guest enables
   interrupts. This would prevent the guest from moving forward and it may
   miss the chance to get a proper irq handler registered (one example is a
   Windows guest resuming from hibernation).
   
   As there's no way to distinguish the unhandled irq from newly raised
   ones, this patch solves this problem by scheduling a delayed work item
   when the count of irqs injected during eoi broadcast exceeds a threshold
   value. After this patch, the guest can move a little further forward when
   there's no suitable irq handler, in case it registers one very soon; and
   for a guest with a bad-irq detection routine (such as note_interrupt() in
   Linux), the bad irq will be recognized soon, as in the past.
  
  Signed-off-by: Jason Wang jasowang at redhat.com
  ---
   virt/kvm/ioapic.c |   47 
   +--
   virt/kvm/ioapic.h |2 ++
   2 files changed, 47 insertions(+), 2 deletions(-)
  
  diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
  index dcaf272..892253e 100644
  --- a/virt/kvm/ioapic.c
  +++ b/virt/kvm/ioapic.c
    @@ -221,6 +221,24 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level)
 return ret;
   }
  
  +static void kvm_ioapic_eoi_inject_work(struct work_struct *work)
  +{
  +  int i, ret;
  +  struct kvm_ioapic *ioapic = container_of(work, struct 
  kvm_ioapic,
  +   eoi_inject.work);
   +  spin_lock(&ioapic->lock);
   +  for (i = 0; i < IOAPIC_NUM_PINS; i++) {
   +  union kvm_ioapic_redirect_entry *ent = &ioapic->redirtbl[i];
   +
   +  if (ent->fields.trig_mode != IOAPIC_LEVEL_TRIG)
   +  continue;
   +
   +  if (ioapic->irr & (1 << i) && !ent->fields.remote_irr)
   +  ret = ioapic_service(ioapic, i);
   +  }
   +  spin_unlock(&ioapic->lock);
  +}
  +
   static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int 
   vector,
  int trigger_mode)
   {
    @@ -249,8 +267,29 @@ static void __kvm_ioapic_update_eoi(struct kvm_ioapic *ioapic, int vector,
   
   ASSERT(ent->fields.trig_mode == IOAPIC_LEVEL_TRIG);
   ent->fields.remote_irr = 0;
   -  if (!ent->fields.mask && (ioapic->irr & (1 << i)))
   -  ioapic_service(ioapic, i);
   +  if (!ent->fields.mask && (ioapic->irr & (1 << i))) {
   +  ++ioapic->irq_eoi;
   -+   ++ioapic->irq_eoi;
   ++   ++ioapic->irq_eoi[i];
   +  if (ioapic->irq_eoi == 100) {
   -+   if (ioapic->irq_eoi == 100) {
   ++   if (ioapic->irq_eoi[i] == 100) {
  +  /*
  +   * Real hardware does not deliver the 
  irq so
  +   * immediately during eoi broadcast, so 
  we need
  +   * to emulate this behavior. Otherwise, 
  for
  +   * guests who has not registered 
  handler of a
  +   * level irq, this irq would be injected
  +   * immediately after guest enables 
  interrupt
  +   * (which happens usually at the end of 
  the
  +   * common interrupt routine). This 
  would lead
  +   * guest can't move forward and may 
  miss the
  +   * possibility to get proper irq handler
  +   * registered. So we need to give some 
  breath to
  +   * guest. TODO: 1 is too long?
  +   */
   +  schedule_delayed_work(&ioapic->eoi_inject, 1);
   +  ioapic->irq_eoi = 0;
   -+   ioapic->irq_eoi = 0;
   ++   ioapic->irq_eoi[i] = 0;
  +  } else {
  +  ioapic_service(ioapic, i);
  +  }
  +  }
  ++   else {
   ++   ioapic->irq_eoi[i] = 0;
  ++   }
 }
   }
  I think ioapic->irq_eoi is prone to reach 100, because during an eoi 
  broadcast, 
  

Re: [PATCH] powerpc/kvm/cma: Fix panic introduced by signed shift operation

2014-09-03 Thread Paolo Bonzini
Il 02/09/2014 18:13, Laurent Dufour ha scritto:
 fc95ca7284bc54953165cba76c3228bd2cdb9591 introduces a memset in
 kvmppc_alloc_hpt since the general CMA doesn't clear the memory it
 allocates.
 
 However, the size argument passed to memset is computed from a signed value
 and its sign bit is extended by the cast the compiler is doing. This leads
 to an extremely large size value when dealing with order values >= 31, and
 almost all the memory following the allocated space is cleared. As a
 consequence, the system panics and may even fail to spawn the kdump
 kernel.
 
 This fix makes use of an unsigned value for the memset's size argument to
 avoid sign extension. Along with this fix, another shift operation which may
 also yield a sign-extended value is fixed.
 
 Cc: Alexey Kardashevskiy a...@ozlabs.ru
 Cc: Paul Mackerras pau...@samba.org
 Cc: Alexander Graf ag...@suse.de
 Cc: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 Cc: Joonsoo Kim iamjoonsoo@lge.com
 Cc: Benjamin Herrenschmidt b...@kernel.crashing.org
 Signed-off-by: Laurent Dufour lduf...@linux.vnet.ibm.com
 ---
  arch/powerpc/kvm/book3s_64_mmu_hv.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c 
 b/arch/powerpc/kvm/book3s_64_mmu_hv.c
 index 72c20bb16d26..79294c4c5015 100644
 --- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
 +++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
 @@ -62,10 +62,10 @@ long kvmppc_alloc_hpt(struct kvm *kvm, u32 *htab_orderp)
   }
  
  kvm->arch.hpt_cma_alloc = 0;
 - page = kvm_alloc_hpt(1 << (order - PAGE_SHIFT));
 + page = kvm_alloc_hpt(1ul << (order - PAGE_SHIFT));
  if (page) {
  hpt = (unsigned long)pfn_to_kaddr(page_to_pfn(page));
 - memset((void *)hpt, 0, (1 << order));
 + memset((void *)hpt, 0, (1ul << order));
  kvm->arch.hpt_cma_alloc = 1;
   }
  
 

Thanks, applied to kvm/master.

Paolo


Re: [PATCH v2 1/2] KVM: PPC: e500mc: Add support for single threaded vcpus on e6500 core

2014-09-03 Thread Alexander Graf


On 01.09.14 11:01, Mihai Caraman wrote:
 ePAPR represents hardware threads as cpu node properties in the device tree.
 So with existing QEMU, hardware threads are simply exposed as vcpus with
 one hardware thread.
 
 The e6500 core shares TLBs between hardware threads. Without a tlb write
 conditional instruction, the Linux kernel uses per-core mechanisms to
 protect against duplicate TLB entries.
 
 The guest is unable to detect real sibling threads, so it can't use the
 TLB protection mechanism. An alternative solution is to use the hypervisor
 to allocate different lpids to the guest's vcpus that run simultaneously on
 real sibling threads. On systems with two threads per core this patch halves
 the size of the lpid pool that the allocator sees and uses two lpids per VM.
 Even numbers are used to speed up vcpu lpid computation with consecutive lpids
 per VM: vm1 will use lpids 2 and 3, vm2 lpids 4 and 5, and so on.
 
 Signed-off-by: Mihai Caraman mihai.cara...@freescale.com

Thanks, applied both to kvm-ppc-queue.


Alex