Lock warning traceback

2009-09-29 Thread Priit Laes
Hi!

I've been getting a locking-related warning whenever I fire up a KVM
machine. It has been popping up since the 2.6.30 kernel:

[snip]
[119206.978058] BUG: MAX_LOCK_DEPTH too low!
[119206.978062] turning off the locking correctness validator.
[119206.978067] Pid: 9510, comm: kvm Not tainted 2.6.31 #44
[119206.978071] Call Trace:
[119206.978080]  [8108e4a9] __lock_acquire+0x94c/0x974
[119206.978086]  [8108e59d] lock_acquire+0xcc/0xe9
[119206.978092]  [810e106f] ? mm_take_all_locks+0xd9/0x110
[119206.978099]  [814af652] ? mutex_lock_nested+0x23b/0x24a
[119206.978104]  [810e0fd0] ? mm_take_all_locks+0x3a/0x110
[119206.978109]  [814b05b9] _spin_lock_nest_lock+0x2c/0x3b
[119206.978114]  [810e106f] ? mm_take_all_locks+0xd9/0x110
[119206.978119]  [810e106f] mm_take_all_locks+0xd9/0x110
[119206.978125]  [810efe83] do_mmu_notifier_register+0xa8/0x16a
[119206.978130]  [810eff60] mmu_notifier_register+0xe/0x10
[119206.978136]  [8100daba] kvm_dev_ioctl+0x136/0x2ed
[119206.978142]  [811061f9] vfs_ioctl+0x1d/0x82
[119206.978147]  [81106730] do_vfs_ioctl+0x45b/0x4a1
[119206.978153]  [811fdb78] ? __up_read+0x9a/0xa2
[119206.978158]  [8108287b] ? up_read+0x26/0x2a
[119206.978163]  [811067b8] sys_ioctl+0x42/0x65
[119206.978169]  [810329eb] system_call_fastpath+0x16/0x1b
[119216.471426] kvm: emulating exchange as write
[/snip]

This does not seem to be critical, though, as my guest operating system
seems to be working fine...

Päikest,
Priit :)
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2: kvm 2/4] Kill the confusing tsc_ref_khz and ref_freq variables.

2009-09-29 Thread Avi Kivity

On 09/29/2009 06:04 AM, Zachary Amsden wrote:

> They are globals, not clearly protected by any ordering or locking, and
> vulnerable to various startup races.
>
> Instead, for variable TSC machines, register the cpufreq notifier and get
> the TSC frequency directly from the cpufreq machinery.  Not only is it
> always right, it is also perfectly accurate, as no error prone measurement
> is required.
>
> On such machines, when a new CPU is brought online, it isn't clear what
> frequency it will start with, and it may not correspond to the reference, thus
> in hardware_enable we clear the cpu_tsc_khz variable to zero and make sure
> it is set before running on a VCPU.
>
>                          CPUFREQ_TRANSITION_NOTIFIER);
> +        for_each_online_cpu(cpu)
> +            per_cpu(cpu_tsc_khz, cpu) = cpufreq_get(cpu);
> +    } else {
> +        for_each_possible_cpu(cpu)
> +            per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
> +    }
> +    for_each_possible_cpu(cpu) {
> +        printk(KERN_DEBUG "kvm: cpu %d = %ld khz\n",
> +            cpu, per_cpu(cpu_tsc_khz, cpu));
>      }
>  }


Leftover debug code?

--
error compiling committee.c: too many arguments to function



Re: [PATCH v2: kvm 4/4] Fix hotplug of CPUs for KVM.

2009-09-29 Thread Avi Kivity

On 09/29/2009 06:04 AM, Zachary Amsden wrote:

> Both VMX and SVM require per-cpu memory allocation, which is done at module
> init time, for only online cpus.
>
> Backend was not allocating enough structure for all possible CPUs, so
> new CPUs coming online could not be hardware enabled.
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e27b7a9..2cd8bc2 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1716,9 +1716,6 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,
>  {
>      int cpu = (long)v;
>
> -    if (!kvm_usage_count)
> -        return NOTIFY_OK;
> -
>      val &= ~CPU_TASKS_FROZEN;
>      switch (val) {
>      case CPU_DYING:


I still don't see how this bit can work.  Maybe if we move the 
notification registration to the point where kvm_usage_count is bumped.
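
A minimal userspace sketch of that suggestion, assuming hypothetical helper names (usage_get, usage_put, hotplug_notifier_register and friends are illustrations, not KVM functions). The notifier is registered on the 0 -> 1 transition of the usage count and dropped on 1 -> 0, so no hotplug event can arrive while the count is zero. Real kernel code would also need a lock around the count.

```c
#include <assert.h>

/* Hypothetical stand-ins that only record what happened. */
static int usage_count;          /* models kvm_usage_count */
static int notifier_registered;  /* 1 while the hotplug notifier is active */
static int register_calls;       /* how many times we registered */

static void hotplug_notifier_register(void)
{
    notifier_registered = 1;
    register_calls++;
}

static void hotplug_notifier_unregister(void)
{
    notifier_registered = 0;
}

/* First user: start listening for hotplug events. */
void usage_get(void)
{
    if (usage_count++ == 0)
        hotplug_notifier_register();
}

/* Last user gone: stop listening, so no event sees a zero count. */
void usage_put(void)
{
    assert(usage_count > 0);
    if (--usage_count == 0)
        hotplug_notifier_unregister();
}
```

With this shape the early "if (!kvm_usage_count) return" check in the callback becomes unnecessary, because the callback cannot run while the count is zero.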


--
error compiling committee.c: too many arguments to function



Re: [patch 1/2] test: add wait parameter to on_cpu

2009-09-29 Thread Avi Kivity

On 09/28/2009 01:22 PM, Marcelo Tosatti wrote:

> To determine whether to wait for IPI to finish on remote cpu.
>
> Signed-off-by: Marcelo Tosatti <mtosa...@redhat.com>
>
> Index: qemu-kvm/kvm/user/test/lib/x86/smp.c
> ===
> --- qemu-kvm.orig/kvm/user/test/lib/x86/smp.c
> +++ qemu-kvm/kvm/user/test/lib/x86/smp.c
> @@ -82,7 +82,7 @@ static void setup_smp_id(void *data)
>      asm ("mov %0, %%gs:0" : : "r"(apic_id()) : "memory");
>  }
>
> -void on_cpu(int cpu, void (*function)(void *data), void *data)
> +void on_cpu(int cpu, void (*function)(void *data), void *data, int wait)
>  {
>      spin_lock(&ipi_lock);
>      if (cpu == smp_id())
> @@ -93,9 +93,11 @@ void on_cpu(int cpu, void (*function)(vo
>          apic_icr_write(APIC_INT_ASSERT | APIC_DEST_PHYSICAL | APIC_DM_FIXED
>                         | IPI_VECTOR,
>                         cpu);
> -        while (!ipi_done)
> -            ;
> -        ipi_done = 0;
> +        if (wait) {
> +            while (!ipi_done)
> +                ;
> +            ipi_done = 0;
> +        }
>      }
>      spin_unlock(&ipi_lock);
>  }
> @@ -109,6 +111,6 @@ void smp_init(void)
>
>      setup_smp_id(0);
>      for (i = 1; i < cpu_count(); ++i)
> -        on_cpu(i, setup_smp_id, 0);
> +        on_cpu(i, setup_smp_id, 0, 1);
>
>  }
> Index: qemu-kvm/kvm/user/test/lib/x86/smp.h
> ===
> --- qemu-kvm.orig/kvm/user/test/lib/x86/smp.h
> +++ qemu-kvm/kvm/user/test/lib/x86/smp.h
> @@ -9,7 +9,7 @@ void smp_init(void);
>
>  int cpu_count(void);
>  int smp_id(void);
> -void on_cpu(int cpu, void (*function)(void *data), void *data);
> +void on_cpu(int cpu, void (*function)(void *data), void *data, int wait);
>  void spin_lock(struct spinlock *lock);
>  void spin_unlock(struct spinlock *lock);
>
> Index: qemu-kvm/kvm/user/test/x86/smptest.c
> ===
> --- qemu-kvm.orig/kvm/user/test/x86/smptest.c
> +++ qemu-kvm/kvm/user/test/x86/smptest.c
> @@ -20,6 +20,6 @@ int main()
>      ncpus = cpu_count();
>      printf("found %d cpus\n", ncpus);
>      for (i = 0; i < ncpus; ++i)
> -        on_cpu(i, ipi_test, (void *)(long)i);
> +        on_cpu(i, ipi_test, (void *)(long)i, 1);
>      return 0;
>  }
> Index: qemu-kvm/kvm/user/test/x86/vmexit.c
> ===
> --- qemu-kvm.orig/kvm/user/test/x86/vmexit.c
> +++ qemu-kvm/kvm/user/test/x86/vmexit.c
> @@ -63,14 +63,14 @@ static void nop(void *junk)
>
>  static void ipi(void)
>  {
> -    on_cpu(1, nop, 0);
> +    on_cpu(1, nop, 0, 1);
>  }
>
>  static void ipi_halt(void)
>  {
>      unsigned long long t;
>
> -    on_cpu(1, nop, 0);
> +    on_cpu(1, nop, 0, 1);
>      t = rdtsc() + 2000;
>      while (rdtsc() < t)
>          ;


Boolean parameters are unreadable at the call site.  I'd much prefer a 
new API (on_cpu_async(), with on_cpu_join() to wait for completion).
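
A rough sketch of what such a split API could look like. This is a single-threaded model, not the test library's code: the "remote CPU" is simulated by a hypothetical remote_cpu_poll() that stands in for the IPI handler, and the example callback bump() is also made up for illustration. Only the names on_cpu, on_cpu_async, on_cpu_join and ipi_done come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* State shared between the caller and the (simulated) remote CPU. */
static void (*pending_fn)(void *);
static void *pending_data;
static volatile int ipi_done = 1;   /* 1 = no call in flight */

/* Hypothetical stand-in for the IPI handler on the target CPU. */
static void remote_cpu_poll(void)
{
    if (pending_fn != NULL) {
        pending_fn(pending_data);
        pending_fn = NULL;
        ipi_done = 1;
    }
}

/* Queue the call and return immediately. */
void on_cpu_async(int cpu, void (*function)(void *data), void *data)
{
    (void)cpu;                      /* the model has one remote CPU */
    assert(ipi_done);               /* one outstanding call at a time */
    ipi_done = 0;
    pending_data = data;
    pending_fn = function;
}

/* Block until the queued call has completed. */
void on_cpu_join(void)
{
    while (!ipi_done)
        remote_cpu_poll();          /* real code would only spin here */
}

/* The synchronous call is then the composition of the two. */
void on_cpu(int cpu, void (*function)(void *data), void *data)
{
    on_cpu_async(cpu, function, data);
    on_cpu_join();
}

/* Example callback: increment the int that data points to. */
static void bump(void *data)
{
    ++*(int *)data;
}
```

Call sites then read as on_cpu_async(1, bump, &hits); ... on_cpu_join(); with no unexplained 0 or 1 argument.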


--
error compiling committee.c: too many arguments to function



Re: [patch 2/2] test: vmexit: run parallel tests on each cpu

2009-09-29 Thread Avi Kivity

On 09/28/2009 01:22 PM, Marcelo Tosatti wrote:

> So one can measure SMP overhead.
>
> +
> +    for (n = cpu_count(); n > 0; n--)
> +        on_cpu(n-1, do_tests, 0, 0);


Should be done inside do_test(), so we can start the measurement on all 
cpus at the same time (right now, if some cpus calibrate earlier, they 
will finish sooner and the others will have an easier time).
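
One way to get a common start is a counting barrier that every CPU passes after its own setup and before it starts measuring. The sketch below uses C11 atomics; the start_barrier type and its functions are hypothetical names for illustration and are not part of the test library.

```c
#include <assert.h>
#include <stdatomic.h>

/* Nobody starts measuring until all participants have arrived. */
struct start_barrier {
    atomic_int arrived;   /* CPUs that finished their setup */
    int expected;         /* CPUs taking part in the test */
};

void start_barrier_init(struct start_barrier *b, int ncpus)
{
    assert(ncpus > 0);
    atomic_init(&b->arrived, 0);
    b->expected = ncpus;
}

/* Announce that this CPU has finished setup (e.g. calibration). */
void start_barrier_arrive(struct start_barrier *b)
{
    atomic_fetch_add(&b->arrived, 1);
}

/* True once every participating CPU has arrived. */
int start_barrier_ready(struct start_barrier *b)
{
    return atomic_load(&b->arrived) >= b->expected;
}

/* What each CPU would call just before starting its measurement. */
void start_barrier_wait(struct start_barrier *b)
{
    start_barrier_arrive(b);
    while (!start_barrier_ready(b))
        ;   /* spin; there is no scheduler to yield to */
}
```

Each CPU would calibrate first, then call start_barrier_wait(), so a CPU that calibrates early waits for the rest instead of running its test against idle neighbours.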


--
error compiling committee.c: too many arguments to function



Re: Stack corruption on 2.6.31 [with KVM]

2009-09-29 Thread Avi Kivity

On 09/29/2009 12:18 PM, Con Kolivas wrote:

> Resending with a couple of extra CCs.
>
> Only extra details I guess were debian stable with kvm that reports itself as:
> QEMU PC emulator version 0.9.1 (kvm-72), Copyright (c) 2003-2008 Fabrice
> Bellard
>
> Note that I get this stack corruption *every time*, and it must be with
> the -no-kvm option.


-no-kvm means kvm is not in use, so unlikely to be the cause of the problem.

Please send to qemu-de...@nongnu.org, but first try with qemu 0.11.0.

--
error compiling committee.c: too many arguments to function



[ kvm-Bugs-2868883 ] netkvm.sys stops sending/receiving on Windows Server 2003 VM

2009-09-29 Thread SourceForge.net
Bugs item #2868883, was opened at 2009-09-28 16:27
Message generated for change (Comment added) made by yanv
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2868883&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Mark Weaver (mdw21)
Assigned to: Nobody/Anonymous (nobody)
Summary: netkvm.sys stops sending/receiving on Windows Server 2003 VM

Initial Comment:
This usually happens within an hour or two of starting the interface.  It can 
be cured temporarily by disabling/enabling the adapter within Windows.  I've 
run the Windows interface with log level set to 2 -- when traffic stops it 
still logs outgoing traffic as normal but ParaNdis_ProcessRxPath stops being 
logged.  I suspect this is to do with the traffic content or timing as I cannot 
reproduce this with iperf, but only with external traffic to a website hosted 
on the machine. 

What further steps can I take to debug this issue?

Host details:

2 x dual core xeons:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU   E5410  @ 2.33GHz
stepping        : 6
cpu MHz         : 2327.685
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm
constant_tsc arch_perfmon pebs bts rep_good pni dtes64 monitor ds_cpl vmx est
tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm tpr_shadow vnmi flexpriority
bogomips        : 4655.37
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:

kernel is 2.6.31 from kernel.org, userspace is debian lenny, all 64-bit
qemu is qemu-kvm-0.10.6

Guest details:
Windows Server 2003 32-bit

qemu is started as:
qemu-system-x86_64 \
-boot c \
-drive file=/data/vms/stooge/boot.raw,if=virtio,boot=on,cache=off \
-m 3072 \
-smp 1 \
-vnc 10.80.80.89:2 \
-k en-gb \
-net nic,model=virtio,macaddr=DE:AD:BE:EF:11:29 \
-net tap,ifname=tap0 \
-localtime \
-usb -usbdevice tablet \
-mem-path /hugepages 


--

Comment By: Yan Vugenfirer (yanv)
Date: 2009-09-29 12:34

Message:
1. Could you please attach the log
2. Could you be more specific on the scenario? Are you running some tests
or network application? 
3. You could raise debug level even more to level 6 - that would give the
information about the rings (how much space is left and etc) 
4. In the code you could add debug prints to ParaNdis5_MiniportISR to
check if the driver even receives the interrupt.


Thanks.

--

Comment By: Mark Weaver (mdw21)
Date: 2009-09-28 16:43

Message:
should have said that this was tested both with the binary driver from:

https://sourceforge.net/projects/kvm/files/kvm-guest-drivers-windows/2/kvm-guest-drivers-windows-2.zip/download

and also a self-compiled driver from the kvm git tree

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2868883&group_id=180599


cpu affinity issue on kvm-88

2009-09-29 Thread Haneef Syed
Hi,

I am using a 2.6.21 kernel with the kvm-76 module inserted.

It works fine when I set CPU affinity (e.g. taskset 1
./qemu-system-x86_64 -hda <image name>).

I have a quad-core Intel Xeon processor.

Without CPU affinity, however, it hangs randomly (./qemu-system-x86_64
-hda <image name>).

Any suggestions?




Re: Stack corruption on 2.6.31 [with KVM]

2009-09-29 Thread Con Kolivas
Resending with a couple of extra CCs.

Only extra details I guess were debian stable with kvm that reports itself as:
QEMU PC emulator version 0.9.1 (kvm-72), Copyright (c) 2003-2008 Fabrice 
Bellard

Note that I get this stack corruption *every time*, and it must be with 
the -no-kvm option.

Con


On Sun, 27 Sep 2009 08:16:10 you wrote:
 Hi, I've had a stack corruption error on 2.6.31 on x86_64 kvm
 with the following options:

 kvm -smp 4 -kernel arch/x86/boot/bzImage -s -append
 console=ttyS0,serial=115200 -nographic -no-kvm

 list *get_random_int+0xa5
 0x812d5e85 is in get_random_int
 (/home/con/kernel/linux-2.6.31/drivers/char/random.c:1678).
 1673        int ret;
 1674
 1675        keyptr = get_keyptr();
 1676        hash[0] += current->pid + jiffies + get_cycles();
 1677
 1678        ret = half_md4_transform(hash, keyptr->secret);
 1679        put_cpu_var(get_random_int_hash);
 1680
 1681        return ret;
 1682 }

 dmesg follows:
 [0.00] Initializing cgroup subsys cpuset
 [0.00] Linux version 2.6.31 (c...@duo) (gcc version 4.3.2 (Debian 4.3.2-1.1) ) #132 SMP PREEMPT Sun Sep 27 08:05:33 EST 2009
 [0.00] Command line: console=ttyS0,serial=115200
 [0.00] KERNEL supported cpus:
 [0.00]   Intel GenuineIntel
 [0.00]   AMD AuthenticAMD
 [0.00]   Centaur CentaurHauls
 [0.00] BIOS-provided physical RAM map:
 [0.00]  BIOS-e820:  - 0009fc00 (usable)
 [0.00]  BIOS-e820: 0009fc00 - 000a (reserved)
 [0.00]  BIOS-e820: 000e8000 - 0010 (reserved)
 [0.00]  BIOS-e820: 0010 - 07ff (usable)
 [0.00]  BIOS-e820: 07ff - 0800 (ACPI data)
 [0.00]  BIOS-e820: fffbd000 - 0001 (reserved)
 [0.00] DMI 2.4 present.
 [0.00] last_pfn = 0x7ff0 max_arch_pfn = 0x4
 [0.00] x86 PAT enabled: cpu 0, old 0x0, new 0x7010600070106
 [0.00] CPU MTRRs all blank - virtualized system.
 [0.00] init_memory_mapping: -07ff
 [0.00] ACPI: RSDP 000fb6c0 00014 (v00 QEMU  )
 [0.00] ACPI: RSDT 07ff 0002C (v01 QEMU   QEMURSDT 0001 QEMU 0001)
 [0.00] ACPI: FACP 07ff002c 00074 (v01 QEMU   QEMUFACP 0001 QEMU 0001)
 [0.00] ACPI: DSDT 07ff0100 0253C (v01   BXPC   BXDSDT 0001 INTL 20061109)
 [0.00] ACPI: FACS 07ff00c0 00040
 [0.00] ACPI: APIC 07ff2640 000E0 (v01 QEMU   QEMUAPIC 0001 QEMU 0001)
 [0.00] (6 early reservations) == bootmem [00 - 0007ff]
 [0.00]   #0 [00 - 001000]   BIOS data page == [00 - 001000]
 [0.00]   #1 [006000 - 008000]   TRAMPOLINE == [006000 - 008000]
 [0.00]   #2 [000100 - 0005e3d7f0]TEXT DATA BSS == [000100 - 0005e3d7f0]
 [0.00]   #3 [09fc00 - 10]BIOS reserved == [09fc00 - 10]
 [0.00]   #4 [0005e3e000 - 0005e3e065]  BRK == [0005e3e000 - 0005e3e065]
 [0.00]   #5 [008000 - 009000]  PGTABLE == [008000 - 009000]
 [0.00] found SMP MP-table at [880fb540] fb540
 [0.00] Zone PFN ranges:
 [0.00]   DMA  0x - 0x1000
 [0.00]   DMA320x1000 - 0x0010
 [0.00]   Normal   0x0010 - 0x0010
 [0.00] Movable zone start PFN for each node
 [0.00] early_node_map[2] active PFN ranges
 [0.00] 0: 0x - 0x009f
 [0.00] 0: 0x0100 - 0x7ff0
 [0.00] ACPI: PM-Timer IO Port: 0xb008
 [0.00] ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
 [0.00] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
 [0.00] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)
 [0.00] ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] enabled)
 [0.00] ACPI: LAPIC (acpi_id[0x04] lapic_id[0x04] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x05] lapic_id[0x05] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x06] lapic_id[0x06] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x07] lapic_id[0x07] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x08] lapic_id[0x08] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x09] lapic_id[0x09] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x0a] lapic_id[0x0a] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x0b] lapic_id[0x0b] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x0c] lapic_id[0x0c] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x0d] lapic_id[0x0d] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x0e] lapic_id[0x0e] disabled)
 [0.00] ACPI: LAPIC (acpi_id[0x0f] lapic_id[0x0f] disabled)
 [0.00] ACPI: IOAPIC (id[0x04] address[0xfec0] gsi_base[0])
 [0.00] IOAPIC[0]: apic_id 4, version 17, 

Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-29 Thread Zhai, Edwin

Avi,
Any comments for this new patch?
Thanks,


Zhai, Edwin wrote:
> Avi Kivity wrote:
>>> +#define KVM_VMX_DEFAULT_PLE_GAP    41
>>> +#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
>>> +static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
>>> +module_param(ple_gap, int, S_IRUGO);
>>> +
>>> +static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
>>> +module_param(ple_window, int, S_IRUGO);
>>
>> Shouldn't be __read_mostly since they're read very rarely (__read_mostly
>> should be for variables that are very often read, and rarely written).
>
> In general, they are read only except that experienced user may try
> different parameter for perf tuning.
>
>> I'm not even sure they should be parameters.
>
> For different spinlock in different OS, and for different workloads, we
> need different parameter for tuning. It's similar as the enable_ept.
>
>>>  /*
>>> + * Indicate a busy-waiting vcpu in spinlock. We do not enable the PAUSE
>>> + * exiting, so only get here on cpu with PAUSE-Loop-Exiting.
>>> + */
>>> +static int handle_pause(struct kvm_vcpu *vcpu,
>>> +                        struct kvm_run *kvm_run)
>>> +{
>>> +    ktime_t expires;
>>> +    skip_emulated_instruction(vcpu);
>>> +
>>> +    /* Sleep for 1 msec, and hope lock-holder got scheduled */
>>> +    expires = ktime_add_ns(ktime_get(), 100UL);
>>
>> I think this should be much lower, 50-100us.  Maybe this should be a
>> parameter.  With 1ms we losing significant cpu time if the congestion
>> clears.
>
> I have made it a parameter with default value of 100 us.
>
>>> +    set_current_state(TASK_INTERRUPTIBLE);
>>> +    schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
>>> +
>>
>> Please add a tracepoint for this (since it can cause significant change
>> in behaviour),
>
> Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE
> vmexit from other vmexits.
>
>> and move the logic to kvm_main.c.  It will be reused by
>> the AMD implementation, possibly my software spinlock detector,
>> paravirtualized spinlocks, and hopefully other architectures.
>
> Done.
>
>>> +    return 1;
>>> +}
>>> +
>>> +/*



[ kvm-Bugs-2869748 ] KVM-88 broke VirtIO Hard Disks

2009-09-29 Thread SourceForge.net
Bugs item #2869748, was opened at 2009-09-29 14:47
Message generated for change (Tracker Item Submitted) made by dietmarmaurer
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2869748&group_id=180599

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: qemu
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Dietmar Maurer (dietmarmaurer)
Assigned to: Nobody/Anonymous (nobody)
Summary: KVM-88 broke VirtIO Hard Disks

Initial Comment:
As already mentioned on the list, VirtIO is still broken in qemu-kvm-0.11.0.
Reverting the patch solves the problem.

   It turned out to be a Qemu merge into KVM userspace:
   kvm-87-119-ga8b7f95 (commit
  a8b7f959d1fd97c4ccaf08ce750020ecd08b4c88)
  
   Can you look into it?
 
  Not sure if you familiar with this, but anyway:
 
  $ git diff-tree -p bf011293f  | patch -R -p1


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2869748&group_id=180599


[ kvm-Bugs-2868883 ] netkvm.sys stops sending/receiving on Windows Server 2003 VM

2009-09-29 Thread SourceForge.net
Bugs item #2868883, was opened at 2009-09-28 16:27
Message generated for change (Comment added) made by yanv
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2868883&group_id=180599

--

Comment By: Yan Vugenfirer (yanv)
Date: 2009-09-29 14:51

Message:
Another thing to test - could you please run the guest without /hugepages
option.



--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2868883&group_id=180599


migrate_set_downtime bug

2009-09-29 Thread Dietmar Maurer
Using 0.11.0, live migration works as expected, but the max downtime setting
does not seem to work. For example:

# migrate_set_downtime 1

After that, TCP migration has much longer downtimes (up to 20 seconds).

Also, the monitor seems to be locked (it takes up to 10 seconds until I get
a monitor prompt).

Does anyone else see this behavior?

- Dietmar



kvm guest: hrtimer: interrupt too slow

2009-09-29 Thread Michael Tokarev

Hello.

I'm having quite an.. unusable system here.
It's not really a regression with 0.11.0,
it was something similar before, but with
0.11.0 and/or 2.6.31 it became much worse.

The thing is that after some uptime, kvm
guest prints something like this:

hrtimer: interrupt too slow, forcing clock min delta to 461487495 ns

after which system (guest) speed becomes
very slow.  The above message is from a
2.6.31 guest running with 0.11.0 and a 2.6.31
host.  Before that I tried it with 0.10.6 and
2.6.30 or 2.6.27, and the deltas were a
bit less than that:

hrtimer: interrupt too slow, forcing clock min delta to 15415 ns
hrtimer: interrupt too slow, forcing clock min delta to 93629025 ns

etc.
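
To put those numbers in perspective, here is a small conversion sketch. It assumes the reported value acts as a lower bound on the spacing between programmed timer events; that reading, and the helper names ns_to_ms and max_events_per_sec, are assumptions for illustration and are not taken from the kernel source.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/* Whole milliseconds in a nanosecond count (truncating). */
uint64_t ns_to_ms(uint64_t ns)
{
    return ns / NSEC_PER_MSEC;
}

/* Upper bound on timer events per second if no two events can be
 * programmed closer together than min_delta_ns. */
uint64_t max_events_per_sec(uint64_t min_delta_ns)
{
    assert(min_delta_ns > 0);
    return NSEC_PER_SEC / min_delta_ns;
}
```

Under that assumption, 461487495 ns is about 461 ms, or at most two timer events per second, which would be consistent with keystrokes appearing after a noticeable pause. 93629025 ns is about 93 ms, and 15415 ns is well under a millisecond.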

Before, guest was just somewhat slow.  But
now, it reached the state when it's almost
unusable.

It's especially visible in xterm
(running on that guest): typing a single
char in xterm results in it being displayed
after a 1..2 sec pause.  With applications
like a web browser (where all the font
rendering is done on the client and the X
server receives a bitmap instead of a
character code to draw), it isn't that
bad.

I suspect it's only happening with small
network packets.

The above message (with large min delta)
occurred after about 20 hours uptime.  Similar
(give or take) delays were observed previously
as well.  That is to say, it's not something
you notice immediately.

The problem is quite consistent, that is,
different guests show it sooner or later.

Right now I'm running 2.6.31-amd64 host,
32bit qemu-kvm-0.11.0, 2.6.31-i686 uniprocessor
guest, virtio networking and block devices,
kvm-clock is enabled.  Similar configurations
were used before (with different versions).
The hardware is Phenom 9750 (4core) running
on Amd780g/sb700 chipset.

Any hints on what to do with all this?

Thanks!

/mjt


Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-29 Thread Avi Kivity

On 09/28/2009 11:33 AM, Zhai, Edwin wrote:


> Avi Kivity wrote:
>>> +#define KVM_VMX_DEFAULT_PLE_GAP    41
>>> +#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
>>> +static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
>>> +module_param(ple_gap, int, S_IRUGO);
>>> +
>>> +static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
>>> +module_param(ple_window, int, S_IRUGO);
>>
>> Shouldn't be __read_mostly since they're read very rarely
>> (__read_mostly should be for variables that are very often read, and
>> rarely written).
>
> In general, they are read only except that experienced user may try
> different parameter for perf tuning.



__read_mostly doesn't just mean it's read mostly.  It also means it's 
read often.  Otherwise it's just wasting space in hot cachelines.
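
A userspace illustration of that distinction. The my_read_mostly macro below is a hypothetical analogue of the kernel annotation, the section name is illustrative, and the two variables and their accessors are made up for this sketch. It only places data in a separate ELF section; it does not reproduce the kernel's linker-script layout.

```c
#include <assert.h>

/* Hypothetical analogue of __read_mostly: on ELF targets, group such
 * variables in their own section, away from frequently written data. */
#if defined(__GNUC__) && defined(__ELF__)
#define my_read_mostly __attribute__((__section__(".data.read_mostly")))
#else
#define my_read_mostly
#endif

/* Read on every fast-path call, written once at startup: a good fit. */
static int fast_path_enabled my_read_mostly = 1;

/* Read only during setup: annotating it would just take up room next
 * to the hot data, so it stays a plain variable. */
static int setup_window = 4096;

int fast_path_step(int x)
{
    return fast_path_enabled ? x + 1 : x;
}

int setup_value(void)
{
    assert(setup_window > 0);
    return setup_window;
}
```

The annotation changes only where the variable lives, not how it behaves, so the deciding question is how often it is read on hot paths.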





>> I'm not even sure they should be parameters.
>
> For different spinlock in different OS, and for different workloads,
> we need different parameter for tuning. It's similar as the enable_ept.


No, global parameters don't work for tuning workloads and guests since 
they cannot be modified on a per-guest basis.  enable_ept is only useful 
for debugging and testing.





>>> +set_current_state(TASK_INTERRUPTIBLE);
>>> +schedule_hrtimeout(&expires, HRTIMER_MODE_ABS);
>>> +
>>
>> Please add a tracepoint for this (since it can cause significant
>> change in behaviour),
>
> Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE
> vmexit from other vmexits.


Right.  I thought of the software spinlock detector, but that's another 
problem.


I think you can drop the sleep_time parameter, it can be part of the 
function.  Also kvm_vcpu_sleep() is confusing, we also sleep on halt.  
Please call it kvm_vcpu_on_spin() or something (since that's what the 
guest is doing).


--
error compiling committee.c: too many arguments to function



Re: kvm guest: hrtimer: interrupt too slow

2009-09-29 Thread Michael Tokarev

Avi Kivity wrote:

> On 09/29/2009 03:12 PM, Michael Tokarev wrote:
[]
>> The thing is that after some uptime, kvm
>> guest prints something like this:
>>
>> hrtimer: interrupt too slow, forcing clock min delta to 461487495 ns
[]
> What happens if you use hpet or pmtimer as guest clocksource?


For all the guests I have handy, only 2 clocksources
are available: kvm-clock and acpi_pm.  The host itself
has hpet turned off because it itself had issues with
hpet, and after many tries we finally turned it off
(there were several long and painful threads on lkml
about this).

Dunno why pmtimer isn't available.

I tried switching to acpi_pm on a running guest (which
is still running in very slow mode), but that did not
make any difference - i.e., it did not become fast
again.


> Please post host /proc/cpuinfo.


Here's the cpuinfo from host (for last core):

processor       : 3
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 2
model name      : AMD Phenom(tm) 9750 Quad-Core Processor
stepping        : 3
cpu MHz         : 1200.000
cache size      : 512 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 3
initial apicid  : 3
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb
rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni
monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a
misalignsse 3dnowprefetch osvw ibs
bogomips        : 4814.78
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

It has cpufreq enabled (ondemand), but turning that off
does not change anything (that was the first thing I
tried, but AFTER the guest became slow).

Thanks!

/mjt


[ kvm-Bugs-2868883 ] netkvm.sys stops sending/receiving on Windows Server 2003 VM

2009-09-29 Thread SourceForge.net
Bugs item #2868883, was opened at 2009-09-28 15:27
Message generated for change (Comment added) made by mdw21
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=893831&aid=2868883&group_id=180599

--

Comment By: Mark Weaver (mdw21)
Date: 2009-09-29 15:00

Message:
 1. Could you please attach the log

Too big for sf.net, I have put a log here:

http://www.blushingpenguin.com/kvm/netkvm.log.bz2

During that log, it appeared that outgoing packets were still being
transmitted, but incoming packets were not being received.  This was
verified by running ping on the guest and using tcpdump on the host.
After a while packets started being received again.  The pattern can be
seen with:

grep received netkvm.log > foo

Up to 2235.26074219, packets are being pulled out regularly, generally
1-2 packets at a time.  After that they start being pulled out
irregularly and in greater numbers.  After 2870.28808594, normal service
is resumed.
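
The same pattern can be extracted mechanically instead of by reading the
grep output. The following is only a sketch: it assumes every log line
carries a timestamp written like 2235.26074219 (seconds with eight
decimals) and that receive events contain the word "received"; the
function name and the 5-second threshold are hypothetical.

```python
import re

# Assumption: timestamps look like 2235.26074219 (eight decimals).
TIMESTAMP = re.compile(r"(\d+\.\d{8})")


def receive_gaps(lines, min_gap=5.0):
    """Return (start, end) pairs where no 'received' line was logged
    for at least min_gap seconds."""
    stamps = []
    for line in lines:
        if "received" not in line:
            continue
        match = TIMESTAMP.search(line)
        if match:
            stamps.append(float(match.group(1)))
    gaps = []
    # Compare each receive timestamp with the one that follows it.
    for earlier, later in zip(stamps, stamps[1:]):
        if later - earlier >= min_gap:
            gaps.append((earlier, later))
    return gaps


sample = [
    "1.00000000 received 1 packet",
    "2.00000000 received 2 packets",
    "3.50000000 sent 1 packet",
    "10.00000000 received 9 packets",
]
gaps = receive_gaps(sample)
```

Feeding it the lines of the decompressed log (for example via
open("netkvm.log")) would list the intervals in which the receive path
went quiet.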

 2. Could you be more specific on the scenario? Are you running some tests
 or network application?

It's running websites under IIS.  I tried to reproduce this issue with
various iperf scenarios but failed to do so.

 3. You could raise debug level even more to level 6 - that would give
 the information about the rings (how much space is left, etc.)

I have raised the level to 7 (the level of the log linked to above).  

 4. In the code you could add debug prints to ParaNdis5_MiniportISR to
 check if the driver even receives the interrupt.

It appears that DEBUG_EXIT_STATUS(7, (ULONG)b); is in the function 
ParaNdis5_MiniportISR so I assume this is sufficient.

 (5). Another thing to test - could you please run the guest without
 /hugepages option.

The same issue occurs without hugepages.


--

Comment By: Yan Vugenfirer (yanv)
Date: 2009-09-29 13:51

Message:
Another thing to test - could you please run the guest without /hugepages
option.



--

Comment By: Yan Vugenfirer (yanv)
Date: 2009-09-29 11:34

Message:
1. Could you please attach the log
2. Could you be more specific on the scenario? Are you running some tests
or network application? 
3. You could raise debug level even more to level 6 - that would give the
information about the rings (how much space is left and etc) 
4. In the code you could add debug prints to ParaNdis5_MiniportISR to
check if the driver even receives the interrupt.


Thanks.


[PATCH] Updating cerberus test suite module to use CTCS2

2009-09-29 Thread Lucas Meneghel Rodrigues
Instead of using the unmaintained original CTCS suite,
use the CTCS2 project. Since it's not necessary to keep
the old version around, just bump the source version and
add patches to fix the build under 64 bit architectures

New features of the test:

 * Using a newer version than the existing cerberus test.
 * Users can specify cerberus testcases by providing command line options in
 the control file.
 * Added a patch to fix the makefile so that cerberus runs on
 x86_64 systems.

Notes from Lucas

 * I removed the binary diffs for brevity, this is just for the sake of
 documentation
 * I was thinking about a good default time to let this test run, and I am
 tentatively setting the test to run for 1 hour and setting the timeout to
 1 hour and 5 minutes, which seems long enough to get meaningful results out
 of it. If we come to the conclusion that this is too long I am going to change
 it at a later time. Obviously that's configurable on the test config file.

Signed-off-by: Cao, Chen k...@redhat.com
---
 client/tests/cerberus/0001-fix-ctcs2-build.patch   |  174 +++
 client/tests/cerberus/0002-compile-on-64bit.patch  |  175 
 client/tests/cerberus/cerberus.py  |   57 ---
 client/tests/cerberus/control  |   14 +-
 client/tests/cerberus/ctcs-1.3.1pre1.tar.bz2   |  Bin 115392 -> 0 bytes
 client/tests/cerberus/ctcs2.tar.bz2|  Bin 0 -> 2131977 bytes
 client/tests/cerberus/fix-ctcs-build.patch |   87 --
 client/tests/kvm/autotest_control/cerberus.control |   20 +++
 client/tests/kvm/kvm_tests.cfg.sample  |3 +
 9 files changed, 415 insertions(+), 115 deletions(-)
 create mode 100644 client/tests/cerberus/0001-fix-ctcs2-build.patch
 create mode 100644 client/tests/cerberus/0002-compile-on-64bit.patch
 delete mode 100644 client/tests/cerberus/ctcs-1.3.1pre1.tar.bz2
 create mode 100644 client/tests/cerberus/ctcs2.tar.bz2
 delete mode 100644 client/tests/cerberus/fix-ctcs-build.patch
 create mode 100644 client/tests/kvm/autotest_control/cerberus.control

diff --git a/client/tests/cerberus/0001-fix-ctcs2-build.patch 
b/client/tests/cerberus/0001-fix-ctcs2-build.patch
new file mode 100644
index 000..acf1566
--- /dev/null
+++ b/client/tests/cerberus/0001-fix-ctcs2-build.patch
@@ -0,0 +1,174 @@
+diff --git a/runin/src/chartst.c b/runin/src/chartst.c
+index 4a20b38..63b1a5a 100644
+--- a/runin/src/chartst.c
++++ b/runin/src/chartst.c
+@@ -9,6 +9,7 @@
+ #include <unistd.h>
+ #include <stdlib.h>
+ #include <signal.h>
++#include <string.h>
+ 
+ void handler(int i) {
+   exit (0);
+diff --git a/runin/src/memtst.src/maxalloc.c b/runin/src/memtst.src/maxalloc.c
+index 5c48356..4863791 100755
+--- a/runin/src/memtst.src/maxalloc.c
++++ b/runin/src/memtst.src/maxalloc.c
+@@ -10,9 +10,6 @@
+ 
+ #if defined(__BSD__)
+   static const size_t PAGE_SIZE = 4096;
+-#else
+-/* this is horribly architecture specific */
+-  #include <asm/page.h>
+ #endif
+ 
+ 
+diff --git a/runin/src/memtst.src/memtst.c b/runin/src/memtst.src/memtst.c
+index f086f28..538f770 100755
+--- a/runin/src/memtst.src/memtst.c
++++ b/runin/src/memtst.src/memtst.c
+@@ -10,8 +10,6 @@
+ 
+ #if defined(__BSD__)
+   static const size_t PAGE_SIZE = 4096;
+-#else
+-  #include <asm/page.h>
+ #endif
+ 
+ /* The verbose global from memtst_main.c */
+@@ -331,6 +329,12 @@ void kmemscan (int *nbuf, int block_size, int offset) {
+   int kmem_file;
+   int d;
+ 
++  /* Newer linux distributions don't have asm/page.h therefore
++   * we are going to get the page size using the value of
++   * _SC_PAGESIZE instead.
++   */
++  u_long page_size = sysconf(_SC_PAGESIZE);
++
+   /* window manipulation, iterator, read retval, etc */
+   int low, high, foo;
+   int rd;
+@@ -353,7 +357,7 @@ void kmemscan (int *nbuf, int block_size, int offset) {
+ 
+   /* Now compute the offset (in chars) of the error from the page
+  boundary. */
+-  fail_page_offset = ((int) (nbuf[offset])) % PAGE_SIZE;
++  fail_page_offset = ((int) (nbuf[offset])) % page_size;
+ 
+   kmem_file = open("/proc/kcore",0);
+   if (kmem_file < 0) {
+@@ -370,7 +374,7 @@ void kmemscan (int *nbuf, int block_size, int offset) {
+* window.
+*/
+   fail_page_offset -= ((offset - low) * sizeof(int));
+-  if (fail_page_offset < 0) fail_page_offset+=PAGE_SIZE;
++  if (fail_page_offset < 0) fail_page_offset+=page_size;
+ 
+   printf("%d %x fail_page_offset\n",fail_page_offset,fail_page_offset);
+ 
+@@ -382,8 +386,8 @@ void kmemscan (int *nbuf, int block_size, int offset) {
+*/ #include <sys/types.h>
+  #include <sys/sysctl.h>
+ 
+-  lseek(kmem_file,pages*PAGE_SIZE+fail_page_offset,SEEK_SET);
+-  phys_addr=pages*PAGE_SIZE+fail_page_offset;
++  lseek(kmem_file,pages*page_size+fail_page_offset,SEEK_SET);
++  phys_addr=pages*page_size+fail_page_offset;
+ 
+   /* We now use 

RE: migrate_set_downtime bug

2009-09-29 Thread Dietmar Maurer
Seems the bwidth calculation is the problem. The code simply does:

bwidth = (bytes_transferred - bytes_transferred_last) / timediff

but I assume network traffic is buffered, so calculated bwidth is sometimes 
much too high. 
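
To illustrate the suspicion above: if a send buffer accepts data faster
than the wire can carry it, the per-iteration formula reports the speed
of the buffer, not of the network. This is a sketch with made-up
numbers (4 MiB accepted in 4 ms on a link assumed to carry roughly
125 MB/s); it is not taken from the qemu source.

```python
LINK_BYTES_PER_SEC = 125_000_000   # assumption: ~1 Gbit/s wire speed


def interval_bandwidth(bytes_now, bytes_last, elapsed_sec):
    """Per-iteration estimate, same shape as the formula quoted above."""
    return (bytes_now - bytes_last) / elapsed_sec


# Hypothetical iteration: 4 MiB are accepted by a send buffer in 4 ms,
# although the wire needs far longer to actually carry them.
queued = 4 * 1024 * 1024
measured_sec = 0.004

estimate = interval_bandwidth(queued, 0, measured_sec)
wire_sec = queued / LINK_BYTES_PER_SEC

print("estimated: %.0f MB/s" % (estimate / 1e6))
print("link:      %.0f MB/s" % (LINK_BYTES_PER_SEC / 1e6))
print("wire time: %.1f ms vs measured %.1f ms"
      % (wire_sec * 1e3, measured_sec * 1e3))
```

Whenever the measured interval is shorter than the time the wire needs,
the estimate exceeds the link speed, and any downtime prediction derived
from it is too optimistic.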

- Dietmar

 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
 Behalf Of Dietmar Maurer
 Sent: Dienstag, 29. September 2009 15:01
 To: kvm
 Subject: migrate_set_downtime bug
 
 using 0.11.0, live migration works as expected, but max downtime does
 not seem to work, for example:
 
 # migrate_set_downtime 1
 
 After that tcp migration has much longer downtimes (up to 20 seconds).
 
 Also, it seems that the 'monitor' is locked (take up to 10 seconds
 until I get a monitor prompt).
 
 Someone else get this behavior?
 
 - Dietmar
 




RE: migrate_set_downtime bug

2009-09-29 Thread Dietmar Maurer
This patch solves the problem by calculating an average bandwidth.

- Dietmar

 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
 Behalf Of Dietmar Maurer
 Sent: Dienstag, 29. September 2009 16:37
 To: kvm
 Subject: RE: migrate_set_downtime bug
 
 Seems the bwidth calculation is the problem. The code simply does:
 
 bwidth = (bytes_transferred - bytes_transferred_last) / timediff
 
 but I assume network traffic is buffered, so calculated bwidth is
 sometimes much too high.
 


migrate.diff
Description: migrate.diff


Re: migrate_set_downtime bug

2009-09-29 Thread Anthony Liguori

Dietmar Maurer wrote:

this patch solves the problem by calculating an average bandwidth.
  


Can you take a look Glauber?

Regards,

Anthony Liguori


- Dietmar

  

-Original Message-
From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
Behalf Of Dietmar Maurer
Sent: Dienstag, 29. September 2009 16:37
To: kvm
Subject: RE: migrate_set_downtime bug

Seems the bwidth calculation is the problem. The code simply does:

bwidth = (bytes_transferred - bytes_transferred_last) / timediff

but I assume network traffic is buffered, so calculated bwidth is
sometimes much too high.






Re: [KVM-AUTOTEST PATCH 1/2] Add KSM test

2009-09-29 Thread Lucas Meneghel Rodrigues
On Fri, 2009-09-25 at 05:22 -0400, Jiri Zupka wrote:
 - Dor Laor dl...@redhat.com wrote:
 
  On 09/16/2009 04:09 PM, Jiri Zupka wrote:
  
   - Dor Laordl...@redhat.com  wrote:
  
   On 09/15/2009 09:58 PM, Jiri Zupka wrote:
   After a quick review I have the following questions:
   1. Why did you implement the guest tool in 'c' and not in
  python?
   Python is much simpler and you can share some code with the
   server.
   This 'test protocol' would also be easier to understand this
   way.
  
   We need speed and the precise control of allocate memory in
  pages.
  
   2. IMHO there is no need to use select, you can do blocking
  read.
  
   We replace socket communication by interactive program
  communication
   via ssh/telnet
  
   3. Also you can use plain malloc without the more complex ( a
  bit)
   mmap.
  
   We need address exactly the memory pages. We can't allow shift of
   the data in memory.
  
   You can use the tmpfs+dd idea instead of the specific program as I
   detailed before. Maybe some other binary can be used. My intention
  is
   to
   simplify the test/environment as much as possible.
  
  
   We need compatibility with others system, like Windows etc..
   We want to add support for others system in next version
  
  KSM is a host feature and should be agnostic to the guest.
  Also I don't think your code will compile on windows...
 
 Yes, I think you are right.

First of all, sorry, I am doing the best I can to review carefully all
the patch queue, and as KSM is a more involved feature that I am not
very familiar with, I need a bit more time to review it!

 But we need to generate special data in memory pages, so we need to
 run a program on the guest side of the test, because communication
 over ssh is too slow to transfer many GB of special data to guests.
 
 We can use an optimized C program, which is 10x or more faster than a
 python script on a native system. Heavy load on the virtual guest can
 cause performance problems.

About code compiling under windows, I guess making a native windows c or
c++ program is an option, I generally agree with your reasoning, this
case seems to be better covered with a c program. Will get into it in
more detail ASAP...

 We can use tmpfs but with python script to generate special data. 
 We can't use dd with random because we need test some special case.
 (change only last 96B of page etc.. )
 
 
 What do you think about it? 
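
The special case mentioned above (pages that differ only in their last
96 bytes) could be generated along these lines. This is a sketch only:
the 4096-byte page size, the fill byte and the function name are
assumptions, and a real test would still have to place the data in
page-aligned memory on the guest.

```python
PAGE_SIZE = 4096   # assumption: 4 KiB pages
TAIL = 96          # number of trailing bytes that differ per page


def make_pages(count, fill=0xAA):
    """Build `count` pages that are identical except for their last
    TAIL bytes, which encode the page index."""
    pages = []
    for index in range(count):
        page = bytearray([fill]) * PAGE_SIZE
        # 8-byte little-endian index repeated 12 times = 96 bytes.
        page[-TAIL:] = index.to_bytes(8, "little") * (TAIL // 8)
        pages.append(bytes(page))
    return pages


pages = make_pages(3)
```

Since KSM only merges pages whose contents are fully identical, pages
built this way share a long common prefix but should not be merged,
which dd from a random source cannot produce on purpose.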
 
  
  
  



Re: [PATCH v2: kvm 4/4] Fix hotplug of CPUs for KVM.

2009-09-29 Thread Zachary Amsden

On 09/28/2009 10:30 PM, Avi Kivity wrote:

On 09/29/2009 06:04 AM, Zachary Amsden wrote:
Both VMX and SVM require per-cpu memory allocation, which is done at
module init time, for only online cpus.

Backend was not allocating enough structure for all possible CPUs, so
new CPUs coming online could not be hardware enabled.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e27b7a9..2cd8bc2 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1716,9 +1716,6 @@ static int kvm_cpu_hotplug(struct notifier_block *notifier, unsigned long val,

  {
  int cpu = (long)v;

-if (!kvm_usage_count)
-return NOTIFY_OK;
-
  val &= ~CPU_TASKS_FROZEN;
  switch (val) {
  case CPU_DYING:


I still don't see how this bit can work.  Maybe if we move the 
notification registration to the point where kvm_usage_count is bumped.


That was stray junk in the patch.  Let me rediff...



Re: migrate_set_downtime bug

2009-09-29 Thread Glauber Costa
On Tue, Sep 29, 2009 at 10:39:57AM -0500, Anthony Liguori wrote:
 Dietmar Maurer wrote:
 this patch solves the problem by calculating an average bandwidth.
   

 Can you take a look Glauber?

 Regards,

 Anthony Liguori

 - Dietmar

   
 -Original Message-
 From: kvm-ow...@vger.kernel.org [mailto:kvm-ow...@vger.kernel.org] On
 Behalf Of Dietmar Maurer
 Sent: Dienstag, 29. September 2009 16:37
 To: kvm
 Subject: RE: migrate_set_downtime bug

 Seems the bwidth calculation is the problem. The code simply does:

 bwidth = (bytes_transferred - bytes_transferred_last) / timediff

 but I assume network traffic is buffered, so calculated bwidth is
 sometimes much too high.
On the other hand, you are just calculating the total since the beginning of
migration, which is not right either.

Also, if this is really the case (buffered), then the bandwidth capping part
of migration is also wrong.

Have you compared the reported bandwidth to your actual bandwidth? I suspect
the source of the problem can be that we're currently ignoring the time we take
to transfer the state of the devices, and maybe it is not negligible.
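
The objection to a total-since-start average can be made concrete with a
small comparison. This is a sketch under assumed numbers, not the code
from migrate.diff (which is not shown in this thread): the class name,
the window of 8 samples and the traffic pattern are all hypothetical.

```python
from collections import deque


class WindowedBandwidth:
    """Average throughput over the most recent `window` samples."""

    def __init__(self, window=8):
        self.samples = deque(maxlen=window)   # (bytes, seconds) pairs

    def add(self, nbytes, seconds):
        self.samples.append((nbytes, seconds))

    def rate(self):
        total_bytes = sum(b for b, _ in self.samples)
        total_time = sum(t for _, t in self.samples)
        return total_bytes / total_time if total_time else 0.0


def cumulative_rate(samples):
    """Average since the beginning of the transfer."""
    total_time = sum(t for _, t in samples)
    return sum(b for b, _ in samples) / total_time if total_time else 0.0


# Hypothetical transfer: 20 intervals at 100 MB/s, then the link
# degrades to 10 MB/s for the last 8 intervals (0.1 s each).
history = [(10_000_000, 0.1)] * 20 + [(1_000_000, 0.1)] * 8

windowed = WindowedBandwidth(window=8)
for nbytes, seconds in history:
    windowed.add(nbytes, seconds)

print("cumulative: %.1f MB/s" % (cumulative_rate(history) / 1e6))
print("windowed:   %.1f MB/s" % (windowed.rate() / 1e6))
```

The cumulative figure still reflects the early fast phase, while the
windowed one tracks the current rate; the window length trades
smoothing against responsiveness.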



RE: migrate_set_downtime bug

2009-09-29 Thread Dietmar Maurer
 Also, if this is really the case (buffered), then the bandwidth capping
 part
 of migration is also wrong.
 
 Have you compared the reported bandwidth to your actual bandwidth? I
 suspect
 the source of the problem can be that we're currently ignoring the time
 we take
 to transfer the state of the devices, and maybe it is not negligible.
 

I have a 1GB network (e1000 card), and get values like bwidth=0.98 - which is 
much too high.
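
A quick unit check shows why 0.98 looks wrong. The sketch below assumes
that bwidth is expressed in bytes per nanosecond and that the network is
1 Gbit/s; neither is stated explicitly in this thread. The amount of
remaining RAM and the 1-second limit are made-up values for
illustration.

```python
NS_PER_SEC = 1_000_000_000


def expected_downtime_sec(remaining_bytes, bwidth_bytes_per_ns):
    """Time needed to send the remaining data at the given rate."""
    return remaining_bytes / bwidth_bytes_per_ns / NS_PER_SEC


reported = 0.98                       # value quoted in the message above
wire_limit = 1e9 / 8 / NS_PER_SEC     # 1 Gbit/s in bytes per ns (0.125)
remaining = 256 * 1024 * 1024         # hypothetical dirty RAM still to send
max_downtime = 1.0                    # hypothetical limit, in seconds

optimistic = expected_downtime_sec(remaining, reported)
realistic = expected_downtime_sec(remaining, wire_limit)

print("reported rate: %.2f Gbit/s" % (reported * 8))
print("predicted downtime: %.2f s, at wire speed: %.2f s"
      % (optimistic, realistic))
```

Under these assumptions the inflated rate predicts a downtime below the
limit, so the final stage would start, while the wire can only deliver
the data in roughly eight times that time.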

- Dietmar



Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-29 Thread Avi Kivity

On 09/27/2009 04:53 PM, Joerg Roedel wrote:


   

Depends.  If it's a global yield(), yes.  If it's a local yield() that
doesn't rebalance the runqueues we might be left with the spinning task
re-running.
 

Only one runnable task on each cpu is unlikely in a situation of high
vcpu overcommit (where pause filtering matters).

   


I think even 2:1 overcommit can degrade performance terribly.


Also, if yield means give up the remainder of our timeslice, then we
potentially end up sleeping a much longer random amount of time.  If we
yield to another vcpu in the same guest we might not care, but if we
yield to some other guest we're seriously penalizing ourselves.
 

I agree that a directed yield with possible rebalance would be good to
have, but this is very intrusive to the scheduler code and I think we
should at least try if this simpler approach already gives us good
results.
   


No objection to trying.  I'd like to see hrtimer sleep as a baseline 
since it doesn't require any core changes, and we can play with it as we 
add more core infrastructure:


- not sleeping if all vcpus are running
- true yield() instead of sleep
- directed yield
- cross cpu directed yield

--
error compiling committee.c: too many arguments to function



Re: Lock warning traceback

2009-09-29 Thread Marcelo Tosatti
On Tue, Sep 29, 2009 at 08:56:24AM +0300, Priit Laes wrote:
 Hi!
 
 I've been getting some locking-related warning when I fire up a KVM
 machine. This has been popping up since a 2.6.30 kernel:
 
 [snip]
 [119206.978058] BUG: MAX_LOCK_DEPTH too low!
 [119206.978062] turning off the locking correctness validator.
 [119206.978067] Pid: 9510, comm: kvm Not tainted 2.6.31 #44
 [119206.978071] Call Trace:
 [119206.978080]  [8108e4a9] __lock_acquire+0x94c/0x974
 [119206.978086]  [8108e59d] lock_acquire+0xcc/0xe9
 [119206.978092]  [810e106f] ? mm_take_all_locks+0xd9/0x110
 [119206.978099]  [814af652] ? mutex_lock_nested+0x23b/0x24a
 [119206.978104]  [810e0fd0] ? mm_take_all_locks+0x3a/0x110
 [119206.978109]  [814b05b9] _spin_lock_nest_lock+0x2c/0x3b
 [119206.978114]  [810e106f] ? mm_take_all_locks+0xd9/0x110
 [119206.978119]  [810e106f] mm_take_all_locks+0xd9/0x110
 [119206.978125]  [810efe83] do_mmu_notifier_register
 +0xa8/0x16a
 [119206.978130]  [810eff60] mmu_notifier_register+0xe/0x10
 [119206.978136]  [8100daba] kvm_dev_ioctl+0x136/0x2ed
 [119206.978142]  [811061f9] vfs_ioctl+0x1d/0x82
 [119206.978147]  [81106730] do_vfs_ioctl+0x45b/0x4a1
 [119206.978153]  [811fdb78] ? __up_read+0x9a/0xa2
 [119206.978158]  [8108287b] ? up_read+0x26/0x2a
 [119206.978163]  [811067b8] sys_ioctl+0x42/0x65
 [119206.978169]  [810329eb] system_call_fastpath+0x16/0x1b
 [119216.471426] kvm: emulating exchange as write
 [/snip]
 
 Although, this does not seem to be critical as my guest op-system seems
 to be working fine...

This should be fixed in recent kernels.


Re: kvm tuning guide

2009-09-29 Thread Leandro Quibem Magnabosco

Avi Kivity escreveu:

but a newbie is born every minute.

Reporting in!
As a newbie in KVM, I really appreciate those efforts.
Promise to help on docs when I realize that I know what I am doing. :)
Thank you.
--
Leandro Quibem Magnabosco.


Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Anthony Liguori

Matthew Tippett wrote:

Hi,

I would like to call attention to the SQLite performance under KVM in 
the current Ubuntu Alpha.


http://www.phoronix.com/scan.php?page=article&item=linux_2631_kvm&num=3

SQLite's benchmark as part of the Phoronix Test Suite is typically IO 
limited and is affected by both disk and filesystem performance.


Gotta love Phoronix's transparent methodology...

Ubuntu's Karmic release has _not_ been released yet.  For this 
particular test, Phoronix was probably using an alpha drop before Ubuntu 
switched from kvm-84 to qemu-kvm-0.11.0.


Before 0.11.0, there were known issues with qcow2 and it was not 
recommended for use in production environments.  If you read the release 
notes for 0.10.0, we made this very clear.  Because of some performance 
problems, in 0.10.x we made cache=writeback the default for qcow2.  We 
document this pretty thoroughly.  See 
http://www.qemu.org/qemu-doc.html#SEC10  Some other distros that shipped 
0.10.x made cache=none the default in order to ensure data integrity (at 
the cost of performance).


For 0.11.0, Kevin Wolf has fixed the performance/reliability issues in 
qcow2 and we now set cache=writethrough for qcow2 by default.
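
Because the default has changed between releases, a test harness can
avoid relying on it by always passing the cache mode explicitly. A
minimal sketch, assuming a wrapper script that assembles the qemu
command line; the helper name and the image path are hypothetical,
while file=, if= and cache= are the -drive sub-options already used
elsewhere in this archive.

```python
VALID_CACHE_MODES = ("none", "writethrough", "writeback")


def drive_option(image_path, cache, interface="virtio"):
    """Build the value for qemu's -drive option with an explicit
    cache mode instead of the version-dependent default."""
    if cache not in VALID_CACHE_MODES:
        raise ValueError("unknown cache mode: %r" % cache)
    return "file=%s,if=%s,cache=%s" % (image_path, interface, cache)


# Hypothetical invocation; the image path is a placeholder.
args = [
    "qemu-system-x86_64",
    "-drive", drive_option("/path/to/guest.qcow2", "writethrough"),
]
print(" ".join(args))
```

Recording the chosen mode alongside benchmark results also makes it
possible to tell afterwards which caching behaviour was measured.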


And FWIW, Karmic has been on the 0.11.0 tree now for at least a month.

Regards,

Anthony Liguori



Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Anthony Liguori

Avi Kivity wrote:

On 09/24/2009 10:49 PM, Matthew Tippett wrote:

The test itself is a simple usage of SQLite.  It is stock KVM as
available in 2.6.31 on Ubuntu Karmic.  So it would be the environment,
not the test.

So assuming that KVM upstream works as expected that would leave
either 2.6.31 having an issue, or Ubuntu having an issue.

Care to make an assertion on the KVM in 2.6.31?  Leaving only Ubuntu's
installation.
   


kvm has nothing to do with it, it's purely qemu.  For a long time qemu 
has defaulted to write-through caching.  This can be overridden and 
maybe that's what Ubuntu or Phoronix do.



Can some KVM developers attempt to confirm that a 'correctly'
configured KVM will not demonstrate this behaviour?
http://www.phoronix-test-suite.com/ (or is already available in newer
distributions of Fedora, openSUSE and Ubuntu.
   


A correctly configured kvm will not demonstrate this behaviour.


It was a very old kvm version (kvm-84).  But of course, the version of 
kvm is not mentioned on the Phoronix site...


Regards,

Anthony Liguori




Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Anthony Liguori

Matthew Tippett wrote:

First up, Phoronix hasn't tuned.  It's observing the delivered state
by an OS vendor.  I started with what I believe to be the starting
point - KVM.

So the position of the KVM now is that it is either QEMU's
configuration or Ubuntu's configuration.  No further guidance or
suggestions?  Note that the prevailing response here does not see the
10 fold sqlite performance with guest vs host as a problem.
  


Again, this isn't a problem.  Ubuntu updated to a newer package and the 
problem has long been resolved upstream.


Regards,

Anthony Liguori


Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Anthony Liguori

Matthew Tippett wrote:

I have created a launchpad bug against qemu-kvm in Ubuntu.

https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/437473

Just re-iterating, my concern isn't so much performance, but integrity 
of stock KVM configurations with server or other workloads that expect 
sync fileIO requests to be honored and synchronous to the underlying 
physical disk.


(That and ensuring that sanity reigns where a benchmark doesn't show a 
guest operating 10 times faster than a host for the same test :).
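
For anyone who wants to reproduce the effect before filing a report,
the difference between buffered and synchronous writes can be probed
with a few lines. This is a rough sketch, not the SQLite benchmark: the
file name, block size and write count are arbitrary, and the timings
depend entirely on the storage and on any caching layer underneath.

```python
import os
import tempfile
import time


def timed_writes(path, count, sync):
    """Write `count` 4 KiB blocks, optionally calling fsync after each
    one, and return the elapsed time in seconds."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    start = time.perf_counter()
    try:
        for _ in range(count):
            os.write(fd, b"x" * 4096)
            if sync:
                os.fsync(fd)   # ask the OS to push the data to the device
    finally:
        os.close(fd)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    probe = os.path.join(tmp, "probe.dat")
    buffered = timed_writes(probe, 200, sync=False)
    synced = timed_writes(probe, 200, sync=True)

print("buffered: %.4f s, fsync after each write: %.4f s" % (buffered, synced))
```

Running it on the host and inside a guest gives a first hint: if the
fsync variant is much faster in the guest than on the host's own disk,
something between the guest and the disk is likely absorbing the sync
requests.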


And I've closed it.  In the future, please actually reproduce a bug 
before filing it.  Reading it on a website doesn't mean it's true :-)


Regards,

Anthony Liguori


Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Matthew Tippett

Okay, bringing the leaves of the discussions onto this thread.

As per

http://www.phoronix.com/scan.php?page=article&item=linux_2631_kvm&num=1&single=1

The host OS (as well as the guest OS when testing under KVM) was 
running an Ubuntu 9.10 daily snapshot with the Linux 2.6.31 (final) kernel


I am attempting to get the actual daily snapshot to provide the 
precise version.  I should have that information shortly.  It is likely 
that it was within 1-2 weeks prior to the article posting.



 Ubuntu's Karmic release has _not_ been released yet.  For this
 particular test, Phoronix was probably using an alpha drop before
 Ubuntu switched from kvm-84 to qemu-kvm-0.11.0.

The "probably" was addressed above: it was a snapshot taken after the 
2.6.31 final release on September 9th, and the article was published on 
September 21st, so there is a finite window.


I have high confidence in the testing that Phoronix has done and don't 
expect to need to confirm the results explicitly, and I have pieced 
together the following information.  I should be able to get the actual 
daily build number but broadly it looks like it was


  Ubuntu 9.10 daily snapshot (~ 9th - 21st September)
  Linux 2.6.31 (packaged as 2.6.31-10.30 to 2.6.31-10.32)
  qemu-kvm 0.11 (packaged as 0.11.0~rc2-0ubuntu to 0.11.0~rc2-0ubuntu5)

Once I get confirmation of the actual date, deeper digging can occur.



But, if it turned out to be Ubuntu 9.10, linux 2.6.31, qemu-kvm 0.11 
would there be any concerns?




Rather than railing against Phoronix or the results as presented, I 
would prefer that we ask questions to find out more about what was 
tested, instead of writing off all of it as completely invalid.


Regards,

Matthew
 Original Message  
Subject: Re: sync guest calls made async on host - SQLite performance
From: Anthony Liguori anth...@codemonkey.ws
To: Matthew Tippett tippe...@gmail.com
Cc: Avi Kivity a...@redhat.com, RW k...@tauceti.net, kvm@vger.kernel.org
Date: 09/29/2009 03:02 PM


Matthew Tippett wrote:

I have created a launchpad bug against qemu-kvm in Ubuntu.

https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/437473

Just re-iterating, my concern isn't so much performance, but integrity 
of stock KVM configurations with server or other workloads that expect 
sync fileIO requests to be honored and synchronous to the underlying 
physical disk.


(That and ensuring that sanity reigns where a benchmark doesn't show a 
guest operating 10 times faster than a host for the same test :).


And I've closed it.  In the future, please actually reproduce a bug 
before filing it.  Reading it on a website doesn't mean it's true :-)


Regards,

Anthony Liguori




[KVM-AUTOTEST PATCH 1/6] KVM test: kvm_subprocess: use read_nonblocking(0) instead of read_nonblocking (0.1)

2009-09-29 Thread Michael Goldish
In get_command_status_output() and is_responsive() use read_nonblocking(0) to
read the unread output before sending input (e.g. a command).
The timeout is currently 0.1 because theoretically it should help if the guest
still produces output when the function is called, but in practice there's no
guarantee that a value of 0.1 will suffice.  Therefore, it is the user's
responsibility to make sure the guest stopped producing output before
get_command_status_output() is called.  This can be guaranteed (in most cases)
by using get_command_status_output() and friends instead of sendline() to send
commands (because the former waits for the prompt to return, whereas the latter
returns immediately).
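
What a zero timeout means in practice can be shown with a plain pipe.
The sketch below is not the kvm_subprocess implementation; drain() is a
hypothetical helper that uses select() with a timeout of 0, so it only
collects output that has already arrived and never waits for more
(Unix only, since it selects on a pipe).

```python
import os
import select


def drain(fd):
    """Read whatever is already waiting on fd without blocking."""
    chunks = []
    while True:
        # Timeout 0 makes select() a pure poll: it returns at once.
        readable, _, _ = select.select([fd], [], [], 0)
        if not readable:
            break
        data = os.read(fd, 4096)
        if not data:           # end of file
            break
        chunks.append(data)
    return b"".join(chunks)


read_end, write_end = os.pipe()
os.write(write_end, b"stale output\n")

leftover = drain(read_end)     # picks up the pending bytes
nothing = drain(read_end)      # nothing pending: returns immediately

os.close(read_end)
os.close(write_end)
```

Output that arrives after the drain is not seen, which is exactly why
the caller has to make sure the guest has gone quiet first.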

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_subprocess.py |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/kvm_subprocess.py 
b/client/tests/kvm/kvm_subprocess.py
index 424c801..730f20e 100755
--- a/client/tests/kvm/kvm_subprocess.py
+++ b/client/tests/kvm/kvm_subprocess.py
@@ -1028,7 +1028,7 @@ class kvm_shell_session(kvm_expect):
 
 # Read all output that's waiting to be read, to make sure the output
 # we read next is in response to the newline sent
-self.read_nonblocking(timeout=0.1)
+self.read_nonblocking(timeout=0)
 # Send a newline
 self.sendline()
 # Wait up to timeout seconds for some output from the child
@@ -1095,7 +1095,7 @@ class kvm_shell_session(kvm_expect):
 logging.debug("Sending command: %s" % command)
 
 # Read everything that's waiting to be read
-self.read_nonblocking(0.1)
+self.read_nonblocking(timeout=0)
 
 # Send the command and get its output
 self.sendline(command)
-- 
1.5.4.1



[KVM-AUTOTEST PATCH 2/6] KVM test: AutoIt and Autotest wrappers: use get_command_output() instead of sendline()

2009-09-29 Thread Michael Goldish
get_command_output() is safer because it waits for the prompt to return.
sendline() returns immediately, and the output generated in response to the
command can appear later and interfere with the test.
For example, the prompt can appear while the Autotest wrapper waits for an
Autotest test to complete, and this will make the wrapper exit as if the test
has completed.
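
The failure mode can be reproduced with a toy model. FakeSession below
is a stand-in written for this illustration, not the kvm_subprocess
API: it queues the output and prompt of each command the way a guest
shell would, so the effect of not waiting for the prompt becomes
visible.

```python
from collections import deque

PROMPT = "$ "


class FakeSession:
    """Toy shell session that queues output until someone reads it."""

    def __init__(self):
        self.pending = deque()

    def sendline(self, command):
        # Returns immediately; output and prompt stay queued.
        self.pending.append("output of %s\n" % command)
        self.pending.append(PROMPT)

    def read_up_to_prompt(self):
        collected = []
        while self.pending:
            chunk = self.pending.popleft()
            if chunk == PROMPT:
                break
            collected.append(chunk)
        return "".join(collected)

    def get_command_output(self, command):
        self.sendline(command)
        return self.read_up_to_prompt()


# Unsafe: the prompt left over from mkdir ends the next read early.
racy = FakeSession()
racy.sendline("mkdir autotest/tests")
racy_result = racy.get_command_output("bin/autotest control")

# Safe: every command consumes its own prompt before the next starts.
safe = FakeSession()
safe.get_command_output("mkdir autotest/tests")
safe_result = safe.get_command_output("bin/autotest control")
```

In the first case the caller receives the output of mkdir and believes
the test has finished; in the second it receives the output of the
command it actually ran.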

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/tests/autoit.py   |7 +++
 client/tests/kvm/tests/autotest.py |9 -
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/client/tests/kvm/tests/autoit.py b/client/tests/kvm/tests/autoit.py
index fd2a2cb..9435d7c 100644
--- a/client/tests/kvm/tests/autoit.py
+++ b/client/tests/kvm/tests/autoit.py
@@ -30,18 +30,17 @@ def run_autoit(test, params, env):
 
 # Send AutoIt script to guest (this code will be replaced once we
 # support sending files to Windows guests)
-session.sendline("del script.au3")
+session.get_command_output("del script.au3", internal_timeout=0)
 file = open(kvm_utils.get_path(test.bindir, script))
 for line in file.readlines():
 # Insert a '^' before each character
 line = "".join("^" + c for c in line.rstrip())
 if line:
 # Append line to the file
-session.sendline("echo %s>>script.au3" % line)
+session.get_command_output("echo %s>>script.au3" % line,
+   internal_timeout=0)
 file.close()
 
-session.read_up_to_prompt()
-
 command = "cmd /c %s script.au3 %s" % (binary, script_params)
 
 logging.info( Script output )
diff --git a/client/tests/kvm/tests/autotest.py b/client/tests/kvm/tests/autotest.py
index 5c9b2aa..798217d 100644
--- a/client/tests/kvm/tests/autotest.py
+++ b/client/tests/kvm/tests/autotest.py
@@ -85,7 +85,7 @@ def run_autotest(test, params, env):
 extract(vm, "autotest.tar.bz2")
 
 # mkdir autotest/tests
-session.sendline("mkdir autotest/tests")
+session.get_command_output("mkdir autotest/tests")
 
 # Extract test_name.tar.bz2 into autotest/tests
 extract(vm, test_name + ".tar.bz2", "autotest/tests")
@@ -99,10 +99,9 @@ def run_autotest(test, params, env):
 
 # Run the test
 logging.info("Running test '%s'..." % test_name)
-session.sendline("cd autotest")
-session.sendline("rm -f control.state")
-session.sendline("rm -rf results/*")
-session.read_up_to_prompt()
+session.get_command_output("cd autotest")
+session.get_command_output("rm -f control.state")
+session.get_command_output("rm -rf results/*")
 logging.info( Test output )
 status = session.get_command_status("bin/autotest control",
 timeout=test_timeout,
-- 
1.5.4.1

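
The race described in the 2/6 commit message above can be illustrated with a toy model. This is only a sketch: ToySession is a hypothetical stand-in, not the real kvm_shell_session, and it assumes that every command produces exactly one reply ending in a prompt.

```python
class ToySession:
    """Toy stand-in for a shell session; NOT the real kvm_shell_session."""

    PROMPT = "$ "

    def __init__(self):
        # Replies the guest has produced but nobody has read yet.
        self._queue = []

    def sendline(self, command):
        # Returns immediately; the reply (ending in a prompt) stays queued.
        self._queue.append("(output of %s)\n%s" % (command, self.PROMPT))

    def read_up_to_prompt(self):
        # Consume queued output up to and including the first prompt only.
        return self._queue.pop(0) if self._queue else None

    def get_command_output(self, command):
        # Send the command, then wait for its own prompt before returning.
        self.sendline(command)
        return self.read_up_to_prompt()


def stale_prompts(session):
    """Number of prompts still waiting to be read."""
    return len(session._queue)


COMMANDS = ("cd autotest", "rm -f control.state", "rm -rf results/*")

# Old pattern: three sendline() calls followed by one read_up_to_prompt().
racy = ToySession()
for cmd in COMMANDS:
    racy.sendline(cmd)
racy.read_up_to_prompt()
racy_leftover = stale_prompts(racy)

# New pattern: every command waits for its own prompt.
safe = ToySession()
for cmd in COMMANDS:
    safe.get_command_output(cmd)
safe_leftover = stale_prompts(safe)
```

In this model the old pattern leaves prompts unread, and any later "wait for the prompt" call would return at once on one of them, which is the early exit the commit message warns about.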


[KVM-AUTOTEST PATCH 4/6] KVM test: step file tests: add parameter timeout_multiplier

2009-09-29 Thread Michael Goldish
This parameter multiplies the timeout values of all the barriers in a step file
test.
It is useful for slower hosts, under load (e.g. when executing multiple tests
in parallel) and for testing QEMU without KVM.  In any of these cases, the
multiplier should be greater than 1 in order to give the test more time to
complete.

In addition to modifying tests/steps.py, this patch adds a usage example to
control and control.parallel.

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/control  |1 +
 client/tests/kvm/control.parallel |1 +
 client/tests/kvm/tests/steps.py   |   22 +++---
 3 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/client/tests/kvm/control b/client/tests/kvm/control
index 72658f1..491870a 100644
--- a/client/tests/kvm/control
+++ b/client/tests/kvm/control
@@ -140,6 +140,7 @@ filename = os.path.join(pwd, "kvm_tests.cfg")
 cfg = kvm_config.config(filename)
 
 # If desirable, make changes to the test configuration here.  For example:
+# cfg.parse_string("install|setup: timeout_multiplier = 2")
 # cfg.parse_string("only fc8_quick")
 # cfg.parse_string("display = sdl")
 
diff --git a/client/tests/kvm/control.parallel b/client/tests/kvm/control.parallel
index cf268ea..fac0176 100644
--- a/client/tests/kvm/control.parallel
+++ b/client/tests/kvm/control.parallel
@@ -136,6 +136,7 @@ filename = os.path.join(pwd, "kvm_tests.cfg")
 cfg = kvm_config.config(filename)
 
 # If desirable, make changes to the test configuration here.  For example:
+# cfg.parse_string("install|setup: timeout_multiplier = 2")
 # cfg.parse_string("only fc8_quick")
 # cfg.parse_string("display = sdl")
 
diff --git a/client/tests/kvm/tests/steps.py b/client/tests/kvm/tests/steps.py
index 8bc85f2..d0b7dbd 100644
--- a/client/tests/kvm/tests/steps.py
+++ b/client/tests/kvm/tests/steps.py
@@ -33,6 +33,21 @@ def barrier_2(vm, words, params, debug_dir, data_scrdump_filename,
 cmd, dx, dy, x1, y1, md5sum, timeout = words[:7]
 dx, dy, x1, y1, timeout = map(int, [dx, dy, x1, y1, timeout])
 
+scrdump_filename = os.path.join(debug_dir, "scrdump.ppm")
+cropped_scrdump_filename = os.path.join(debug_dir, "cropped_scrdump.ppm")
+expected_scrdump_filename = os.path.join(debug_dir, "scrdump_expected.ppm")
+expected_cropped_scrdump_filename = os.path.join(debug_dir,
+ "cropped_scrdump_expected.ppm")
+comparison_filename = os.path.join(debug_dir, "comparison.ppm")
+
+# Multiply timeout by the timeout multiplier
+timeout_multiplier = params.get("timeout_multiplier")
+if timeout_multiplier:
+timeout_multiplier = float(timeout_multiplier)
+else:
+timeout_multiplier = 1.0
+timeout *= timeout_multiplier
+
 # Timeout/5 is the time it took stepmaker to complete this step.
 # Divide that number by 10 to poll 10 times, just in case
 # current machine is stronger then the stepmaker machine.
@@ -41,13 +56,6 @@ def barrier_2(vm, words, params, debug_dir, data_scrdump_filename,
 if sleep_duration < 1.0: sleep_duration = 1.0
 if sleep_duration > 10.0: sleep_duration = 10.0
 
-scrdump_filename = os.path.join(debug_dir, "scrdump.ppm")
-cropped_scrdump_filename = os.path.join(debug_dir, "cropped_scrdump.ppm")
-expected_scrdump_filename = os.path.join(debug_dir, "scrdump_expected.ppm")
-expected_cropped_scrdump_filename = os.path.join(debug_dir,
- "cropped_scrdump_expected.ppm")
-comparison_filename = os.path.join(debug_dir, "comparison.ppm")
-
 fail_if_stuck_for = params.get("fail_if_stuck_for")
 if fail_if_stuck_for:
 fail_if_stuck_for = float(fail_if_stuck_for)
-- 
1.5.4.1

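
For readers who want to see the effect of timeout_multiplier from the 4/6 patch above in isolation, here is a minimal sketch. The helper name scale_barrier_timeout is hypothetical, and the polling interval of timeout/50 is an assumption derived from the comment in the patch context ("Timeout/5 ... Divide that number by 10"); the actual line that computes sleep_duration is not shown in the diff.

```python
def scale_barrier_timeout(timeout, params):
    """Return (scaled timeout, polling interval) for one barrier.

    Hypothetical helper; only the multiplier handling and the 1..10
    second clamp are taken from the patch text.
    """
    # Same fallback logic as the patch: missing or empty means 1.0.
    timeout_multiplier = params.get("timeout_multiplier")
    if timeout_multiplier:
        timeout_multiplier = float(timeout_multiplier)
    else:
        timeout_multiplier = 1.0
    timeout *= timeout_multiplier

    # ASSUMPTION: poll roughly every timeout/50 seconds, then clamp the
    # interval to the 1.0 .. 10.0 second range shown in the diff.
    sleep_duration = timeout / 50.0
    if sleep_duration < 1.0:
        sleep_duration = 1.0
    if sleep_duration > 10.0:
        sleep_duration = 10.0
    return timeout, sleep_duration


default_case = scale_barrier_timeout(60, {})
doubled_case = scale_barrier_timeout(60, {"timeout_multiplier": "2"})
long_case = scale_barrier_timeout(1000, {"timeout_multiplier": "2"})
```

Because the multiplication happens before the polling interval is computed, a larger multiplier also slows the polling, up to the 10 second cap.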


[KVM-AUTOTEST PATCH 3/6] KVM test: kvm_vm.py: change qemu-img timeout to 120 seconds

2009-09-29 Thread Michael Goldish
The timeout of qemu-img commands is currently 30 seconds.
This may not suffice under heavy load (e.g. when multiple tests run in parallel
and use the same physical disk).

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_vm.py |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/client/tests/kvm/kvm_vm.py b/client/tests/kvm/kvm_vm.py
index 07ceb6d..ff6c044 100755
--- a/client/tests/kvm/kvm_vm.py
+++ b/client/tests/kvm/kvm_vm.py
@@ -55,7 +55,7 @@ def create_image(params, root_dir):
 
 logging.debug("Running qemu-img command:\n%s" % qemu_img_cmd)
 (status, output) = kvm_subprocess.run_fg(qemu_img_cmd, logging.debug,
- "(qemu-img) ", timeout=30)
+ "(qemu-img) ", timeout=120)
 
 if status is None:
 logging.error("Timeout elapsed while waiting for qemu-img command "
-- 
1.5.4.1



[KVM-AUTOTEST PATCH 5/6] KVM test: tests/steps.py: simplify barrier_2()

2009-09-29 Thread Michael Goldish
Add a few comments, group similar lines and replace some blocks with one-liners.

Note: the weird form
timeout_multiplier = float(params.get("timeout_multiplier") or 1)
allows the user to fall back to the default value by setting timeout_multiplier
to "".
The more common form
timeout_multiplier = float(params.get("timeout_multiplier", 1))
will raise an exception if timeout_multiplier is "".

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/tests/steps.py |   32 ++--
 1 files changed, 10 insertions(+), 22 deletions(-)

diff --git a/client/tests/kvm/tests/steps.py b/client/tests/kvm/tests/steps.py
index d0b7dbd..11b2ae1 100644
--- a/client/tests/kvm/tests/steps.py
+++ b/client/tests/kvm/tests/steps.py
@@ -30,22 +30,27 @@ def barrier_2(vm, words, params, debug_dir, data_scrdump_filename,
 logging.error("Bad barrier_2 command line")
 return False
 
+# Parse barrier command line
 cmd, dx, dy, x1, y1, md5sum, timeout = words[:7]
 dx, dy, x1, y1, timeout = map(int, [dx, dy, x1, y1, timeout])
 
+# Define some paths
 scrdump_filename = os.path.join(debug_dir, "scrdump.ppm")
 cropped_scrdump_filename = os.path.join(debug_dir, "cropped_scrdump.ppm")
 expected_scrdump_filename = os.path.join(debug_dir, "scrdump_expected.ppm")
 expected_cropped_scrdump_filename = os.path.join(debug_dir,
 "cropped_scrdump_expected.ppm")
 comparison_filename = os.path.join(debug_dir, "comparison.ppm")
+history_dir = os.path.join(debug_dir, "barrier_history")
+
+# Collect a few parameters
+timeout_multiplier = float(params.get("timeout_multiplier") or 1)
+fail_if_stuck_for = float(params.get("fail_if_stuck_for") or 1e308)
+stuck_detection_history = int(params.get("stuck_detection_history") or 2)
+keep_screendump_history = params.get("keep_screendump_history") == "yes"
+keep_all_history = params.get("keep_all_history") == "yes"
 
 # Multiply timeout by the timeout multiplier
-timeout_multiplier = params.get("timeout_multiplier")
-if timeout_multiplier:
-timeout_multiplier = float(timeout_multiplier)
-else:
-timeout_multiplier = 1.0
 timeout *= timeout_multiplier
 
 # Timeout/5 is the time it took stepmaker to complete this step.
@@ -56,23 +61,6 @@ def barrier_2(vm, words, params, debug_dir, data_scrdump_filename,
 if sleep_duration < 1.0: sleep_duration = 1.0
 if sleep_duration > 10.0: sleep_duration = 10.0
 
-fail_if_stuck_for = params.get("fail_if_stuck_for")
-if fail_if_stuck_for:
-fail_if_stuck_for = float(fail_if_stuck_for)
-else:
-fail_if_stuck_for = 1e308
-
-stuck_detection_history = params.get("stuck_detection_history")
-if stuck_detection_history:
-stuck_detection_history = int(stuck_detection_history)
-else:
-stuck_detection_history = 2
-
-keep_screendump_history = params.get("keep_screendump_history") == "yes"
-if keep_screendump_history:
-keep_all_history = params.get("keep_all_history") == "yes"
-history_dir = os.path.join(debug_dir, "barrier_history")
-
-
 end_time = time.time() + timeout
 end_time_stuck = time.time() + fail_if_stuck_for
 start_time = time.time()
-- 
1.5.4.1

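
The difference between the two forms discussed in the 5/6 commit message above is easy to check with a plain dict standing in for params. This is a small sketch; the two function names are hypothetical and exist only for the comparison.

```python
def multiplier_with_or(params):
    # Form used by the patch: a missing key (None) and an empty string
    # are both falsy, so both fall back to the default of 1.
    return float(params.get("timeout_multiplier") or 1)


def multiplier_with_default(params):
    # More common form: the default only applies when the key is missing.
    # An empty string is passed to float() and raises ValueError.
    return float(params.get("timeout_multiplier", 1))


missing = {}
empty = {"timeout_multiplier": ""}
explicit = {"timeout_multiplier": "2.5"}

results_or = [multiplier_with_or(p) for p in (missing, empty, explicit)]

try:
    multiplier_with_default(empty)
    default_form_raised = False
except ValueError:
    default_form_raised = True
```

Both forms agree when the key is missing or holds a number; they differ only for the empty string, which is the case the note singles out.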


[KVM-AUTOTEST PATCH 6/6] KVM test: kvm_subprocess: use select() in read_until_output_matches()

2009-09-29 Thread Michael Goldish
Currently read_nonblocking() is called repeatedly until a match is found.
This is fine as long as internal_timeout, the timeout parameter passed to
read_nonblocking(), is greater than zero.  If it equals zero the loop will keep
the CPU busy and stress the host.
To avoid this, use select() to wait until there's output to read from the child
process.

Signed-off-by: Michael Goldish mgold...@redhat.com
---
 client/tests/kvm/kvm_subprocess.py |9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/client/tests/kvm/kvm_subprocess.py b/client/tests/kvm/kvm_subprocess.py
index 730f20e..2ac062a 100755
--- a/client/tests/kvm/kvm_subprocess.py
+++ b/client/tests/kvm/kvm_subprocess.py
@@ -848,8 +848,12 @@ class kvm_expect(kvm_tail):
 match = None
 data = ""
 
+fd = self._get_fd("expect")
 end_time = time.time() + timeout
-while time.time() < end_time:
+while True:
+r, w, x = select.select([fd], [], [],
+max(0, end_time - time.time()))
+if fd not in r: break
 # Read data from child
 newdata = self.read_nonblocking(internal_timeout)
 # Print it if necessary
@@ -868,7 +872,8 @@ class kvm_expect(kvm_tail):
 done = True
 # Check if child has died
 if not self.is_alive():
-logging.debug("Process terminated with status %s" % self.get_status())
+logging.debug("Process terminated with status %s" %
+  self.get_status())
 done = True
 # Are we done?
 if done: break
-- 
1.5.4.1

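
The select()-based loop from the 6/6 patch above can be tried outside kvm_subprocess with an ordinary pipe. The sketch below is not the patched method: read_until is a hypothetical free function, it reads with os.read() instead of read_nonblocking(), and it assumes a Unix-like host where select() works on pipe file descriptors.

```python
import os
import select
import time


def read_until(fd, match, timeout):
    """Read from fd until `match` (bytes) appears or `timeout` seconds pass.

    Returns (found, data). Sleeps inside select() instead of spinning.
    """
    data = b""
    end_time = time.time() + timeout
    while True:
        # Block until there is something to read or the deadline passes.
        remaining = max(0, end_time - time.time())
        readable, _, _ = select.select([fd], [], [], remaining)
        if fd not in readable:
            break  # deadline reached with no new output
        chunk = os.read(fd, 1024)
        if not chunk:
            break  # writer closed the pipe
        data += chunk
        if match in data:
            return True, data
    return False, data


# Output is already waiting: select() returns at once.
r1, w1 = os.pipe()
os.write(w1, b"login ok\n$ ")
found_prompt, prompt_data = read_until(r1, b"$ ", 1.0)

# Nothing is ever written: the call sleeps for the timeout, then gives up.
r2, w2 = os.pipe()
found_nothing, no_data = read_until(r2, b"$ ", 0.1)
```

With a zero internal timeout, the old "while time.time() < end_time" loop would call the read function continuously; here the process is idle inside select() until the child produces output.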


Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Dustin Kirkland
On Tue, Sep 29, 2009 at 2:32 PM, Matthew Tippett tippe...@gmail.com wrote:
> I would prefer rather than riling against Phoronix or the results as
> presented, ask questions to seek further information about what was tested
> rather than writing off all of it as completely invalid.

Matthew-

If you could please provide very specific instructions that you have
personally used to reproduce this problem on the latest Ubuntu Karmic
qemu-kvm-0.11.0 package, that would help very much.

I have personally tried to reproduce this problem and I don't see the
problem manifested in Ubuntu's qemu-kvm package.

For the technical people on the list, our configure line in Ubuntu
looks like this:

./configure --prefix=/usr --disable-blobs --audio-drv-list="alsa pa oss sdl" --audio-card-list="ac97 es1370 sb16 cs4231a adlib gus" ...

I don't think we're doing anything exotic, and we're not carrying any
major patches.  I have submitted each patch that we are carrying
upstream to the KVM and/or QEMU mailing lists as appropriate.

:-Dustin


Re: [PATCH 1/1] qemu-kvm: virtio-net: Re-instate GSO code removed upstream

2009-09-29 Thread Mark McLoughlin
On Tue, 2009-05-05 at 09:56 +0100, Mark McLoughlin wrote:
> This commit:
> 
>   commit 559a8f45f34cc50d1a60b4f67a06614d506b2e01
>   Subject: Remove stray GSO code from virtio_net (Mark McLoughlin)
> 
> Removed some GSO code from upstream qemu.git, but it needs to
> be re-instated in qemu-kvm.git.
> 
> Reported-by: Sridhar Samudrala s...@us.ibm.com
> Signed-off-by: Mark McLoughlin mar...@redhat.com
> ---
>  hw/virtio-net.c |5 +
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ac8e030..e5d7add 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -424,6 +424,11 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
>  if (n->promisc)
>  return 1;
>  
> +#ifdef TAP_VNET_HDR
> +if (tap_has_vnet_hdr(n->vc->vlan->first_client))
> +ptr += sizeof(struct virtio_net_hdr);
> +#endif
> +
>  if (!memcmp(&ptr[12], vlan, sizeof(vlan))) {
>  int vid = be16_to_cpup((uint16_t *)(ptr + 14)) & 0xfff;
>  if (!(n->vlans[vid >> 5] & (1U << (vid & 0x1f))))

I'm not sure[1] how we didn't notice, but this has been broken on the
stable-0.10 branch since 0.10.3; please apply there too

See:

  https://bugzilla.redhat.com/522994

Cheers,
Mark.

[1] - well, one reason is that libvirt doesn't seem to be enabling
vnet_hdr at the moment. That's the next thing to look at



Re: [PATCH 1/1] qemu-kvm: virtio-net: Re-instate GSO code removed upstream

2009-09-29 Thread Mark McLoughlin
On Tue, 2009-09-29 at 21:45 +0100, Mark McLoughlin wrote:
> On Tue, 2009-05-05 at 09:56 +0100, Mark McLoughlin wrote:
> > This commit:
> > 
> >   commit 559a8f45f34cc50d1a60b4f67a06614d506b2e01
> >   Subject: Remove stray GSO code from virtio_net (Mark McLoughlin)
> > 
> > Removed some GSO code from upstream qemu.git, but it needs to
> > be re-instated in qemu-kvm.git.
> > 
> > Reported-by: Sridhar Samudrala s...@us.ibm.com
> > Signed-off-by: Mark McLoughlin mar...@redhat.com
> > ---
> >  hw/virtio-net.c |5 +
> >  1 files changed, 5 insertions(+), 0 deletions(-)
> > 
> > diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> > index ac8e030..e5d7add 100644
> > --- a/hw/virtio-net.c
> > +++ b/hw/virtio-net.c
> > @@ -424,6 +424,11 @@ static int receive_filter(VirtIONet *n, const uint8_t *buf, int size)
> >  if (n->promisc)
> >  return 1;
> >  
> > +#ifdef TAP_VNET_HDR
> > +if (tap_has_vnet_hdr(n->vc->vlan->first_client))
> > +ptr += sizeof(struct virtio_net_hdr);
> > +#endif
> > +
> >  if (!memcmp(&ptr[12], vlan, sizeof(vlan))) {
> >  int vid = be16_to_cpup((uint16_t *)(ptr + 14)) & 0xfff;
> >  if (!(n->vlans[vid >> 5] & (1U << (vid & 0x1f))))
> 
> I'm not sure[1] how we didn't notice, but this has been broken on the
> stable-0.10 branch since 0.10.3; please apply there too
> 
> See:
> 
>   https://bugzilla.redhat.com/522994
> 
> Cheers,
> Mark.
> 
> [1] - well, one reason is that libvirt doesn't seem to be enabling
> vnet_hdr at the moment. That's the next thing to look at

Oh, another reason is that this is only a problem with 2.6.30 guests
which enable MAC based receive filtering

Cheers,
Mark.



Re: sync guest calls made async on host - SQLite performance

2009-09-29 Thread Anthony Liguori

Matthew Tippett wrote:
> Okay, bringing the leafs of the discussions onto this thread.
>
> As per
>
> http://www.phoronix.com/scan.php?page=article&item=linux_2631_kvm&num=1&single=1
>
> The host OS (as well as the guest OS when testing under KVM) was
> running an Ubuntu 9.10 daily snapshot with the Linux 2.6.31 (final)
> kernel
>
> I am attempting to get the actual daily snapshot to provide the
> precise version.  I should have that information shortly.  It is
> likely that it was within 1-2 weeks prior to the article posting.
>
> > Ubuntu's Karmic release has _not_ been released yet.  For this
> > particular test, Phoronix was probably using an alpha drop before
> > Ubuntu switched from kvm-84 to qemu-kvm-0.11.0.
>
> The "probably" was described above - it was a snapshot after the
> 2.6.31 final as September 9th, the article was published on September
> 21st, so there is a finite window.
>
> I have high confidence in the testing that Phoronix has done and don't
> expect to need to confirm the results explicitly,

Your confidence is misplaced apparently.

> and I have pieced together the following information.  I should be
> able to get the actual daily build number but broadly it looks like it
> was
>
>   Ubuntu 9.10 daily snapshot (~ 9th - 21st September)
>   Linux 2.6.31 (packaged as 2.6.31-10.30 to 2.6.31-10.32)
>   qemu-kvm 0.11 (packaged as 0.11.0~rc2-0ubuntu to 0.11.0~rc2-0ubuntu5)

That's extremely unlikely.

> But, if it turned out to be Ubuntu 9.10, linux 2.6.31, qemu-kvm 0.11
> would there be any concerns?

It's not relevant because it's not qemu-kvm-0.11.

Regards,

Anthony Liguori


[PATCH v4: kvm 1/4] Code motion. Separate timer initialization into an independent function.

2009-09-29 Thread Zachary Amsden
Signed-off-by: Zachary Amsden zams...@redhat.com
---
 arch/x86/kvm/x86.c |   23 +++
 1 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fedac9d..15d2ace 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3116,9 +3116,22 @@ static struct notifier_block kvmclock_cpufreq_notifier_block = {
 .notifier_call  = kvmclock_cpufreq_notifier
 };
 
+static void kvm_timer_init(void)
+{
+   int cpu;
+
+   for_each_possible_cpu(cpu)
+   per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
+   if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
+   tsc_khz_ref = tsc_khz;
+   cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
+ CPUFREQ_TRANSITION_NOTIFIER);
+   }
+}
+
 int kvm_arch_init(void *opaque)
 {
-   int r, cpu;
+   int r;
struct kvm_x86_ops *ops = (struct kvm_x86_ops *)opaque;
 
if (kvm_x86_ops) {
@@ -3150,13 +3163,7 @@ int kvm_arch_init(void *opaque)
kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK,
PT_DIRTY_MASK, PT64_NX_MASK, 0);
 
-   for_each_possible_cpu(cpu)
-   per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
-   if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
-   tsc_khz_ref = tsc_khz;
-   cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
- CPUFREQ_TRANSITION_NOTIFIER);
-   }
+   kvm_timer_init();
 
return 0;
 
-- 
1.6.4.4



[PATCH v4: kvm 2/4] Kill the confusing tsc_ref_khz and ref_freq variables.

2009-09-29 Thread Zachary Amsden
They are globals, not clearly protected by any ordering or locking, and
vulnerable to various startup races.

Instead, for variable TSC machines, register the cpufreq notifier and get
the TSC frequency directly from the cpufreq machinery.  Not only is it
always right, it is also perfectly accurate, as no error prone measurement
is required.

On such machines, when a new CPU is brought online, it isn't clear what
frequency it will start with, and it may not correspond to the reference, thus
in hardware_enable we clear the cpu_tsc_khz variable to zero and make sure
it is set before running on a VCPU.

Signed-off-by: Zachary Amsden zams...@redhat.com
---
 arch/x86/kvm/x86.c |   26 --
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 15d2ace..de4ce8f 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1326,6 +1326,8 @@ out:
 void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
kvm_x86_ops->vcpu_load(vcpu, cpu);
+   if (unlikely(per_cpu(cpu_tsc_khz, cpu) == 0))
+   per_cpu(cpu_tsc_khz, cpu) = cpufreq_quick_get(cpu);
kvm_request_guest_time_update(vcpu);
 }
 
@@ -3061,9 +3063,6 @@ static void bounce_off(void *info)
/* nothing */
 }
 
-static unsigned int  ref_freq;
-static unsigned long tsc_khz_ref;
-
 static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
  void *data)
 {
@@ -3072,14 +3071,11 @@ static int kvmclock_cpufreq_notifier(struct notifier_block *nb, unsigned long va
struct kvm_vcpu *vcpu;
int i, send_ipi = 0;
 
-   if (!ref_freq)
-   ref_freq = freq->old;
-
if (val == CPUFREQ_PRECHANGE && freq->old > freq->new)
return 0;
if (val == CPUFREQ_POSTCHANGE && freq->old < freq->new)
return 0;
-   per_cpu(cpu_tsc_khz, freq->cpu) = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new);
+   per_cpu(cpu_tsc_khz, freq->cpu) = freq->new;
 
spin_lock(&kvm_lock);
list_for_each_entry(kvm, &vm_list, vm_list) {
@@ -3120,12 +3116,14 @@ static void kvm_timer_init(void)
 {
int cpu;
 
-   for_each_possible_cpu(cpu)
-   per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
-   tsc_khz_ref = tsc_khz;
cpufreq_register_notifier(&kvmclock_cpufreq_notifier_block,
  CPUFREQ_TRANSITION_NOTIFIER);
+   for_each_online_cpu(cpu)
+   per_cpu(cpu_tsc_khz, cpu) = cpufreq_get(cpu);
+   } else {
+   for_each_possible_cpu(cpu)
+   per_cpu(cpu_tsc_khz, cpu) = tsc_khz;
}
 }
 
@@ -4698,6 +4696,14 @@ int kvm_arch_vcpu_reset(struct kvm_vcpu *vcpu)
 
 int kvm_arch_hardware_enable(void *garbage)
 {
+   /*
+* Since this may be called from a hotplug notification,
+* we can't get the CPU frequency directly.
+*/
+   if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC)) {
+   int cpu = raw_smp_processor_id();
+   per_cpu(cpu_tsc_khz, cpu) = 0;
+   }
return kvm_x86_ops->hardware_enable(garbage);
 }
 
-- 
1.6.4.4

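
The "clear on enable, fill in lazily on load" scheme described in the 2/4 commit message above can be modelled in a few lines. This is a toy Python model, not kernel code: PerCpuTscKhz and fake_query are hypothetical names, and fake_query merely stands in for the role cpufreq_quick_get() plays in the patch.

```python
class PerCpuTscKhz:
    """Toy model of the per-CPU cpu_tsc_khz bookkeeping (not kernel code)."""

    def __init__(self, query_khz):
        self._khz = {}
        # Hypothetical stand-in for asking cpufreq for a CPU's frequency.
        self._query_khz = query_khz

    def hardware_enable(self, cpu):
        # A freshly onlined CPU has an unknown frequency: mark it as 0.
        self._khz[cpu] = 0

    def vcpu_load(self, cpu):
        # Before a VCPU runs on this CPU, make sure the value is set.
        if self._khz.get(cpu, 0) == 0:
            self._khz[cpu] = self._query_khz(cpu)
        return self._khz[cpu]

    def cpufreq_notify(self, cpu, new_khz):
        # The notifier stores the new frequency directly, with no
        # scaling against a separately measured reference value.
        self._khz[cpu] = new_khz


queries = []


def fake_query(cpu):
    queries.append(cpu)
    return 2400000


table = PerCpuTscKhz(fake_query)
table.hardware_enable(3)
first = table.vcpu_load(3)      # triggers the one and only query
second = table.vcpu_load(3)     # served from the stored value
table.cpufreq_notify(3, 1600000)
third = table.vcpu_load(3)      # reflects the notifier update
```

The point of the model is that no global reference frequency is needed: each CPU's entry is either zero (unknown), filled in on first use, or overwritten by the notifier.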


[PATCH v4: kvm 3/4] Fix printk name error in svm.c

2009-09-29 Thread Zachary Amsden
Signed-off-by: Zachary Amsden zams...@redhat.com
---
 arch/x86/kvm/svm.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 9a4daca..d1036ce 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -330,13 +330,14 @@ static int svm_hardware_enable(void *garbage)
return -EBUSY;
 
if (!has_svm()) {
-   printk(KERN_ERR "svm_cpu_init: err EOPNOTSUPP on %d\n", me);
+   printk(KERN_ERR "svm_hardware_enable: err EOPNOTSUPP on %d\n",
+  me);
return -EINVAL;
}
svm_data = per_cpu(svm_data, me);
 
if (!svm_data) {
-   printk(KERN_ERR "svm_cpu_init: svm_data is NULL on %d\n",
+   printk(KERN_ERR "svm_hardware_enable: svm_data is NULL on %d\n",
   me);
return -EINVAL;
}
-- 
1.6.4.4



[PATCH v4: kvm 4/4] Fix hotplug of CPUs for KVM.

2009-09-29 Thread Zachary Amsden
Both VMX and SVM require per-cpu memory allocation, which is done at module
init time, for only online cpus.

The backends were not allocating these structures for all possible CPUs, so
new CPUs coming online could not be hardware enabled.

Signed-off-by: Zachary Amsden zams...@redhat.com
---
 arch/x86/kvm/svm.c |4 ++--
 arch/x86/kvm/vmx.c |6 --
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index d1036ce..02a4269 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -482,7 +482,7 @@ static __init int svm_hardware_setup(void)
kvm_enable_efer_bits(EFER_SVME);
}
 
-   for_each_online_cpu(cpu) {
+   for_each_possible_cpu(cpu) {
r = svm_cpu_init(cpu);
if (r)
goto err;
@@ -516,7 +516,7 @@ static __exit void svm_hardware_unsetup(void)
 {
int cpu;
 
-   for_each_online_cpu(cpu)
+   for_each_possible_cpu(cpu)
svm_cpu_uninit(cpu);
 
__free_pages(pfn_to_page(iopm_base >> PAGE_SHIFT), IOPM_ALLOC_ORDER);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3fe0d42..e86f1a6 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1350,15 +1350,17 @@ static void free_kvm_area(void)
 {
int cpu;
 
-   for_each_online_cpu(cpu)
+   for_each_possible_cpu(cpu) {
free_vmcs(per_cpu(vmxarea, cpu));
+   per_cpu(vmxarea, cpu) = NULL;
+   }
 }
 
 static __init int alloc_kvm_area(void)
 {
int cpu;
 
-   for_each_online_cpu(cpu) {
+   for_each_possible_cpu(cpu) {
struct vmcs *vmcs;
 
vmcs = alloc_vmcs_cpu(cpu);
-- 
1.6.4.4

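
The failure fixed by the 4/4 patch above comes down to which set of CPUs is iterated at init time. The following toy Python model (not kernel code; all names are hypothetical) shows why allocating only for online CPUs breaks a later hotplug.

```python
def alloc_per_cpu_areas(cpus):
    # Stand-in for the per-CPU allocations done once at module init.
    return {cpu: bytearray(16) for cpu in cpus}


def hardware_enable(areas, cpu):
    # Enabling virtualization on a CPU needs that CPU's area to exist.
    return areas.get(cpu) is not None


possible_cpus = [0, 1, 2, 3]   # everything that could ever come online
online_at_init = [0, 1]        # what happened to be online at module load

areas_old = alloc_per_cpu_areas(online_at_init)   # for_each_online_cpu
areas_new = alloc_per_cpu_areas(possible_cpus)    # for_each_possible_cpu

hotplugged = 2
old_ok = hardware_enable(areas_old, hotplugged)
new_ok = hardware_enable(areas_new, hotplugged)
```

The cost of the fix is some memory for CPUs that never come online; the benefit is that the hotplug path needs no allocation of its own.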


Re: [ANNOUNCE] qemu-kvm-0.11.0 released

2009-09-29 Thread Dustin Kirkland
On Sun, Sep 27, 2009 at 2:42 AM, Avi Kivity a...@redhat.com wrote:
> qemu-kvm-0.11.0 is now available.  This release is based on the upstream
> qemu 0.11.0, plus kvm-specific enhancements.

Thanks, Avi.

We in Ubuntu have tracked each of the two previous RC's, and we will
have this GA version in Karmic within a day or so.  We're looking
forward to following the stable branch as opposed to the kvm-NN
snapshots we've traditionally tracked.

:-Dustin


Re: [ANNOUNCE] qemu-kvm-0.11.0 released

2009-09-29 Thread Michael Tokarev

Dustin Kirkland wrote:

> On Sun, Sep 27, 2009 at 2:42 AM, Avi Kivity a...@redhat.com wrote:
> > qemu-kvm-0.11.0 is now available.  This release is based on the upstream
> > qemu 0.11.0, plus kvm-specific enhancements.
>
> Thanks, Avi.
>
> We in Ubuntu have tracked each of the two previous RC's, and we will
> have this GA version in Karmic within a day or so.  We're looking
> forward to following the stable branch as opposed to the kvm-NN
> snapshots we've traditionally tracked.


By the way, I have maintained Debian kvm packages for quite some time
(because debian does not have up-to-date kvm), all are available
here -- http://www.corpit.ru/debian/tls/kvm/.  Anything wrong with
those, have you seen them?  Just... asking :)

/mjt


[PATCH] Fix last 2 K&R prototypes

2009-09-29 Thread Juan Quintela
The rest of the cases are already fixed in upstream qemu.

Signed-off-by: Juan Quintela quint...@redhat.com
---
 hw/device-assignment.c |2 +-
 qemu-kvm.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/device-assignment.c b/hw/device-assignment.c
index 46e6471..17d68be 100644
--- a/hw/device-assignment.c
+++ b/hw/device-assignment.c
@@ -748,7 +748,7 @@ AssignedDevInfo *get_assigned_device(int pcibus, int slot)
 /* The pci config space got updated. Check if irq numbers have changed
  * for our devices
  */
-void assigned_dev_update_irqs()
+void assigned_dev_update_irqs(void)
 {
 AssignedDevInfo *adev;

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 6da41d1..5a07156 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -2232,7 +2232,7 @@ int kvm_arch_init_irq_routing(void)

 extern int no_hpet;

-static int kvm_create_context()
+static int kvm_create_context(void)
 {
 int r;

-- 
1.6.2.5



Re: [ANNOUNCE] qemu-kvm-0.11.0 released

2009-09-29 Thread Dustin Kirkland
On Wed, 2009-09-30 at 01:48 +0400, Michael Tokarev wrote:
> Dustin Kirkland wrote:
> > On Sun, Sep 27, 2009 at 2:42 AM, Avi Kivity a...@redhat.com wrote:
> > > qemu-kvm-0.11.0 is now available.  This release is based on the upstream
> > > qemu 0.11.0, plus kvm-specific enhancements.
> >
> > Thanks, Avi.
> >
> > We in Ubuntu have tracked each of the two previous RC's, and we will
> > have this GA version in Karmic within a day or so.  We're looking
> > forward to following the stable branch as opposed to the kvm-NN
> > snapshots we've traditionally tracked.
>
> By the way, I have maintained Debian kvm packages for quite some time
> (because debian does not have up-to-date kvm), all are available
> here -- http://www.corpit.ru/debian/tls/kvm/.  Anything wrong with
> those, have you seen them?  Just... asking :)

Hmm, no, I haven't seen those.  I looked for Debian qemu-kvm packages
at:
 * http://packages.qa.debian.org/q/qemu-kvm.html
 * http://mentors.debian.net/cgi-bin/welcome

Didn't find any, and so we decided to roll our own to ensure that we got
qemu-kvm into Karmic.

:-Dustin




Release plan for 0.12.0

2009-09-29 Thread Anthony Liguori

Hi,

Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.

I'd like to do a few things different this time around.  I don't think 
the -rc process went very well as I don't think we got more testing out 
of it.  I'd like to shorten the timeline for 0.12.0 a good bit.  The 
0.10 stable tree got pretty difficult to maintain toward the end of the 
cycle.  We also had a pretty huge amount of change between 0.10 and 0.11 
so I think a shorter cycle is warranted.


I think aiming for early to mid-December would give us roughly a 3 month 
cycle and would align well with some of the Linux distribution cycles.  
I'd like to limit things to a single -rc that lasted only for about a 
week.  This is enough time to fix most of the obvious issues I think.


I'd also like to try to enumerate some features for this release.  
Here's a short list of things I expect to see for this release 
(target-i386 centric).  Please add or comment on items that you'd either 
like to see in the release or are planning on working on.


o VMState conversion -- I expect most of the pc target to be completed
o qdev conversion -- I hope that we'll get most of the pc target
  completely converted to qdev
o storage live migration
o switch to SeaBIOS (need to finish porting features from Bochs)
o switch to gPXE (need to resolve slirp tftp server issue)
o KSM integration
o in-kernel APIC support for KVM
o guest SMP support for KVM
o updates to the default pc machine type

Please add to this list and I'll collect it all and post it somewhere.

Thanks!

--
Regards,

Anthony Liguori



Re: Release plan for 0.12.0

2009-09-29 Thread Dustin Kirkland
On Tue, Sep 29, 2009 at 6:54 PM, Anthony Liguori aligu...@us.ibm.com wrote:
> Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.
>
> I'd like to do a few things different this time around.  I don't think the
> -rc process went very well as I don't think we got more testing out of it.
> I'd like to shorten the timeline for 0.12.0 a good bit.  The 0.10 stable
> tree got pretty difficult to maintain toward the end of the cycle.  We also
> had a pretty huge amount of change between 0.10 and 0.11 so I think a
> shorter cycle is warranted.
>
> I think aiming for early to mid-December would give us roughly a 3 month
> cycle and would align well with some of the Linux distribution cycles.  I'd
> like to limit things to a single -rc that lasted only for about a week.
> This is enough time to fix most of the obvious issues I think.

As a downstream packager of qemu-kvm, I thought I'd mention that the
next Ubuntu cycle is now public:
 * https://wiki.ubuntu.com/LucidReleaseSchedule

The key date here is Feature Freeze, which is February 25, 2010.
That's the point by which we'd need to have a new qemu-kvm (which of
course is downstream of qemu) package in Ubuntu for the LTS 10.04
release in April 2010.

I'll gladly track the release candidate(s) in the Lucid development
tree, and hopefully pull 0.12 as soon as it's available.

And we also provide daily snapshots of qemu builds at:
 * https://edge.launchpad.net/~ubuntu-virt/+archive/virt-daily-upstream

:-Dustin


Re: [PATCH] [RESEND] KVM:VMX: Add support for Pause-Loop Exiting

2009-09-29 Thread Zhai, Edwin

Avi,
I modified it according to your comments. The only thing I want to keep is
the module params ple_gap/ple_window.  Although they are not per-guest, they
can be used to find the right values and to disable PLE for debugging.


Thanks,


Avi Kivity wrote:
> On 09/28/2009 11:33 AM, Zhai, Edwin wrote:
> > Avi Kivity wrote:
> > > > +#define KVM_VMX_DEFAULT_PLE_GAP    41
> > > > +#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
> > > > +static int __read_mostly ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
> > > > +module_param(ple_gap, int, S_IRUGO);
> > > > +
> > > > +static int __read_mostly ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
> > > > +module_param(ple_window, int, S_IRUGO);
> > >
> > > Shouldn't be __read_mostly since they're read very rarely
> > > (__read_mostly should be for variables that are very often read, and
> > > rarely written).
> >
> > In general, they are read only except that experienced user may try
> > different parameter for perf tuning.
>
> __read_mostly doesn't just mean it's read mostly.  It also means it's
> read often.  Otherwise it's just wasting space in hot cachelines.
>
> > > I'm not even sure they should be parameters.
> >
> > For different spinlock in different OS, and for different workloads,
> > we need different parameter for tuning. It's similar as the enable_ept.
>
> No, global parameters don't work for tuning workloads and guests since
> they cannot be modified on a per-guest basis.  enable_ept is only useful
> for debugging and testing.
>
> > > > +set_current_state(TASK_INTERRUPTIBLE);
> > > > +schedule_hrtimeout(expires, HRTIMER_MODE_ABS);
> > > > +
> > >
> > > Please add a tracepoint for this (since it can cause significant
> > > change in behaviour),
> >
> > Isn't trace_kvm_exit(exit_reason, ...) enough? We can tell the PLE
> > vmexit from other vmexits.
>
> Right.  I thought of the software spinlock detector, but that's another
> problem.
>
> I think you can drop the sleep_time parameter, it can be part of the
> function.  Also kvm_vcpu_sleep() is confusing, we also sleep on halt.
> Please call it kvm_vcpu_on_spin() or something (since that's what the
> guest is doing).


kvm_ple_hrtimer_v3.patch
Description: Binary data
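
[Editorial sketch, not part of the thread.] Avi's objection above is that module parameters are global, so one value applies to every guest. One way to picture the per-guest alternative is a per-VM structure seeded from the defaults in the patch. The structure and helper names below are hypothetical; neither the patch nor KVM provides them, and this is plain userspace C rather than kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Defaults taken from the quoted patch. */
#define KVM_VMX_DEFAULT_PLE_GAP    41
#define KVM_VMX_DEFAULT_PLE_WINDOW 4096

/* Hypothetical per-guest tunables; not a real KVM structure. */
struct vm_ple_sketch {
    int ple_gap;
    int ple_window;
};

/* Seed one guest's values from the global defaults. */
static void vm_ple_init(struct vm_ple_sketch *vm)
{
    assert(vm != NULL);
    vm->ple_gap = KVM_VMX_DEFAULT_PLE_GAP;
    vm->ple_window = KVM_VMX_DEFAULT_PLE_WINDOW;
}

/* Change one guest's window without touching any other guest. */
static void vm_ple_set_window(struct vm_ple_sketch *vm, int window)
{
    assert(vm != NULL && window > 0);
    vm->ple_window = window;
}
```

With two such structures, tuning one guest leaves the other at the default, which a single module parameter cannot express.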


Re: Release plan for 0.12.0

2009-09-29 Thread Anthony Liguori

Dustin Kirkland wrote:

> On Tue, Sep 29, 2009 at 6:54 PM, Anthony Liguori aligu...@us.ibm.com wrote:
> > Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.
> >
> > I'd like to do a few things different this time around.  I don't think the
> > -rc process went very well as I don't think we got more testing out of it.
> > I'd like to shorten the timeline for 0.12.0 a good bit.  The 0.10 stable
> > tree got pretty difficult to maintain toward the end of the cycle.  We also
> > had a pretty huge amount of change between 0.10 and 0.11 so I think a
> > shorter cycle is warranted.
> >
> > I think aiming for early to mid-December would give us roughly a 3 month
> > cycle and would align well with some of the Linux distribution cycles.  I'd
> > like to limit things to a single -rc that lasted only for about a week.
> > This is enough time to fix most of the obvious issues I think.
>
> As a downstream packager of qemu-kvm, I thought I'd mention that the
> next Ubuntu cycle is now public:
>  * https://wiki.ubuntu.com/LucidReleaseSchedule
>
> The key date here is Feature Freeze, which is February 25, 2010.
> That's the point by which we'd need to have a new qemu-kvm (which of
> course is downstream of qemu) package in Ubuntu for the LTS 10.04
> release in April 2010.


If we did a December release, then the 0.13 release would probably be in 
the April time frame.  Not really ideal for Lucid so I'd recommend that 
you ship 0.12.  The good news would be that 0.12 should be very stable 
by Feb 25th and since Lucid is an LTS, that's probably a Good Thing.


--
Regards,

Anthony Liguori



kvm or qemu-kvm?

2009-09-29 Thread Ross Boylan
http://www.linux-kvm.org/page/HOWTO1 says to build kvm I should get the
latest kvm-release.tar.gz.

http://www.linux-kvm.org/page/Downloads says "If you want to use the
latest version of KVM kernel modules and supporting userspace, you can
download the latest version from
http://sourceforge.net/project/showfiles.php?group_id=180599."
That page shows the latest version is qemu-kvm-0.11.0.tar.gz.

The most recent kvm-release.tar.gz appears to be for kvm-88.

So which file should I start from?

Thanks.
Ross Boylan




Re: [Qemu-devel] Release plan for 0.12.0

2009-09-29 Thread Isaku Yamahata
On Tue, Sep 29, 2009 at 06:54:53PM -0500, Anthony Liguori wrote:
> Hi,
>
> Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.
>
> I'd like to do a few things different this time around.  I don't think
> the -rc process went very well as I don't think we got more testing out
> of it.  I'd like to shorten the timeline for 0.12.0 a good bit.  The
> 0.10 stable tree got pretty difficult to maintain toward the end of the
> cycle.  We also had a pretty huge amount of change between 0.10 and 0.11
> so I think a shorter cycle is warranted.
>
> I think aiming for early to mid-December would give us roughly a 3 month
> cycle and would align well with some of the Linux distribution cycles.
> I'd like to limit things to a single -rc that lasted only for about a
> week.  This is enough time to fix most of the obvious issues I think.
>
> I'd also like to try to enumerate some features for this release.
> Here's a short list of things I expect to see for this release
> (target-i386 centric).  Please add or comment on items that you'd either
> like to see in the release or are planning on working on.
>
> o VMState conversion -- I expect most of the pc target to be completed
> o qdev conversion -- I hope that we'll get most of the pc target
> completely converted to qdev
> o storage live migration
> o switch to SeaBIOS (need to finish porting features from Bochs)
> o switch to gPXE (need to resolve slirp tftp server issue)
> o KSM integration
> o in-kernel APIC support for KVM
> o guest SMP support for KVM
> o updates to the default pc machine type
>
> Please add to this list and I'll collect it all and post it somewhere.

 o newer chipset (which is based on Q35 chipset)
 o multiple pci bus 
 o PCI express (MMCONFIG)
 o PCI express hot plug (not acpi based)
 o PCI express switch emulator

Although there is no PCIe emulated device at the moment, 
this will be a fundamental infrastructure for PCI express native
direct attach.

thanks,
-- 
yamahata


Re: migrate_set_downtime bug

2009-09-29 Thread Glauber Costa
On Tue, Sep 29, 2009 at 06:36:57PM +0200, Dietmar Maurer wrote:
> > Also, if this is really the case (buffered), then the bandwidth capping part
> > of migration is also wrong.
> >
> > Have you compared the reported bandwidth to your actual bandwidth? I suspect
> > the source of the problem can be that we're currently ignoring the time we take
> > to transfer the state of the devices, and maybe it is not negligible.
>
> I have a 1GB network (e1000 card), and get values like bwidth=0.98 - which is
> much too high.
The main reason for not using the whole migration time is that it can lead to
values that are not very helpful in situations where the network load changes
too much.

Since the problem you pinpointed does exist, I would suggest measuring the
average load of the last, say, 10 iterations. How would that work for you?
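
[Editorial sketch, not part of the thread.] The suggestion to average the load over the last, say, 10 iterations can be pictured as a small ring buffer with a running sum. This is a minimal userspace illustration only; the type and function names are hypothetical and nothing here is taken from QEMU's migration code.

```c
#include <assert.h>
#include <stddef.h>

#define BW_WINDOW 10  /* "the last, say, 10 iterations" */

/* Hypothetical helper type, not part of QEMU. */
struct bw_window {
    double samples[BW_WINDOW]; /* most recent per-iteration bandwidths */
    size_t count;              /* number of valid samples, at most BW_WINDOW */
    size_t next;               /* slot the next sample overwrites */
    double sum;                /* running sum of the valid samples */
};

static void bw_window_init(struct bw_window *w)
{
    assert(w != NULL);
    w->count = 0;
    w->next = 0;
    w->sum = 0.0;
}

/* Record one iteration's measured bandwidth; return the windowed average. */
static double bw_window_add(struct bw_window *w, double bwidth)
{
    assert(w != NULL);
    if (w->count == BW_WINDOW)
        w->sum -= w->samples[w->next];  /* drop the oldest sample */
    else
        w->count++;
    w->samples[w->next] = bwidth;
    w->sum += bwidth;
    w->next = (w->next + 1) % BW_WINDOW;
    return w->sum / (double)w->count;
}
```

A window like this reacts to changing network load within ten iterations, while a single outlier is damped instead of deciding the downtime estimate on its own.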


Re: kvm tuning guide

2009-09-29 Thread Nikola Ciprich
The default, IDE, is highly supported by guests but may be slow, especially 
with disk arrays. If your guest supports it, use the virtio interface:
Avi,
what is the status of data integrity issues Chris Hellwig summarized some time 
ago?
Is it safe to recommend virtio to newbies already? Shouldn't SCSI
be safer (where applicable)?
nik


On Tue, Sep 29, 2009 at 07:30:55PM +0200, Avi Kivity wrote:
> I wrote a short tuning guide for kvm,
> http://www.linux-kvm.org/page/Tuning_KVM.  It should all be well known
> to the list, but a newbie is born every minute.  Please review and
> expand!
>
> --
> error compiling committee.c: too many arguments to function



-- 
-
Nikola CIPRICH
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:   +420 596 603 142
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-


Re: Release plan for 0.12.0

2009-09-29 Thread Amit Shah
On (Tue) Sep 29 2009 [18:54:53], Anthony Liguori wrote:
> Hi,
>
> Now that 0.11.0 is behind us, it's time to start thinking about 0.12.0.
>
> I'd like to do a few things different this time around.  I don't think
> the -rc process went very well as I don't think we got more testing out
> of it.  I'd like to shorten the timeline for 0.12.0 a good bit.  The
> 0.10 stable tree got pretty difficult to maintain toward the end of the
> cycle.  We also had a pretty huge amount of change between 0.10 and 0.11
> so I think a shorter cycle is warranted.
>
> I think aiming for early to mid-December would give us roughly a 3 month
> cycle and would align well with some of the Linux distribution cycles.
> I'd like to limit things to a single -rc that lasted only for about a
> week.  This is enough time to fix most of the obvious issues I think.
>
> I'd also like to try to enumerate some features for this release.
> Here's a short list of things I expect to see for this release
> (target-i386 centric).  Please add or comment on items that you'd either
> like to see in the release or are planning on working on.
>
> o VMState conversion -- I expect most of the pc target to be completed
> o qdev conversion -- I hope that we'll get most of the pc target
> completely converted to qdev
> o storage live migration
> o switch to SeaBIOS (need to finish porting features from Bochs)
> o switch to gPXE (need to resolve slirp tftp server issue)
> o KSM integration
> o in-kernel APIC support for KVM
> o guest SMP support for KVM
> o updates to the default pc machine type

  o multiport virtio-console support

> Please add to this list and I'll collect it all and post it somewhere.

Thanks,
Amit


[PATCH 02/27] Pass PVR in sregs

2009-09-29 Thread Alexander Graf
Right now sregs is unused on PPC, so we can use it for initialization
of the CPU.

KVM on BookE always virtualizes the host CPU. On Book3s we go a step further
and take the PVR from userspace that tells us what kind of CPU we are supposed
to virtualize, because we support Book3s_32 and Book3s_64 guests.

In order to get that information, we use the sregs ioctl, because we don't
want to reset the guest CPU on every normal register set.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm.h b/arch/powerpc/include/asm/kvm.h
index bb2de6a..b82bd68 100644
--- a/arch/powerpc/include/asm/kvm.h
+++ b/arch/powerpc/include/asm/kvm.h
@@ -46,6 +46,8 @@ struct kvm_regs {
 };
 
 struct kvm_sregs {
+   __u64 pvr;
+   char pad[1016];
 };
 
 struct kvm_fpu {
-- 
1.6.0.2

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
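
[Editorial sketch, not part of the patch.] The two added fields keep the structure at a round size: 8 bytes of PVR plus 1016 bytes of padding make 1024 bytes. The userspace mirror below checks that arithmetic at compile time and shows how a caller might fill the structure before handing it to the sregs ioctl. The struct and helper names are hypothetical, and the kernel header uses __u64 rather than uint64_t.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Userspace mirror of the layout added by the patch. */
struct sregs_sketch {
    uint64_t pvr;
    char pad[1016];
};

/* 8 + 1016 bytes: the structure is exactly 1 KiB. */
_Static_assert(sizeof(struct sregs_sketch) == 1024,
               "sregs sketch must be 1024 bytes");

/* Hypothetical helper: zero the padding and store the requested PVR. */
static void sregs_sketch_set_pvr(struct sregs_sketch *sregs, uint32_t pvr)
{
    assert(sregs != NULL);
    memset(sregs, 0, sizeof(*sregs));
    sregs->pvr = pvr;
}
```

Reserving the padding up front presumably lets later fields be added without changing the structure size that userspace and kernel agree on.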


[PATCH 04/27] Add Book3s fields to vcpu structs

2009-09-29 Thread Alexander Graf
We need to store more information than we currently have for vcpus
when running on Book3s.

So let's extend the internal struct definitions.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - use context_id instead of mm_context
---
 arch/powerpc/include/asm/kvm_host.h |   75 ++-
 1 files changed, 74 insertions(+), 1 deletions(-)

diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index c9c930e..8422027 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -37,6 +37,8 @@
 #define KVM_NR_PAGE_SIZES  1
 #define KVM_PAGES_PER_HPAGE(x) (1UL<<31)
 
+#define HPTEG_CACHE_NUM 1024
+
 struct kvm;
 struct kvm_run;
 struct kvm_vcpu;
@@ -63,6 +65,17 @@ struct kvm_vcpu_stat {
u32 dec_exits;
u32 ext_intr_exits;
u32 halt_wakeup;
+#ifdef CONFIG_PPC64
+   u32 pf_storage;
+   u32 pf_instruc;
+   u32 sp_storage;
+   u32 sp_instruc;
+   u32 queue_intr;
+   u32 ld;
+   u32 ld_slow;
+   u32 st;
+   u32 st_slow;
+#endif
 };
 
 enum kvm_exit_types {
@@ -109,9 +122,53 @@ struct kvmppc_exit_timing {
 struct kvm_arch {
 };
 
+struct kvmppc_pte {
+   u64 eaddr;
+   u64 vpage;
+   u64 raddr;
+   bool may_read;
+   bool may_write;
+   bool may_execute;
+};
+
+struct kvmppc_mmu {
+   /* book3s_64 only */
+   void (*slbmte)(struct kvm_vcpu *vcpu, u64 rb, u64 rs);
+   u64  (*slbmfee)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   u64  (*slbmfev)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   void (*slbie)(struct kvm_vcpu *vcpu, u64 slb_nr);
+   void (*slbia)(struct kvm_vcpu *vcpu);
+   /* book3s */
+   void (*mtsrin)(struct kvm_vcpu *vcpu, u32 srnum, ulong value);
+   u32  (*mfsrin)(struct kvm_vcpu *vcpu, u32 srnum);
+   int  (*xlate)(struct kvm_vcpu *vcpu, gva_t eaddr, struct kvmppc_pte *pte, bool data);
+   void (*reset_msr)(struct kvm_vcpu *vcpu);
+   void (*tlbie)(struct kvm_vcpu *vcpu, ulong addr, bool large);
+   int  (*esid_to_vsid)(struct kvm_vcpu *vcpu, u64 esid, u64 *vsid);
+   u64  (*ea_to_vp)(struct kvm_vcpu *vcpu, gva_t eaddr, bool data);
+   bool (*is_dcbz32)(struct kvm_vcpu *vcpu);
+};
+
+struct hpte_cache {
+   u64 host_va;
+   u64 pfn;
+   ulong slot;
+   struct kvmppc_pte pte;
+};
+
 struct kvm_vcpu_arch {
-   u32 host_stack;
+   ulong host_stack;
u32 host_pid;
+#ifdef CONFIG_PPC64
+   ulong host_msr;
+   ulong host_r2;
+   void *host_retip;
+   ulong trampoline_lowmem;
+   ulong trampoline_enter;
+   ulong highmem_handler;
+   ulong host_paca_phys;
+   struct kvmppc_mmu mmu;
+#endif
 
u64 fpr[32];
ulong gpr[32];
@@ -123,6 +180,10 @@ struct kvm_vcpu_arch {
ulong xer;
 
ulong msr;
+#ifdef CONFIG_PPC64
+   ulong shadow_msr;
+   ulong hflags;
+#endif
u32 mmucr;
ulong sprg0;
ulong sprg1;
@@ -149,6 +210,9 @@ struct kvm_vcpu_arch {
u32 ivor[64];
ulong ivpr;
u32 pir;
+#ifdef CONFIG_PPC64
+   u32 pvr;
+#endif
 
u32 shadow_pid;
u32 pid;
@@ -174,6 +238,9 @@ struct kvm_vcpu_arch {
 #endif
 
u32 last_inst;
+#ifdef CONFIG_PPC64
+   ulong fault_dsisr;
+#endif
ulong fault_dear;
ulong fault_esr;
gpa_t paddr_accessed;
@@ -186,7 +253,13 @@ struct kvm_vcpu_arch {
u32 cpr0_cfgaddr; /* holds the last set cpr0_cfgaddr */
 
struct timer_list dec_timer;
+   u64 dec_jiffies;
unsigned long pending_exceptions;
+
+#ifdef CONFIG_PPC64
+   struct hpte_cache hpte_cache[HPTEG_CACHE_NUM];
+   int hpte_cache_offset;
+#endif
 };
 
 #endif /* __POWERPC_KVM_HOST_H__ */
-- 
1.6.0.2
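
[Editorial sketch, not part of the patch.] struct kvmppc_mmu is a table of function pointers, with some hooks marked "book3s_64 only". The cut-down userspace model below shows how such a table lets 32-bit and 64-bit guest MMUs share callers. All names, the two-hook subset, and the MSR placeholder value are assumptions made for illustration; the real hook implementations live in later patches of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vcpu_stub;

/* Cut-down stand-in for struct kvmppc_mmu: two hooks only. */
struct mmu_ops_sketch {
    void (*slbia)(struct vcpu_stub *vcpu);      /* book3s_64 only */
    void (*reset_msr)(struct vcpu_stub *vcpu);  /* book3s */
};

struct vcpu_stub {
    struct mmu_ops_sketch mmu;
    uint64_t msr;
    int slb_flushes;
};

/* Placeholder value for illustration, not the MSR the real code loads. */
#define SKETCH_MSR_64 0x1ULL

static void slbia_64(struct vcpu_stub *vcpu)     { vcpu->slb_flushes++; }
static void reset_msr_64(struct vcpu_stub *vcpu) { vcpu->msr = SKETCH_MSR_64; }
static void reset_msr_32(struct vcpu_stub *vcpu) { vcpu->msr = 0; }

static void mmu_init_64_sketch(struct vcpu_stub *vcpu)
{
    assert(vcpu != NULL);
    vcpu->mmu.slbia = slbia_64;
    vcpu->mmu.reset_msr = reset_msr_64;
}

static void mmu_init_32_sketch(struct vcpu_stub *vcpu)
{
    assert(vcpu != NULL);
    vcpu->mmu.slbia = NULL;  /* assumed unset: the hook is 64-bit only */
    vcpu->mmu.reset_msr = reset_msr_32;
}

/* Callers check a 64-bit-only hook before using it. */
static void flush_slb_if_present(struct vcpu_stub *vcpu)
{
    if (vcpu->mmu.slbia)
        vcpu->mmu.slbia(vcpu);
}
```

The generic code then calls through vcpu->mmu without knowing which guest MMU is active.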



[PATCH 01/27] Move dirty logging code to sub-arch

2009-09-29 Thread Alexander Graf
PowerPC code handles dirty logging in the generic parts atm. While this
is great for return -ENOTSUPP, we need to be rather target specific
when actually implementing it.

So let's split it to implementation specific code, so we can implement
it for book3s.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/booke.c   |5 +
 arch/powerpc/kvm/powerpc.c |5 -
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/booke.c b/arch/powerpc/kvm/booke.c
index e7bf4d0..06f5a9e 100644
--- a/arch/powerpc/kvm/booke.c
+++ b/arch/powerpc/kvm/booke.c
@@ -520,6 +520,11 @@ int kvm_arch_vcpu_ioctl_translate(struct kvm_vcpu *vcpu,
return kvmppc_core_vcpu_translate(vcpu, tr);
 }
 
+int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
+{
+   return -ENOTSUPP;
+}
+
 int __init kvmppc_booke_init(void)
 {
unsigned long ivor[16];
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 5902bbc..4ae3490 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -410,11 +410,6 @@ out:
return r;
 }
 
-int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
-{
-   return -ENOTSUPP;
-}
-
 long kvm_arch_vm_ioctl(struct file *filp,
unsigned int ioctl, unsigned long arg)
 {
-- 
1.6.0.2



[PATCH 00/27] Add KVM support for Book3s_64 (PPC64) hosts v4

2009-09-29 Thread Alexander Graf
KVM for PowerPC only supports embedded cores at the moment.

While it makes sense to virtualize on small machines, it's even more fun
to do so on big boxes. So I figured we need KVM for PowerPC64 as well.

This patchset implements KVM support for Book3s_64 hosts and guest support
for Book3s_64 and G3/G4.

To really make use of this, you also need a recent version of qemu.


Don't want to apply patches? Get the git tree!

$ git clone git://csgraf.de/kvm
$ git checkout origin/ppc-v4

V1 -> V2:

 - extend sregs with padding
 - new naming scheme (ppc64 -> book3s_64; 74xx -> book3s_32)
 - to_phys -> in-kernel tophys()
 - loadimm -> LOAD_REG_IMMEDIATE
 - call .ko kvm.ko
 - set magic paca bit later
 - run guest code with PACA->soft_enabled=true
 - pt_regs for host state saving (guest too?)
 - only do HV dcbz trick on 970
 - refuse to run on LPAR because of missing SLB pieces

V2 -> V3:

 - fix DAR/DSISR saving
 - allow running on LPAR by modifying the SLB shadow
 - change the SLB implementation to use a mem-backed cache and do
   full world switch on enter/exit. gets rid of context magic
 - be more aggressive about DEC injection
 - remove fast ld/st because we're always in host context
 - don't use SPRGs in real-paged transition
 - implement dirty log
 - remove MMIO speedup code
 - SPRG cleanup
   - rename SPRG3 -> SPRN_SPRG_PACA
   - rename SPRG1 -> SPRN_SPRG_SCRATCH0
   - don't use SPRG2

V3 -> V4:

 - use context_id instead of mm_alloc
 - export less

TODO:

 - use MMU Notifiers

Alexander Graf (27):
  Move dirty logging code to sub-arch
  Pass PVR in sregs
  Add Book3s definitions
  Add Book3s fields to vcpu structs
  Add asm/kvm_book3s.h
  Add Book3s_64 intercept helpers
  Add book3s_64 highmem asm code
  Add SLB switching code for entry/exit
  Add interrupt handling code
  Add book3s.c
  Add book3s_64 Host MMU handling
  Add book3s_64 guest MMU
  Add book3s_32 guest MMU
  Add book3s_64 specific opcode emulation
  Add mfdec emulation
  Add desktop PowerPC specific emulation
  Make head_64.S aware of KVM real mode code
  Add Book3s_64 offsets to asm-offsets.c
  Export symbols for KVM module
  Split init_new_context and destroy_context
  Export KVM symbols for module
  Add fields to PACA
  Export new PACA constants in asm-offsets
  Include Book3s_64 target in buildsystem
  Fix trace.h
  Enable 32bit dirty log pointers on 64bit host
  Use Little Endian for Dirty Bitmap

 arch/powerpc/include/asm/exception-64s.h |2 +
 arch/powerpc/include/asm/kvm.h   |2 +
 arch/powerpc/include/asm/kvm_asm.h   |   39 ++
 arch/powerpc/include/asm/kvm_book3s.h|  136 
 arch/powerpc/include/asm/kvm_book3s_64_asm.h |   58 ++
 arch/powerpc/include/asm/kvm_host.h  |   75 ++-
 arch/powerpc/include/asm/kvm_ppc.h   |1 +
 arch/powerpc/include/asm/mmu_context.h   |5 +
 arch/powerpc/include/asm/paca.h  |9 +
 arch/powerpc/kernel/asm-offsets.c|   18 +
 arch/powerpc/kernel/exceptions-64s.S |8 +
 arch/powerpc/kernel/head_64.S|7 +
 arch/powerpc/kernel/ppc_ksyms.c  |3 +-
 arch/powerpc/kernel/time.c   |1 +
 arch/powerpc/kvm/Kconfig |   17 +
 arch/powerpc/kvm/Makefile|   27 +-
 arch/powerpc/kvm/book3s.c|  919 ++
 arch/powerpc/kvm/book3s_32_mmu.c |  354 ++
 arch/powerpc/kvm/book3s_64_emulate.c |  338 ++
 arch/powerpc/kvm/book3s_64_exports.c |   24 +
 arch/powerpc/kvm/book3s_64_interrupts.S  |  392 +++
 arch/powerpc/kvm/book3s_64_mmu.c |  469 +
 arch/powerpc/kvm/book3s_64_mmu_host.c|  412 
 arch/powerpc/kvm/book3s_64_rmhandlers.S  |  131 
 arch/powerpc/kvm/book3s_64_slb.S |  277 
 arch/powerpc/kvm/booke.c |5 +
 arch/powerpc/kvm/emulate.c   |   43 ++-
 arch/powerpc/kvm/powerpc.c   |5 -
 arch/powerpc/kvm/trace.h |6 +-
 arch/powerpc/mm/hash_utils_64.c  |2 +
 arch/powerpc/mm/mmu_context_hash64.c |   24 +-
 virt/kvm/kvm_main.c  |   10 +-
 32 files changed, 3799 insertions(+), 20 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s.h
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_64_asm.h
 create mode 100644 arch/powerpc/kvm/book3s.c
 create mode 100644 arch/powerpc/kvm/book3s_32_mmu.c
 create mode 100644 arch/powerpc/kvm/book3s_64_emulate.c
 create mode 100644 arch/powerpc/kvm/book3s_64_exports.c
 create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu.c
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_host.c
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S
 create mode 100644 arch/powerpc/kvm/book3s_64_slb.S


[PATCH 06/27] Add Book3s_64 intercept helpers

2009-09-29 Thread Alexander Graf
We need to intercept interrupt vectors. To do that, let's add a file
we can always include which only activates the intercepts when we have
them configured.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/kvm_book3s_64_asm.h |   58 ++
 1 files changed, 58 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s_64_asm.h

diff --git a/arch/powerpc/include/asm/kvm_book3s_64_asm.h b/arch/powerpc/include/asm/kvm_book3s_64_asm.h
new file mode 100644
index 000..2e06ee8
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s_64_asm.h
@@ -0,0 +1,58 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#ifndef __ASM_KVM_BOOK3S_ASM_H__
+#define __ASM_KVM_BOOK3S_ASM_H__
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+
+#include <asm/kvm_asm.h>
+
+.macro DO_KVM intno
+   .if (\intno == BOOK3S_INTERRUPT_SYSTEM_RESET) || \
+   (\intno == BOOK3S_INTERRUPT_MACHINE_CHECK) || \
+   (\intno == BOOK3S_INTERRUPT_DATA_STORAGE) || \
+   (\intno == BOOK3S_INTERRUPT_INST_STORAGE) || \
+   (\intno == BOOK3S_INTERRUPT_DATA_SEGMENT) || \
+   (\intno == BOOK3S_INTERRUPT_INST_SEGMENT) || \
+   (\intno == BOOK3S_INTERRUPT_EXTERNAL) || \
+   (\intno == BOOK3S_INTERRUPT_ALIGNMENT) || \
+   (\intno == BOOK3S_INTERRUPT_PROGRAM) || \
+   (\intno == BOOK3S_INTERRUPT_FP_UNAVAIL) || \
+   (\intno == BOOK3S_INTERRUPT_DECREMENTER) || \
+   (\intno == BOOK3S_INTERRUPT_SYSCALL) || \
+   (\intno == BOOK3S_INTERRUPT_TRACE) || \
+   (\intno == BOOK3S_INTERRUPT_PERFMON) || \
+   (\intno == BOOK3S_INTERRUPT_ALTIVEC) || \
+   (\intno == BOOK3S_INTERRUPT_VSX)
+
+   b   kvmppc_trampoline_\intno
+kvmppc_resume_\intno:
+
+   .endif
+.endm
+
+#else
+
+.macro DO_KVM intno
+.endm
+
+#endif /* CONFIG_KVM_BOOK3S_64_HANDLER */
+
+#endif /* __ASM_KVM_BOOK3S_ASM_H__ */
-- 
1.6.0.2



[PATCH 05/27] Add asm/kvm_book3s.h

2009-09-29 Thread Alexander Graf
This adds the book3s specific header file that contains structs that
are only valid on book3s specific code.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - use context_id instead of mm_alloc
---
 arch/powerpc/include/asm/kvm_book3s.h |  136 +
 1 files changed, 136 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/include/asm/kvm_book3s.h

diff --git a/arch/powerpc/include/asm/kvm_book3s.h b/arch/powerpc/include/asm/kvm_book3s.h
new file mode 100644
index 000..c601133
--- /dev/null
+++ b/arch/powerpc/include/asm/kvm_book3s.h
@@ -0,0 +1,136 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#ifndef __ASM_KVM_BOOK3S_H__
+#define __ASM_KVM_BOOK3S_H__
+
+#include <linux/types.h>
+#include <linux/kvm_host.h>
+#include <asm/kvm_ppc.h>
+
+struct kvmppc_slb {
+   u64 esid;
+   u64 vsid;
+   u64 orige;
+   u64 origv;
+   bool valid;
+   bool Ks;
+   bool Kp;
+   bool nx;
+   bool large;
+   bool class;
+};
+
+struct kvmppc_sr {
+   u32 raw;
+   u32 vsid;
+   bool Ks;
+   bool Kp;
+   bool nx;
+};
+
+struct kvmppc_bat {
+   u32 bepi;
+   u32 bepi_mask;
+   bool vs;
+   bool vp;
+   u32 brpn;
+   u8 wimg;
+   u8 pp;
+};
+
+struct kvmppc_sid_map {
+   u64 guest_vsid;
+   u64 guest_esid;
+   u64 host_vsid;
+   bool valid;
+};
+
+#define SID_MAP_BITS    9
+#define SID_MAP_NUM     (1 << SID_MAP_BITS)
+#define SID_MAP_MASK    (SID_MAP_NUM - 1)
+
+struct kvmppc_vcpu_book3s {
+   struct kvm_vcpu vcpu;
+   struct kvmppc_sid_map sid_map[SID_MAP_NUM];
+   struct kvmppc_slb slb[64];
+   struct {
+   u64 esid;
+   u64 vsid;
+   } slb_shadow[64];
+   u8 slb_shadow_max;
+   struct kvmppc_sr sr[16];
+   struct kvmppc_bat ibat[8];
+   struct kvmppc_bat dbat[8];
+   u64 hid[6];
+   int slb_nr;
+   u64 sdr1;
+   u64 dsisr;
+   u64 hior;
+   u64 msr_mask;
+   u64 vsid_first;
+   u64 vsid_next;
+   u64 vsid_max;
+   int context_id;
+};
+
+#define CONTEXT_HOST   0
+#define CONTEXT_GUEST  1
+#define CONTEXT_GUEST_END  2
+
+#define VSID_REAL  0xfff0
+#define VSID_REAL_DR   0xffe0
+#define VSID_REAL_IR   0xffd0
+#define VSID_BAT   0xffc0
+#define VSID_PR    0x8000
+
+extern void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, u64 ea, u64 ea_mask);
+extern void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 vp, u64 vp_mask);
+extern void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, u64 pa_start, u64 pa_end);
+extern void kvmppc_set_msr(struct kvm_vcpu *vcpu, u64 new_msr);
+extern void kvmppc_mmu_book3s_64_init(struct kvm_vcpu *vcpu);
+extern void kvmppc_mmu_book3s_32_init(struct kvm_vcpu *vcpu);
+extern int kvmppc_mmu_map_page(struct kvm_vcpu *vcpu, struct kvmppc_pte *pte);
+extern int kvmppc_mmu_map_segment(struct kvm_vcpu *vcpu, ulong eaddr);
+extern void kvmppc_mmu_flush_segments(struct kvm_vcpu *vcpu);
+extern struct kvmppc_pte *kvmppc_mmu_find_pte(struct kvm_vcpu *vcpu, u64 ea, bool data);
+extern int kvmppc_ld(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr, bool data);
+extern int kvmppc_st(struct kvm_vcpu *vcpu, ulong eaddr, int size, void *ptr);
+extern void kvmppc_book3s_queue_irqprio(struct kvm_vcpu *vcpu, unsigned int vec);
+
+extern u32 kvmppc_trampoline_lowmem;
+extern u32 kvmppc_trampoline_enter;
+
+static inline struct kvmppc_vcpu_book3s *to_book3s(struct kvm_vcpu *vcpu)
+{
+   return container_of(vcpu, struct kvmppc_vcpu_book3s, vcpu);
+}
+
+static inline ulong dsisr(void)
+{
+   ulong r;
+   asm ( "mfdsisr %0 " : "=r" (r) );
+   return r;
+}
+
+extern void kvm_return_point(void);
+
+#define INS_DCBZ   0x7c0007ec
+
+#endif /* __ASM_KVM_BOOK3S_H__ */
-- 
1.6.0.2
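
[Editorial sketch, not part of the patch.] Two idioms in this header may be worth spelling out: to_book3s() recovers the enclosing structure from a pointer to its embedded vcpu member, and SID_MAP_MASK reduces a value to a valid index into the 512-entry sid_map. The userspace model below uses a simplified container_of without the kernel's type checking. The structure names are hypothetical, and indexing sid_map by simply masking a VSID is an assumption; the hash the series really uses is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SID_MAP_BITS 9
#define SID_MAP_NUM  (1 << SID_MAP_BITS)
#define SID_MAP_MASK (SID_MAP_NUM - 1)

/* Simplified userspace container_of(), without the kernel's type check. */
#define container_of_sketch(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct kvm_vcpu_stub { int id; };

struct vcpu_book3s_sketch {
    struct kvm_vcpu_stub vcpu;      /* embedded, like struct kvm_vcpu */
    uint64_t sid_map[SID_MAP_NUM];  /* simplified: one u64 per slot */
    int context_id;
};

/* Same shape as to_book3s() in the patch. */
static struct vcpu_book3s_sketch *to_book3s_sketch(struct kvm_vcpu_stub *vcpu)
{
    assert(vcpu != NULL);
    return container_of_sketch(vcpu, struct vcpu_book3s_sketch, vcpu);
}

/* Assumed use of SID_MAP_MASK: fold a value into the table's index range. */
static unsigned int sid_map_index(uint64_t gvsid)
{
    return (unsigned int)(gvsid & SID_MAP_MASK);
}
```

Because the table size is a power of two, the mask is equivalent to a modulo by SID_MAP_NUM.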



[PATCH 16/27] Add desktop PowerPC specific emulation

2009-09-29 Thread Alexander Graf
A few opcodes behave differently on desktop and embedded PowerPC cores.
In order to reflect those differences, let's add some #ifdef code to emulate.c.

We could probably also handle them in the core specific emulation files, but I
would prefer to reuse as much code as possible.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/emulate.c |   30 ++
 1 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kvm/emulate.c b/arch/powerpc/kvm/emulate.c
index 50d411d..665fa83 100644
--- a/arch/powerpc/kvm/emulate.c
+++ b/arch/powerpc/kvm/emulate.c
@@ -32,6 +32,7 @@
 #include "trace.h"
 
 #define OP_TRAP 3
+#define OP_TRAP_64 2
 
 #define OP_31_XOP_LWZX  23
 #define OP_31_XOP_LBZX  87
@@ -68,7 +69,19 @@ void kvmppc_emulate_dec(struct kvm_vcpu *vcpu)
 {
unsigned long nr_jiffies;
 
+#ifdef CONFIG_PPC64
+#ifdef DEBUG_EMUL
+   printk(KERN_INFO "mtDEC: %x\n", vcpu->arch.dec);
+#endif
+   /* POWER4+ triggers a dec interrupt if the value is < 0 */
+   if (vcpu->arch.dec & 0x80000000) {
+   del_timer(&vcpu->arch.dec_timer);
+   kvmppc_core_queue_dec(vcpu);
+   }
+   else if (true) {
+#else
if (vcpu->arch.tcr & TCR_DIE) {
+#endif
/* The decrementer ticks at the same rate as the timebase, so
 * that's how we convert the guest DEC value to the number of
 * host ticks. */
@@ -113,9 +126,16 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
/* this default type might be overwritten by subcategories */
kvmppc_set_exit_type(vcpu, EMULATED_INST_EXITS);
 
+#ifdef DEBUG_EMUL
+   printk(KERN_INFO "Emulating opcode %d / %d\n", get_op(inst), get_xop(inst));
+#endif
switch (get_op(inst)) {
case OP_TRAP:
+#ifdef CONFIG_PPC64
+   case OP_TRAP_64:
+#else
vcpu->arch.esr |= ESR_PTR;
+#endif
kvmppc_core_queue_program(vcpu);
advance = 0;
break;
@@ -189,6 +209,13 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
vcpu->arch.gpr[rt] = vcpu->arch.srr0; break;
case SPRN_SRR1:
vcpu->arch.gpr[rt] = vcpu->arch.srr1; break;
+#ifdef CONFIG_PPC64
+   case SPRN_PVR:
+   vcpu->arch.gpr[rt] = vcpu->arch.pvr; break;
+   case SPRN_PIR:
+   case SPRN_MSSSR0:
+   vcpu->arch.gpr[rt] = 0; break;
+#else
case SPRN_PVR:
vcpu->arch.gpr[rt] = mfspr(SPRN_PVR); break;
case SPRN_PIR:
@@ -201,6 +228,7 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
vcpu->arch.gpr[rt] = mftbl(); break;
case SPRN_TBWU:
vcpu->arch.gpr[rt] = mftbu(); break;
+#endif
 
case SPRN_SPRG0:
vcpu->arch.gpr[rt] = vcpu->arch.sprg0; break;
@@ -271,6 +299,8 @@ int kvmppc_emulate_instruction(struct kvm_run *run, struct kvm_vcpu *vcpu)
case SPRN_TBWL: break;
case SPRN_TBWU: break;
 
+   case SPRN_MSSSR0: break;
+
case SPRN_DEC:
vcpu->arch.dec = vcpu->arch.gpr[rs];
kvmppc_emulate_dec(vcpu);
-- 
1.6.0.2
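
The decrementer rule in this patch (a DEC value with the top bit set is treated as negative on POWER4+ and raises the interrupt right away, anything else arms a timer) can be sketched in plain C. This is only an illustration: `classify_dec`, `enum dec_action` and `dec_example` are hypothetical names that do not exist in the kernel, and the 32-bit width of DEC is taken from the sign-bit test in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: what the emulation does with a written DEC. */
enum dec_action {
	DEC_QUEUE_INTERRUPT,	/* value is negative: deliver the interrupt now */
	DEC_ARM_TIMER		/* value is non-negative: let a host timer expire */
};

/* Hypothetical helper: classify a 32-bit guest DEC value by its sign bit. */
static enum dec_action classify_dec(uint32_t dec)
{
	if (dec & 0x80000000u)
		return DEC_QUEUE_INTERRUPT;
	return DEC_ARM_TIMER;
}

/* Small usage example for the helper above. */
static void dec_example(void)
{
	assert(classify_dec(0xffffffffu) == DEC_QUEUE_INTERRUPT);
	assert(classify_dec(1000u) == DEC_ARM_TIMER);
}
```

The embedded (booke) path in the same function keys on TCR_DIE instead, which is why the patch keeps both variants behind #ifdef CONFIG_PPC64.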



[PATCH 08/27] Add SLB switching code for entry/exit

2009-09-29 Thread Alexander Graf
This is the really low level of guest entry/exit code.

Book3s_64 has an SLB, which stores all ESID -> VSID mappings we're
currently aware of.

The segments in the guest differ from the ones on the host, so we need
to switch the SLB to tell the MMU that we're in a new context.

So we store a shadow of the guest's SLB in the PACA, switch to that on
entry and only restore bolted entries on exit, leaving the rest to the
Linux SLB fault handler.

That way we get a really clean way of switching the SLB.
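
As a rough C model of the "fill SLB with our shadow" step described above: walk the shadow array and install only entries whose valid bit is set. This is a sketch under assumptions, not the patch's code. `struct demo_slb_entry`, `DEMO_SLB_ESID_VALID` and `demo_load_shadow_slb` are made-up names; the valid-bit position (bit 27 from the LSB) is the one the patch's rotate-and-test sequence extracts; and writing into `hw_slb` stands in for the slbmte instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed position of the ESID valid bit, matching the rldicl. test. */
#define DEMO_SLB_ESID_VALID (1ULL << 27)

/* Hypothetical mirror of one shadow SLB slot (esid first, then vsid). */
struct demo_slb_entry {
	uint64_t esid;
	uint64_t vsid;
};

/*
 * Copy every valid shadow entry into hw_slb (standing in for slbmte) and
 * return how many entries were installed. Invalid slots are skipped.
 */
static size_t demo_load_shadow_slb(const struct demo_slb_entry *shadow,
				   size_t max, struct demo_slb_entry *hw_slb)
{
	size_t loaded = 0;

	assert(shadow != NULL && hw_slb != NULL);
	for (size_t i = 0; i < max; i++) {
		if (!(shadow[i].esid & DEMO_SLB_ESID_VALID))
			continue;
		hw_slb[loaded++] = shadow[i];
	}
	return loaded;
}
```

Unlike this for loop, the assembly loop tests its bound at the bottom, so it always inspects the first slot.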

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_slb.S |  277 ++
 1 files changed, 277 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_slb.S

diff --git a/arch/powerpc/kvm/book3s_64_slb.S b/arch/powerpc/kvm/book3s_64_slb.S
new file mode 100644
index 000..00a8367
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_slb.S
@@ -0,0 +1,277 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+/**
+ **
+ *   Entry code   *
+ **
+ */
+
+.global kvmppc_handler_trampoline_enter
+kvmppc_handler_trampoline_enter:
+
+   /* Required state:
+*
+* MSR = ~IR|DR
+* R13 = PACA
+* R9 = guest IP
+* R10 = guest MSR
+* R11 = free
+* R12 = free
+* PACA[PACA_EXMC + EX_R9] = guest R9
+* PACA[PACA_EXMC + EX_R10] = guest R10
+* PACA[PACA_EXMC + EX_R11] = guest R11
+* PACA[PACA_EXMC + EX_R12] = guest R12
+* PACA[PACA_EXMC + EX_R13] = guest R13
+* PACA[PACA_EXMC + EX_CCR] = guest CR
+* PACA[PACA_EXMC + EX_R3] = guest XER
+*/
+
+   mtsrr0  r9
+   mtsrr1  r10
+
+   mtspr   SPRN_SPRG_SCRATCH0, r0
+
+   /* Remove LPAR shadow entries */
+
+#if SLB_NUM_BOLTED == 3
+
+   ld  r12, PACA_SLBSHADOWPTR(r13)
+   ld  r10, 0x10(r12)
+   ld  r11, 0x18(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r10, 37, 63
+   beq slb_entry_skip_1
+   xoris   r9, r10, slb_esi...@h
+   std r9, 0x10(r12)
+slb_entry_skip_1:
+   ld  r9, 0x20(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r9, 37, 63
+   beq slb_entry_skip_2
+   xoris   r9, r9, slb_esi...@h
+   std r9, 0x20(r12)
+slb_entry_skip_2:
+   ld  r9, 0x30(r12)
+   /* Invalid? Skip. */
+   rldicl. r0, r9, 37, 63
+   beq slb_entry_skip_3
+   xoris   r9, r9, slb_esi...@h
+   std r9, 0x30(r12)
+slb_entry_skip_3:
+   
+#else
+#error unknown number of bolted entries
+#endif
+
+   /* Flush SLB */
+
+   slbia
+
+   /* r0 = esid & ESID_MASK */
+   rldicr  r10, r10, 0, 35
+   /* r0 |= CLASS_BIT(VSID) */
+   rldic   r12, r11, 56 - 36, 36
+   or  r10, r10, r12
+   slbie   r10
+
+   isync
+
+   /* Fill SLB with our shadow */
+
+   lbz r12, PACA_KVM_SLB_MAX(r13)
+   mulli   r12, r12, 16
+   addi    r12, r12, PACA_KVM_SLB
+   add r12, r12, r13
+
+   /* for (r11 = kvm_slb; r11 < kvm_slb + kvm_slb_size; r11+=slb_entry) */
+   li  r11, PACA_KVM_SLB
+   add r11, r11, r13
+
+slb_loop_enter:
+
+   ld  r10, 0(r11)
+
+   rldicl. r0, r10, 37, 63
+   beq slb_loop_enter_skip
+
+   ld  r9, 8(r11)
+   slbmte  r9, r10
+
+slb_loop_enter_skip:
+   addi    r11, r11, 16
+   cmpd    cr0, r11, r12
+   blt slb_loop_enter
+
+slb_do_enter:
+
+   /* Enter guest */
+
+   mfspr   r0, SPRN_SPRG_SCRATCH0
+
+   ld  r9, (PACA_EXMC+EX_R9)(r13)
+   ld  r10, (PACA_EXMC+EX_R10)(r13)
+   ld  r12, (PACA_EXMC+EX_R12)(r13)
+
+   lwz r11, (PACA_EXMC+EX_CCR)(r13)
+   mtcr    r11
+
+   ld  r11, (PACA_EXMC+EX_R3)(r13)
+   mtxer   r11
+
+   ld  r11, (PACA_EXMC+EX_R11)(r13)
+   ld  r13, (PACA_EXMC+EX_R13)(r13)
+
+   RFI
+kvmppc_handler_trampoline_enter_end:
+

[PATCH 07/27] Add book3s_64 highmem asm code

2009-09-29 Thread Alexander Graf
This is the entry / exit code. In order to switch between host and guest
context, we need to switch register state and call the exit code handler on
exit.

This assembly file does exactly that. To finally enter the guest it calls
into book3s_64_slb.S. On exit it gets jumped at from book3s_64_slb.S too.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/include/asm/kvm_ppc.h  |1 +
 arch/powerpc/kvm/book3s_64_interrupts.S |  392 +++
 2 files changed, 393 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_interrupts.S

diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
index 2c6ee34..269ee46 100644
--- a/arch/powerpc/include/asm/kvm_ppc.h
+++ b/arch/powerpc/include/asm/kvm_ppc.h
@@ -39,6 +39,7 @@ enum emulation_result {
 extern int __kvmppc_vcpu_run(struct kvm_run *kvm_run, struct kvm_vcpu *vcpu);
 extern char kvmppc_handlers_start[];
 extern unsigned long kvmppc_handler_len;
+extern void kvmppc_handler_highmem(void);
 
 extern void kvmppc_dump_vcpu(struct kvm_vcpu *vcpu);
 extern int kvmppc_handle_load(struct kvm_run *run, struct kvm_vcpu *vcpu,
diff --git a/arch/powerpc/kvm/book3s_64_interrupts.S b/arch/powerpc/kvm/book3s_64_interrupts.S
new file mode 100644
index 000..7b55d80
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_interrupts.S
@@ -0,0 +1,392 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+#define KVMPPC_HANDLE_EXIT .kvmppc_handle_exit
+#define ULONG_SIZE 8
+#define VCPU_GPR(n) (VCPU_GPRS + (n * ULONG_SIZE))
+
+.macro mfpaca tmp_reg, src_reg, offset, vcpu_reg
+   ld  \tmp_reg, (PACA_EXMC+\offset)(r13)
+   std \tmp_reg, VCPU_GPR(\src_reg)(\vcpu_reg)
+.endm
+
+.macro DISABLE_INTERRUPTS
+   mfmsr   r0
+   rldicl  r0,r0,48,1
+   rotldi  r0,r0,16
+   mtmsrd  r0,1
+.endm
+
+/*
+ *   *
+ * Guest entry / exit code that is in kernel module memory (highmem) *
+ *   *
+ */
+
+/* Registers:
+ *  r3: kvm_run pointer
+ *  r4: vcpu pointer
+ */
+_GLOBAL(__kvmppc_vcpu_entry)
+
+kvm_start_entry:
+   /* Write correct stack frame */
+   mflr    r0
+   std r0,16(r1)
+
+   /* Save host state to the stack */
+   stdu    r1, -SWITCH_FRAME_SIZE(r1)
+
+   /* Save r3 (kvm_run) and r4 (vcpu) */
+   SAVE_2GPRS(3, r1)
+
+   /* Save non-volatile registers (r14 - r31) */
+   SAVE_NVGPRS(r1)
+
+   /* Save LR */
+   mflr    r14
+   std r14, _LINK(r1)
+
+/* XXX optimize non-volatile loading away */
+kvm_start_lightweight:
+
+   DISABLE_INTERRUPTS
+
+   /* Save R1/R2 in the PACA */
+   std r1, PACAR1(r13)
+   std r2, (PACA_EXMC+EX_SRR0)(r13)
+   ld  r3, VCPU_HIGHMEM_HANDLER(r4)
+   std r3, PACASAVEDMSR(r13)
+
+   /* Load non-volatile guest state from the vcpu */
+   ld  r14, VCPU_GPR(r14)(r4)
+   ld  r15, VCPU_GPR(r15)(r4)
+   ld  r16, VCPU_GPR(r16)(r4)
+   ld  r17, VCPU_GPR(r17)(r4)
+   ld  r18, VCPU_GPR(r18)(r4)
+   ld  r19, VCPU_GPR(r19)(r4)
+   ld  r20, VCPU_GPR(r20)(r4)
+   ld  r21, VCPU_GPR(r21)(r4)
+   ld  r22, VCPU_GPR(r22)(r4)
+   ld  r23, VCPU_GPR(r23)(r4)
+   ld  r24, VCPU_GPR(r24)(r4)
+   ld  r25, VCPU_GPR(r25)(r4)
+   ld  r26, VCPU_GPR(r26)(r4)
+   ld  r27, VCPU_GPR(r27)(r4)
+   ld  r28, VCPU_GPR(r28)(r4)
+   ld  r29, VCPU_GPR(r29)(r4)
+   ld  r30, VCPU_GPR(r30)(r4)
+   ld  r31, VCPU_GPR(r31)(r4)
+
+   ld  r9, VCPU_PC(r4) /* r9 = vcpu->arch.pc */
+   ld  r10, VCPU_SHADOW_MSR(r4)/* r10 = vcpu->arch.shadow_msr */
+
+   ld  r3, VCPU_TRAMPOLINE_ENTER(r4)
+   mtsrr0  r3
+
+   

[PATCH 18/27] Add Book3s_64 offsets to asm-offsets.c

2009-09-29 Thread Alexander Graf
We need to access some VCPU fields from assembly code. In order to get
the proper offsets, we have to define them in asm-offsets.c.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kernel/asm-offsets.c |   13 +
 1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 0812b0f..aba3ea6 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -398,6 +398,19 @@ int main(void)
DEFINE(VCPU_LAST_INST, offsetof(struct kvm_vcpu, arch.last_inst));
DEFINE(VCPU_FAULT_DEAR, offsetof(struct kvm_vcpu, arch.fault_dear));
DEFINE(VCPU_FAULT_ESR, offsetof(struct kvm_vcpu, arch.fault_esr));
+
+   /* book3s_64 */
+#ifdef CONFIG_PPC64
+   DEFINE(VCPU_FAULT_DSISR, offsetof(struct kvm_vcpu, arch.fault_dsisr));
+   DEFINE(VCPU_HOST_RETIP, offsetof(struct kvm_vcpu, arch.host_retip));
+   DEFINE(VCPU_HOST_R2, offsetof(struct kvm_vcpu, arch.host_r2));
+   DEFINE(VCPU_HOST_MSR, offsetof(struct kvm_vcpu, arch.host_msr));
+   DEFINE(VCPU_SHADOW_MSR, offsetof(struct kvm_vcpu, arch.shadow_msr));
+   DEFINE(VCPU_TRAMPOLINE_LOWMEM, offsetof(struct kvm_vcpu, arch.trampoline_lowmem));
+   DEFINE(VCPU_TRAMPOLINE_ENTER, offsetof(struct kvm_vcpu, arch.trampoline_enter));
+   DEFINE(VCPU_HIGHMEM_HANDLER, offsetof(struct kvm_vcpu, arch.highmem_handler));
+   DEFINE(VCPU_HFLAGS, offsetof(struct kvm_vcpu, arch.hflags));
+   DEFINE(VCPU_HFLAGS, offsetof(struct kvm_vcpu, arch.hflags));
+#endif
 #endif
 #ifdef CONFIG_44x
DEFINE(PGD_T_LOG2, PGD_T_LOG2);
-- 
1.6.0.2
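
For readers less familiar with asm-offsets.c: each DEFINE() turns an offsetof() expression into a named constant that assembly can use as a displacement, as in "ld r10, VCPU_SHADOW_MSR(r4)". The following userspace sketch shows the idea with made-up types. `struct demo_vcpu`, the `DEMO_VCPU_*` constants and `demo_load_u64` are hypothetical and much smaller than the real struct kvm_vcpu; the nested member designator in offsetof() relies on the same compiler behaviour the kernel depends on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-ins for the real vcpu layout. */
struct demo_vcpu_arch {
	uint64_t host_r2;
	uint64_t host_msr;
	uint64_t shadow_msr;
};

struct demo_vcpu {
	int vcpu_id;
	struct demo_vcpu_arch arch;
};

/* What DEFINE(NAME, offsetof(...)) boils down to: named byte offsets. */
enum {
	DEMO_VCPU_HOST_R2    = offsetof(struct demo_vcpu, arch.host_r2),
	DEMO_VCPU_HOST_MSR   = offsetof(struct demo_vcpu, arch.host_msr),
	DEMO_VCPU_SHADOW_MSR = offsetof(struct demo_vcpu, arch.shadow_msr)
};

/* Mimics "ld rX, OFFSET(rY)": load 8 bytes located at base + offset. */
static uint64_t demo_load_u64(const void *base, size_t offset)
{
	uint64_t value;

	assert(base != NULL);
	memcpy(&value, (const unsigned char *)base + offset, sizeof(value));
	return value;
}
```

The assembly side never sees the C struct, so any field it touches has to be listed this way.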



[PATCH 21/27] Export KVM symbols for module

2009-09-29 Thread Alexander Graf
To be able to keep KVM as module, we need to export the SLB trampoline
addresses to the module, so it knows where to jump to.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_exports.c |   24 
 1 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_exports.c

diff --git a/arch/powerpc/kvm/book3s_64_exports.c b/arch/powerpc/kvm/book3s_64_exports.c
new file mode 100644
index 000..5b2db38
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_exports.c
@@ -0,0 +1,24 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <linux/module.h>
+#include <asm/kvm_book3s.h>
+
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_enter);
+EXPORT_SYMBOL_GPL(kvmppc_trampoline_lowmem);
-- 
1.6.0.2



[PATCH 20/27] Split init_new_context and destroy_context

2009-09-29 Thread Alexander Graf
For KVM we need to allocate a new context id, but don't really care about
all the mm context around it.

So let's split the alloc and destroy functions for the context id, so we can
grab one without allocating an mm context.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/mmu_context.h |5 +
 arch/powerpc/mm/mmu_context_hash64.c   |   24 +---
 2 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index b34e94d..66b35d0 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -23,6 +23,11 @@ extern void switch_slb(struct task_struct *tsk, struct mm_struct *mm);
 extern void set_context(unsigned long id, pgd_t *pgd);
 
 #ifdef CONFIG_PPC_BOOK3S_64
+extern int __init_new_context(void);
+extern void __destroy_context(int context_id);
+#endif
+
+#ifdef CONFIG_PPC_BOOK3S_64
 static inline void mmu_context_init(void) { }
 #else
 extern void mmu_context_init(void);
diff --git a/arch/powerpc/mm/mmu_context_hash64.c b/arch/powerpc/mm/mmu_context_hash64.c
index dbeb86a..b9e4cc2 100644
--- a/arch/powerpc/mm/mmu_context_hash64.c
+++ b/arch/powerpc/mm/mmu_context_hash64.c
@@ -18,6 +18,7 @@
 #include <linux/mm.h>
 #include <linux/spinlock.h>
 #include <linux/idr.h>
+#include <linux/module.h>
 
 #include <asm/mmu_context.h>
 
@@ -32,7 +33,7 @@ static DEFINE_IDR(mmu_context_idr);
 #define NO_CONTEXT 0
 #define MAX_CONTEXT ((1UL << 19) - 1)
 
-int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+int __init_new_context(void)
 {
int index;
int err;
@@ -57,6 +58,18 @@ again:
return -ENOMEM;
}
 
+   return index;
+}
+EXPORT_SYMBOL_GPL(__init_new_context);
+
+int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+   int index;
+
+   index = __init_new_context();
+   if (index < 0)
+   return index;
+
/* The old code would re-promote on fork, we don't do that
 * when using slices as it could cause problem promoting slices
 * that have been forced down to 4K
@@ -68,11 +81,16 @@ again:
return 0;
 }
 
-void destroy_context(struct mm_struct *mm)
+void __destroy_context(int context_id)
 {
spin_lock(&mmu_context_lock);
-   idr_remove(&mmu_context_idr, mm->context.id);
+   idr_remove(&mmu_context_idr, context_id);
spin_unlock(&mmu_context_lock);
+}
+EXPORT_SYMBOL_GPL(__destroy_context);
 
+void destroy_context(struct mm_struct *mm)
+{
+   __destroy_context(mm->context.id);
mm->context.id = NO_CONTEXT;
 }
-- 
1.6.0.2
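
The shape of this split, a bare ID allocator that KVM can call directly plus thin mm-aware wrappers on top, can be illustrated outside the kernel. The sketch below is an assumption-laden toy: all `demo_*` names are hypothetical, a small boolean array replaces the IDR, and there is no locking, whereas the patch holds mmu_context_lock around the IDR calls.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NO_CONTEXT  0
#define DEMO_MAX_CONTEXT 7	/* tiny on purpose; the patch uses (1UL << 19) - 1 */

/* Toy replacement for the IDR: which context IDs are currently in use. */
static bool demo_id_used[DEMO_MAX_CONTEXT + 1];

/* Hypothetical stand-in for the part of mm_struct that holds the ID. */
struct demo_mm {
	int context_id;
};

/* Low-level half: hand out a free ID (>= 1), or -1 when none is left. */
static int demo_alloc_context_id(void)
{
	for (int id = 1; id <= DEMO_MAX_CONTEXT; id++) {
		if (!demo_id_used[id]) {
			demo_id_used[id] = true;
			return id;
		}
	}
	return -1;
}

/* Low-level half: give an ID back without touching any mm state. */
static void demo_free_context_id(int id)
{
	assert(id > 0 && id <= DEMO_MAX_CONTEXT);
	demo_id_used[id] = false;
}

/* mm-aware wrapper: allocate an ID and record it in the mm. */
static int demo_init_new_context(struct demo_mm *mm)
{
	int id = demo_alloc_context_id();

	if (id < 0)
		return id;
	mm->context_id = id;
	return 0;
}

/* mm-aware wrapper: release the ID and mark the mm as context-less. */
static void demo_destroy_context(struct demo_mm *mm)
{
	demo_free_context_id(mm->context_id);
	mm->context_id = DEMO_NO_CONTEXT;
}
```

A caller that only needs an ID, as KVM does here, uses the two low-level functions and never constructs a `demo_mm`.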



[PATCH 09/27] Add interrupt handling code

2009-09-29 Thread Alexander Graf
Getting from host state to the guest is only half the story. We also need
to return to our host context and handle whatever happened to get us out of
the guest.

On PowerPC every guest exit is an interrupt. So all we need to do is trap
the host's interrupt handlers and get into our #VMEXIT code to handle it.

PowerPCs also have a register that can add an offset to the interrupt handlers'
addresses, which is what the booke KVM code uses. Unfortunately that is a
hypervisor resource and we also want to be able to run KVM when we're running
in an LPAR. So we have to hook into the Linux interrupt handlers.

Signed-off-by: Alexander Graf ag...@suse.de

---

v3 -> v4:

  - header rename fix
---
 arch/powerpc/kvm/book3s_64_rmhandlers.S |  131 +++
 1 files changed, 131 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_rmhandlers.S

diff --git a/arch/powerpc/kvm/book3s_64_rmhandlers.S b/arch/powerpc/kvm/book3s_64_rmhandlers.S
new file mode 100644
index 000..fb7dd2e
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_rmhandlers.S
@@ -0,0 +1,131 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ *
+ * Copyright SUSE Linux Products GmbH 2009
+ *
+ * Authors: Alexander Graf ag...@suse.de
+ */
+
+#include <asm/ppc_asm.h>
+#include <asm/kvm_asm.h>
+#include <asm/reg.h>
+#include <asm/page.h>
+#include <asm/asm-offsets.h>
+#include <asm/exception-64s.h>
+
+/*
+ *   *
+ *Real Mode handlers that need to be in low physical memory  *
+ *   *
+ */
+
+
+.macro INTERRUPT_TRAMPOLINE intno
+
+.global kvmppc_trampoline_\intno
+kvmppc_trampoline_\intno:
+
+   mtspr   SPRN_SPRG_SCRATCH0, r13 /* Save r13 */
+
+   /*
+* First thing to do is to find out if we're coming
+* from a KVM guest or a Linux process.
+*
+* To distinguish, we check a magic byte in the PACA
+*/
+   mfspr   r13, SPRN_SPRG_PACA /* r13 = PACA */
+   std r12, (PACA_EXMC + EX_R12)(r13)
+   mfcr    r12
+   stw r12, (PACA_EXMC + EX_CCR)(r13)
+   lbz r12, PACA_KVM_IN_GUEST(r13)
+   cmpwi   r12, 0
+   bne ..kvmppc_handler_hasmagic_\intno
+   /* No KVM guest? Then jump back to the Linux handler! */
+   lwz r12, (PACA_EXMC + EX_CCR)(r13)
+   mtcr    r12
+   ld  r12, (PACA_EXMC + EX_R12)(r13)
+   mfspr   r13, SPRN_SPRG_SCRATCH0 /* r13 = original r13 */
+   b   kvmppc_resume_\intno/* Get back original handler */
+
+   /* Now we know we're handling a KVM guest */
+..kvmppc_handler_hasmagic_\intno:
+   /* Unset guest state */
+   li  r12, 0
+   stb r12, PACA_KVM_IN_GUEST(r13)
+
+   std r1, (PACA_EXMC+EX_R9)(r13)
+   std r10, (PACA_EXMC+EX_R10)(r13)
+   std r11, (PACA_EXMC+EX_R11)(r13)
+   std r2, (PACA_EXMC+EX_R13)(r13)
+
+   mfsrr0  r10
+   mfsrr1  r11
+
+   /* Restore R1/R2 so we can handle faults */
+   ld  r1, PACAR1(r13)
+   ld  r2, (PACA_EXMC+EX_SRR0)(r13)
+
+   /* Let's store which interrupt we're handling */
+   li  r12, \intno
+
+   /* Jump into the SLB exit code that goes to the highmem handler */
+   b   kvmppc_handler_trampoline_exit
+
+.endm
+
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_SYSTEM_RESET
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_MACHINE_CHECK
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DATA_STORAGE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DATA_SEGMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_INST_STORAGE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_INST_SEGMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_EXTERNAL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_ALIGNMENT
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_PROGRAM
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_FP_UNAVAIL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_DECREMENTER
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_SYSCALL
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_TRACE
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_PERFMON
+INTERRUPT_TRAMPOLINE   BOOK3S_INTERRUPT_ALTIVEC
+INTERRUPT_TRAMPOLINE   

[PATCH 22/27] Add fields to PACA

2009-09-29 Thread Alexander Graf
For KVM we need to store some information in the PACA, so we
need to extend it.

This patch adds KVM SLB shadow related entries to the PACA and
a field that indicates if we're inside a guest.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/include/asm/paca.h |9 +
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 7d8514c..5e9b4ef 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -129,6 +129,15 @@ struct paca_struct {
u64 system_time;/* accumulated system TB ticks */
u64 startpurr;  /* PURR/TB value snapshot */
u64 startspurr; /* SPURR value snapshot */
+
+#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
+   struct  {
+   u64 esid;
+   u64 vsid;
+   } kvm_slb[64];  /* guest SLB */
+   u8 kvm_slb_max; /* highest used guest slb entry */
+   u8 kvm_in_guest;/* are we inside the guest? */
+#endif
 };
 
 extern struct paca_struct paca[];
-- 
1.6.0.2



[PATCH 11/27] Add book3s_64 Host MMU handling

2009-09-29 Thread Alexander Graf
We designed the Book3S port of KVM as modular as possible. Most
of the code could be easily used on a Book3S_32 host as well.

The main difference between 32 and 64 bit cores is the MMU. To keep
things well separated, we treat the book3s_64 MMU as one possible compile
option.

This patch adds all the MMU helpers the rest of the code needs in
order to modify the host's MMU, like setting PTEs and segments.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/book3s_64_mmu_host.c |  412 +
 1 files changed, 412 insertions(+), 0 deletions(-)
 create mode 100644 arch/powerpc/kvm/book3s_64_mmu_host.c

diff --git a/arch/powerpc/kvm/book3s_64_mmu_host.c b/arch/powerpc/kvm/book3s_64_mmu_host.c
new file mode 100644
index 000..507f770
--- /dev/null
+++ b/arch/powerpc/kvm/book3s_64_mmu_host.c
@@ -0,0 +1,412 @@
+/*
+ * Copyright (C) 2009 SUSE Linux Products GmbH. All rights reserved.
+ *
+ * Authors:
+ * Alexander Graf ag...@suse.de
+ * Kevin Wolf m...@kevin-wolf.de
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
+ */
+
+#include <linux/kvm_host.h>
+
+#include <asm/kvm_ppc.h>
+#include <asm/kvm_book3s.h>
+#include <asm/mmu-hash64.h>
+#include <asm/machdep.h>
+#include <asm/mmu_context.h>
+#include <asm/hw_irq.h>
+
+#define PTE_SIZE 12
+#define VSID_ALL 0
+
+// #define DEBUG_MMU
+// #define DEBUG_SLB
+
+void kvmppc_mmu_pte_flush(struct kvm_vcpu *vcpu, u64 guest_ea, u64 ea_mask)
+{
+   int i;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing %d Shadow PTEs: 0x%llx & 0x%llx\n",
+   vcpu->arch.hpte_cache_offset, guest_ea, ea_mask);
+#endif
+   BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+   guest_ea &= ea_mask;
+   for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+   struct hpte_cache *pte;
+
+   pte = &vcpu->arch.hpte_cache[i];
+   if (!pte->host_va)
+   continue;
+
+   if ((pte->pte.eaddr & ea_mask) == guest_ea) {
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing SPT %d: 0x%llx (0x%llx) -> 0x%llx\n", i, pte->pte.eaddr, pte->pte.vpage, pte->host_va);
+#endif
+   ppc_md.hpte_invalidate(pte->slot, pte->host_va,
+  MMU_PAGE_4K, MMU_SEGSIZE_256M,
+  false);
+   pte->host_va = 0;
+   kvm_release_pfn_dirty(pte->pfn);
+   }
+   }
+
+   /* Doing a complete flush - start from scratch */
+   if (!ea_mask)
+   vcpu->arch.hpte_cache_offset = 0;
+}
+
+void kvmppc_mmu_pte_vflush(struct kvm_vcpu *vcpu, u64 guest_vp, u64 vp_mask)
+{
+   int i;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing %d Shadow vPTEs: 0x%llx & 0x%llx\n",
+   vcpu->arch.hpte_cache_offset, guest_vp, vp_mask);
+#endif
+   BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+   guest_vp &= vp_mask;
+   for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+   struct hpte_cache *pte;
+
+   pte = &vcpu->arch.hpte_cache[i];
+   if (!pte->host_va)
+   continue;
+
+   if ((pte->pte.vpage & vp_mask) == guest_vp) {
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing SPT %d: 0x%llx (0x%llx) -> 0x%llx\n", i, pte->pte.eaddr, pte->pte.vpage, pte->host_va);
+#endif
+   ppc_md.hpte_invalidate(pte->slot, pte->host_va,
+  MMU_PAGE_4K, MMU_SEGSIZE_256M,
+  false);
+   pte->host_va = 0;
+   kvm_release_pfn_dirty(pte->pfn);
+   }
+   }
+}
+
+void kvmppc_mmu_pte_pflush(struct kvm_vcpu *vcpu, u64 pa_start, u64 pa_end)
+{
+   int i;
+
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing %d Shadow pPTEs: 0x%llx - 0x%llx\n",
+   vcpu->arch.hpte_cache_offset, pa_start, pa_end);
+#endif
+   BUG_ON(vcpu->arch.hpte_cache_offset > HPTEG_CACHE_NUM);
+
+   for (i=0; i<vcpu->arch.hpte_cache_offset; i++) {
+   struct hpte_cache *pte;
+
+   pte = &vcpu->arch.hpte_cache[i];
+   if (!pte->host_va)
+   continue;
+
+   if ((pte->pte.raddr >= pa_start) && (pte->pte.raddr < pa_end)) {
+#ifdef DEBUG_MMU
+   printk(KERN_INFO "KVM: Flushing SPT %d: 

[PATCH 24/27] Include Book3s_64 target in buildsystem

2009-09-29 Thread Alexander Graf
Now we have everything in place to be able to build KVM, so let's add it
as config option and in the Makefile.

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/Kconfig  |   17 +
 arch/powerpc/kvm/Makefile |   27 +++
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/Kconfig b/arch/powerpc/kvm/Kconfig
index c299268..07703f7 100644
--- a/arch/powerpc/kvm/Kconfig
+++ b/arch/powerpc/kvm/Kconfig
@@ -21,6 +21,23 @@ config KVM
select PREEMPT_NOTIFIERS
select ANON_INODES
 
+config KVM_BOOK3S_64_HANDLER
+   bool
+
+config KVM_BOOK3S_64
+   tristate "KVM support for PowerPC book3s_64 processors"
+   depends on EXPERIMENTAL && PPC64
+   select KVM
+   select KVM_BOOK3S_64_HANDLER
+   ---help---
+ Support running unmodified book3s_64 and book3s_32 guest kernels
+ in virtual machines on book3s_64 host processors.
+
+ This module provides access to the hardware capabilities through
+ a character device node named /dev/kvm.
+
+ If unsure, say N.
+
 config KVM_440
bool "KVM support for PowerPC 440 processors"
depends on EXPERIMENTAL && 44x
diff --git a/arch/powerpc/kvm/Makefile b/arch/powerpc/kvm/Makefile
index 37655fe..56484d6 100644
--- a/arch/powerpc/kvm/Makefile
+++ b/arch/powerpc/kvm/Makefile
@@ -12,26 +12,45 @@ CFLAGS_44x_tlb.o  := -I.
 CFLAGS_e500_tlb.o := -I.
 CFLAGS_emulate.o  := -I.
 
-kvm-objs := $(common-objs-y) powerpc.o emulate.o
+common-objs-y += powerpc.o emulate.o
 obj-$(CONFIG_KVM_EXIT_TIMING) += timing.o
-obj-$(CONFIG_KVM) += kvm.o
+obj-$(CONFIG_KVM_BOOK3S_64_HANDLER) += book3s_64_exports.o
 
 AFLAGS_booke_interrupts.o := -I$(obj)
 
 kvm-440-objs := \
+   $(common-objs-y) \
booke.o \
booke_emulate.o \
booke_interrupts.o \
44x.o \
44x_tlb.o \
44x_emulate.o
-obj-$(CONFIG_KVM_440) += kvm-440.o
+kvm-objs-$(CONFIG_KVM_440) := $(kvm-440-objs)
 
 kvm-e500-objs := \
+   $(common-objs-y) \
booke.o \
booke_emulate.o \
booke_interrupts.o \
e500.o \
e500_tlb.o \
e500_emulate.o
-obj-$(CONFIG_KVM_E500) += kvm-e500.o
+kvm-objs-$(CONFIG_KVM_E500) := $(kvm-e500-objs)
+
+kvm-book3s_64-objs := \
+   $(common-objs-y) \
+   book3s.o \
+   book3s_64_emulate.o \
+   book3s_64_interrupts.o \
+   book3s_64_mmu_host.o \
+   book3s_64_mmu.o \
+   book3s_32_mmu.o
+kvm-objs-$(CONFIG_KVM_BOOK3S_64) := $(kvm-book3s_64-objs)
+
+kvm-objs := $(kvm-objs-m) $(kvm-objs-y)
+
+obj-$(CONFIG_KVM_440) += kvm.o
+obj-$(CONFIG_KVM_E500) += kvm.o
+obj-$(CONFIG_KVM_BOOK3S_64) += kvm.o
+
-- 
1.6.0.2



[PATCH 25/27] Fix trace.h

2009-09-29 Thread Alexander Graf
It looks like the variable pc is defined. At least the current code always
failed on me stating that pc is already defined somewhere else.

Let's use _pc instead, because that doesn't collide.

Is this the right approach? Does it break on 440 too? If not, why not?

Signed-off-by: Alexander Graf ag...@suse.de
---
 arch/powerpc/kvm/trace.h |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/trace.h b/arch/powerpc/kvm/trace.h
index 67f219d..a8e8400 100644
--- a/arch/powerpc/kvm/trace.h
+++ b/arch/powerpc/kvm/trace.h
@@ -12,8 +12,8 @@
  * Tracepoint for guest mode entry.
  */
 TRACE_EVENT(kvm_ppc_instr,
-   TP_PROTO(unsigned int inst, unsigned long pc, unsigned int emulate),
-   TP_ARGS(inst, pc, emulate),
+   TP_PROTO(unsigned int inst, unsigned long _pc, unsigned int emulate),
+   TP_ARGS(inst, _pc, emulate),
 
TP_STRUCT__entry(
__field(unsigned int,   inst)
@@ -23,7 +23,7 @@ TRACE_EVENT(kvm_ppc_instr,
 
TP_fast_assign(
__entry->inst   = inst;
-   __entry->pc = pc;
+   __entry->pc = _pc;
__entry->emulate= emulate;
),
 
-- 
1.6.0.2



[PATCH 27/27] Use Little Endian for Dirty Bitmap

2009-09-29 Thread Alexander Graf
We currently use host endian long types to store information
in the dirty bitmap.

This works reasonably well on Little Endian targets, because the
u32 after the first contains the next 32 bits. On Big Endian this
breaks completely though, forcing us to be inventive here.

So Ben suggested to always use Little Endian, which looks reasonable.

We only have dirty bitmap implemented in Little Endian targets so far
and since PowerPC would be the first Big Endian platform, we can just
as well switch to Little Endian always with little effort without
breaking existing targets.

Signed-off-by: Alexander Graf ag...@suse.de
---
 virt/kvm/kvm_main.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 91c0225..5aaa67c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -48,6 +48,7 @@
 #include <asm/io.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
+#include <asm-generic/bitops/le.h>
 
 #ifdef KVM_COALESCED_MMIO_PAGE_OFFSET
 #include "coalesced_mmio.h"
@@ -1062,8 +1063,8 @@ void mark_page_dirty(struct kvm *kvm, gfn_t gfn)
        unsigned long rel_gfn = gfn - memslot->base_gfn;
 
        /* avoid RMW */
-       if (!test_bit(rel_gfn, memslot->dirty_bitmap))
-           set_bit(rel_gfn, memslot->dirty_bitmap);
+       if (!generic_test_le_bit(rel_gfn, memslot->dirty_bitmap))
+           generic___set_le_bit(rel_gfn, memslot->dirty_bitmap);
    }
 }
 }
 
-- 
1.6.0.2
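
The layout argument in the commit message above can be sketched in user-space C. The helpers below (le_set_bit, le_test_bit, mark_dirty, test_bit_via_le32) are hypothetical names made up for illustration; they are not the kernel's generic_test_le_bit() or generic___set_le_bit(). The sketch addresses the bitmap byte by byte, so bit n always lands in byte n / 8 regardless of host endianness, which is the property a little-endian bitmap gives to a consumer that reads it in 32-bit chunks.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only. */

/* Set bit nr in little-endian bit order: byte nr / 8, position nr % 8. */
static void le_set_bit(unsigned long nr, void *bitmap)
{
    uint8_t *bytes = bitmap;
    bytes[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

/* Test bit nr using the same byte-wise addressing. */
static int le_test_bit(unsigned long nr, const void *bitmap)
{
    const uint8_t *bytes = bitmap;
    return (bytes[nr / 8] >> (nr % 8)) & 1;
}

/* Same shape as the "avoid RMW" check in mark_page_dirty():
 * only write to the bitmap when the bit is still clear. */
static void mark_dirty(unsigned long nr, void *bitmap)
{
    if (!le_test_bit(nr, bitmap))
        le_set_bit(nr, bitmap);
}

/* Read bit nr the way a consumer would that treats the bitmap as an
 * array of 32-bit little-endian words. */
static int test_bit_via_le32(unsigned long nr, const void *bitmap)
{
    const uint8_t *p = (const uint8_t *)bitmap + (nr / 32) * 4;
    uint32_t word = (uint32_t)p[0] | (uint32_t)p[1] << 8 |
                    (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
    return (word >> (nr % 32)) & 1;
}
```

With this addressing, marking page 33 dirty sets bit 1 of byte 4, and both the byte-wise and the 32-bit-word view agree on which page is dirty, on either kind of host.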



[PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Alexander Graf
With big endian userspace, we can't quite figure out if a pointer
is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.

This is what happens with dirty logging. To get the pointer interpreted
correctly, I just make it bounce twice, but admittedly that is not ideal.

I'm open for suggestions here.

Signed-off-by: Alexander Graf <ag...@suse.de>
---
 virt/kvm/kvm_main.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e27b7a9..91c0225 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -720,6 +720,11 @@ int kvm_get_dirty_log(struct kvm *kvm,
 
    r = -EFAULT;
    if (copy_to_user(log->dirty_bitmap, memslot->dirty_bitmap, n))
+#ifdef __BIG_ENDIAN
+   /* Did we get a 32 bit pointer? */
+   if (copy_to_user((void*)((u64)log->dirty_bitmap >> 32),
+            memslot->dirty_bitmap, n))
+#endif
        goto out;
 
    if (any)
-- 
1.6.0.2
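
Why the pointer shows up shifted by 32 bits can be reproduced without any kernel code. The sketch below is an illustration under stated assumptions: store_ptr32 and load_ptr64 are hypothetical helpers that model a 32-bit process writing a 4-byte pointer into the start of an 8-byte slot, and a 64-bit kernel of the same endianness reading the whole slot back.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model: a 32-bit process stores its pointer in the first
 * four bytes of an 8-byte slot, in the given byte order. */
static void store_ptr32(uint8_t slot[8], uint32_t ptr, int big_endian)
{
    memset(slot, 0, 8);
    for (int i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        slot[i] = (uint8_t)(ptr >> shift);
    }
}

/* Hypothetical model: a 64-bit reader of the same endianness loads all
 * eight bytes as one value. */
static uint64_t load_ptr64(const uint8_t slot[8], int big_endian)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        int shift = big_endian ? 8 * (7 - i) : 8 * i;
        v |= (uint64_t)slot[i] << shift;
    }
    return v;
}
```

In the little-endian case the 64-bit load returns the pointer value unchanged, because the four unused bytes are the high-order ones. In the big-endian case the same four bytes become the high-order half, so the reader sees the pointer shifted left by 32 bits and has to shift it back.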



Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Alexander Graf


On 29.09.2009, at 11:14, Avi Kivity wrote:

> On 09/29/2009 10:18 AM, Alexander Graf wrote:
>> With big endian userspace, we can't quite figure out if a pointer
>> is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.
>>
>> This is what happens with dirty logging. To get the pointer interpreted
>> correctly, I just make it bounce twice, but admittedly that is not ideal.
>>
>> I'm open for suggestions here.
>
> How about adding a new union member to struct kvm_dirty_log:
>
>     __u64 dirty_bitmap_virt;

And modifying userspace to write to that one?

Alex



Re: linux-next: tree build failure

2009-09-29 Thread Jan Beulich
>>> Hollis Blanchard 09/29/09 2:00 AM >>>
> First, I think there is a real bug here, and the code should read like
> this (to match the comment):
>     /* type has to be known at build time for optimization */
> -    BUILD_BUG_ON(__builtin_constant_p(type));
> +    BUILD_BUG_ON(!__builtin_constant_p(type));
>
> However, I get the same build error *both* ways, i.e.
> __builtin_constant_p(type) evaluates to both 0 and 1? Either that, or
> the new BUILD_BUG_ON() macro isn't working...

No, at this point of the compilation process it's neither zero nor one,
it's simply considered non-constant by the compiler at that stage
(this builtin is used for optimization, not during parsing, and the
error gets generated when the body of the function gets parsed,
not when code gets generated from it).

Jan
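
The distinction drawn above can be sketched in plain C. MY_BUILD_BUG_ON below is a hypothetical stand-in that fails the build through a negative bit-field width; it illustrates the general technique and is an assumption, not a copy of the BUILD_BUG_ON() macro in linux-next at the time. A bit-field width has to be an integer constant expression when the function body is parsed, so a condition built from literals passes, while one built from __builtin_constant_p() on a function parameter is not yet a constant at that stage, whichever way it is negated.

```c
/* Hypothetical compile-time check: the width is 0 when cond is false
 * and -1 (a compile error) when cond is true. */
#define MY_BUILD_BUG_ON(cond) ((void)sizeof(struct { int : -!!(cond); }))

/* The condition is an integer constant expression, so the check is
 * resolved while the function body is parsed. */
static int check_literal(void)
{
    MY_BUILD_BUG_ON(sizeof(int) > sizeof(long));
    return __builtin_constant_p(42);   /* a literal is always constant */
}

static int check_param(int type)
{
    /*
     * MY_BUILD_BUG_ON(!__builtin_constant_p(type));
     *
     * Left commented out: GCC is expected to reject this line, with or
     * without the negation, because the bit-field width is not an
     * integer constant expression at parse time.
     */
    return __builtin_constant_p(type); /* result depends on optimization */
}
```

The value returned by check_param() may differ between optimization levels and inlining decisions, which is why no fixed result is claimed for it here.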



Re: linux-next: tree build failure

2009-09-29 Thread roel kluin
On Tue, Sep 29, 2009 at 11:28 AM, Jan Beulich <jbeul...@novell.com> wrote:
>>>> Hollis Blanchard 09/29/09 2:00 AM >>>
>> First, I think there is a real bug here, and the code should read like
>> this (to match the comment):
>>     /* type has to be known at build time for optimization */
>> -    BUILD_BUG_ON(__builtin_constant_p(type));
>> +    BUILD_BUG_ON(!__builtin_constant_p(type));
>>
>> However, I get the same build error *both* ways, i.e.
>> __builtin_constant_p(type) evaluates to both 0 and 1? Either that, or
>> the new BUILD_BUG_ON() macro isn't working...
>
> No, at this point of the compilation process it's neither zero nor one,
> it's simply considered non-constant by the compiler at that stage
> (this builtin is used for optimization, not during parsing, and the
> error gets generated when the body of the function gets parsed,
> not when code gets generated from it).
>
> Jan

then maybe

    if (__builtin_constant_p(type))
        BUILD_BUG_ON(1);

would work?

Roel


Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Avi Kivity

On 09/29/2009 11:17 AM, Alexander Graf wrote:
> On 29.09.2009, at 11:14, Avi Kivity wrote:
>> On 09/29/2009 10:18 AM, Alexander Graf wrote:
>>> With big endian userspace, we can't quite figure out if a pointer
>>> is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.
>>>
>>> This is what happens with dirty logging. To get the pointer interpreted
>>> correctly, I just make it bounce twice, but admittedly that is not
>>> ideal.
>>>
>>> I'm open for suggestions here.
>>
>> How about adding a new union member to struct kvm_dirty_log:
>>
>>     __u64 dirty_bitmap_virt;
>
> And modifying userspace to write to that one?



Yes - old userspace will still build and work (we don't remove the old 
field) on little endian or BE32, new userspace will work on all 
flavours.  We need new userspace anyway to take advantage of dirty logging.


--
error compiling committee.c: too many arguments to function
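
The union member proposed in this exchange can be sketched as follows. This is an illustration, not the actual kernel header: struct dirty_log_sketch and the two accessor functions are hypothetical, and the slot and padding1 fields are assumed for the sake of a complete layout. Only dirty_bitmap and dirty_bitmap_virt come from the thread.

```c
#include <stdint.h>

/* Sketch of the layout: the old pointer member stays, and a fixed-width
 * 64-bit member shares its storage. */
struct dirty_log_sketch {
    uint32_t slot;       /* assumed field */
    uint32_t padding1;   /* assumed field */
    union {
        void *dirty_bitmap;          /* old field, width depends on the ABI */
        uint64_t dirty_bitmap_virt;  /* proposed field, always 64 bits */
    };
};

/* New userspace would write the address through the 64-bit member, so
 * the value has the same layout for 32-bit and 64-bit callers. */
static void set_bitmap_addr(struct dirty_log_sketch *dl, void *buf)
{
    dl->dirty_bitmap_virt = (uint64_t)(uintptr_t)buf;
}

/* The reader takes the full 64-bit value and converts it back. */
static void *get_bitmap_addr(const struct dirty_log_sketch *dl)
{
    return (void *)(uintptr_t)dl->dirty_bitmap_virt;
}
```

Going through uintptr_t zero-extends a 32-bit address into the 64-bit member, so the reader never has to guess which half of the slot holds the pointer.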


Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Alexander Graf


On 29.09.2009, at 06:25, Avi Kivity <a...@redhat.com> wrote:

> On 09/29/2009 11:17 AM, Alexander Graf wrote:
>> On 29.09.2009, at 11:14, Avi Kivity wrote:
>>> On 09/29/2009 10:18 AM, Alexander Graf wrote:
>>>> With big endian userspace, we can't quite figure out if a pointer
>>>> is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.
>>>>
>>>> This is what happens with dirty logging. To get the pointer interpreted
>>>> correctly, I just make it bounce twice, but admittedly that is not
>>>> ideal.
>>>>
>>>> I'm open for suggestions here.
>>>
>>> How about adding a new union member to struct kvm_dirty_log:
>>>
>>>     __u64 dirty_bitmap_virt;
>>
>> And modifying userspace to write to that one?
>
> Yes - old userspace will still build and work (we don't remove the
> old field) on little endian or BE32, new userspace will work on all
> flavours.  We need new userspace anyway to take advantage of dirty
> logging.


Uh, the dirty logging bits are in place already IIRC :)

But yes, sounds like the cleaner way to do it.

Alex






Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Alexander Graf


On 29.09.2009, at 15:25, Avi Kivity wrote:

> On 09/29/2009 11:17 AM, Alexander Graf wrote:
>> On 29.09.2009, at 11:14, Avi Kivity wrote:
>>> On 09/29/2009 10:18 AM, Alexander Graf wrote:
>>>> With big endian userspace, we can't quite figure out if a pointer
>>>> is 32 bit (shifted >> 32) or 64 bit when we read a 64 bit pointer.
>>>>
>>>> This is what happens with dirty logging. To get the pointer interpreted
>>>> correctly, I just make it bounce twice, but admittedly that is not
>>>> ideal.
>>>>
>>>> I'm open for suggestions here.
>>>
>>> How about adding a new union member to struct kvm_dirty_log:
>>>
>>>     __u64 dirty_bitmap_virt;
>>
>> And modifying userspace to write to that one?
>
> Yes - old userspace will still build and work (we don't remove the
> old field) on little endian or BE32, new userspace will work on all
> flavours.  We need new userspace anyway to take advantage of dirty
> logging.


How about this one? (broken whitespace!)

From c3864a2c5e1fccff7839e47f12c09d9739ca441e Mon Sep 17 00:00:00 2001
From: Alexander Graf <ag...@suse.de>
Date: Thu, 23 Jul 2009 21:05:57 +0200
Subject: [PATCH] Enable 32bit dirty log pointers on 64bit host

With big endian userspace, passing a pointer from 32-bit userspace to
64-bit kernel space breaks.

This is what happens with dirty logging. To get the pointer interpreted
correctly, we can just check the guest's 32bit flag and treat the pointer
as 32 bits then.

Signed-off-by: Alexander Graf <ag...@suse.de>

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e27b7a9..00f2c59 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -703,6 +703,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
    int r, i;
    int n;
    unsigned long any = 0;
+   void *target_bm;
 
    r = -EINVAL;
    if (log->slot >= KVM_MEMORY_SLOTS)
@@ -718,8 +719,15 @@ int kvm_get_dirty_log(struct kvm *kvm,
    for (i = 0; !any && i < n/sizeof(long); ++i)
        any = memslot->dirty_bitmap[i];
 
+#if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
+   /* Need to convert user pointers */
+   if (test_thread_flag(TIF_32BIT))
+       target_bm = (void*)((u64)log->dirty_bitmap >> 32);
+   else
+#endif
+   target_bm = log->dirty_bitmap;
    r = -EFAULT;
-   if (copy_to_user(log->dirty_bitmap, memslot->dirty_bitmap, n))
+   if (copy_to_user(target_bm, memslot->dirty_bitmap, n))
        goto out;
 
    if (any)


Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Avi Kivity

On 09/29/2009 06:29 PM, Alexander Graf wrote:
> How about this one? (broken whitespace!)
>
> From c3864a2c5e1fccff7839e47f12c09d9739ca441e Mon Sep 17 00:00:00 2001
> From: Alexander Graf <ag...@suse.de>
> Date: Thu, 23 Jul 2009 21:05:57 +0200
> Subject: [PATCH] Enable 32bit dirty log pointers on 64bit host
>
> With big endian userspace, passing a pointer from 32-bit userspace to
> 64-bit kernel space breaks.
>
> This is what happens with dirty logging. To get the pointer interpreted
> correctly, we can just check the guest's 32bit flag and treat the pointer
> as 32 bits then.
>
> Signed-off-by: Alexander Graf <ag...@suse.de>
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index e27b7a9..00f2c59 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -703,6 +703,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
>     int r, i;
>     int n;
>     unsigned long any = 0;
> +   void *target_bm;

void __user *target_bm;

>     r = -EINVAL;
>     if (log->slot >= KVM_MEMORY_SLOTS)
> @@ -718,8 +719,15 @@ int kvm_get_dirty_log(struct kvm *kvm,
>     for (i = 0; !any && i < n/sizeof(long); ++i)
>         any = memslot->dirty_bitmap[i];
>
> +#if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
> +   /* Need to convert user pointers */
> +   if (test_thread_flag(TIF_32BIT))
> +       target_bm = (void*)((u64)log->dirty_bitmap >> 32);
> +   else
> +#endif
> +   target_bm = log->dirty_bitmap;
>     r = -EFAULT;
> -   if (copy_to_user(log->dirty_bitmap, memslot->dirty_bitmap, n))
> +   if (copy_to_user(target_bm, memslot->dirty_bitmap, n))
>         goto out;
>
>     if (any)


Ah, that's much better.  Plus a mental note not to put pointers in 
user-visible structures in the future.  This can serve as a reminder :)


--
error compiling committee.c: too many arguments to function
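
The pointer selection in the patch quoted above can be modelled outside the kernel. The function below is a hypothetical sketch: pick_target and its two flag parameters are made-up names standing in for the compile-time check (__BIG_ENDIAN and CONFIG_64BIT) and the run-time test_thread_flag(TIF_32BIT) check, and the pointer is treated as a plain 64-bit integer.

```c
#include <stdint.h>

/* Hypothetical model of the selection logic.
 *   raw                   - the 64-bit value read from the union slot
 *   big_endian_64bit_host - stands in for the #if in the patch
 *   caller_is_32bit       - stands in for test_thread_flag(TIF_32BIT)
 */
static uint64_t pick_target(uint64_t raw, int big_endian_64bit_host,
                            int caller_is_32bit)
{
    /* A 32-bit caller on a big endian 64-bit host left its pointer in
     * the high-order half of the slot, so move it down. */
    if (big_endian_64bit_host && caller_is_32bit)
        return raw >> 32;

    /* Every other combination already has the pointer in place. */
    return raw;
}
```

Keying the shift on the calling thread rather than trying both interpretations means a native 64-bit caller never has its pointer altered, which is what the earlier "bounce twice" version could not guarantee.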



Re: [PATCH 26/27] Enable 32bit dirty log pointers on 64bit host

2009-09-29 Thread Alexander Graf


On 29.09.2009, at 18:42, Avi Kivity wrote:

> On 09/29/2009 06:29 PM, Alexander Graf wrote:
>> How about this one? (broken whitespace!)
>>
>> From c3864a2c5e1fccff7839e47f12c09d9739ca441e Mon Sep 17 00:00:00 2001
>> From: Alexander Graf <ag...@suse.de>
>> Date: Thu, 23 Jul 2009 21:05:57 +0200
>> Subject: [PATCH] Enable 32bit dirty log pointers on 64bit host
>>
>> With big endian userspace, passing a pointer from 32-bit userspace to
>> 64-bit kernel space breaks.
>>
>> This is what happens with dirty logging. To get the pointer interpreted
>> correctly, we can just check the guest's 32bit flag and treat the pointer
>> as 32 bits then.
>>
>> Signed-off-by: Alexander Graf <ag...@suse.de>
>>
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index e27b7a9..00f2c59 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -703,6 +703,7 @@ int kvm_get_dirty_log(struct kvm *kvm,
>>     int r, i;
>>     int n;
>>     unsigned long any = 0;
>> +   void *target_bm;
>
> void __user *target_bm;

k, done.

>>     r = -EINVAL;
>>     if (log->slot >= KVM_MEMORY_SLOTS)
>> @@ -718,8 +719,15 @@ int kvm_get_dirty_log(struct kvm *kvm,
>>     for (i = 0; !any && i < n/sizeof(long); ++i)
>>         any = memslot->dirty_bitmap[i];
>>
>> +#if defined(__BIG_ENDIAN) && defined(CONFIG_64BIT)
>> +   /* Need to convert user pointers */
>> +   if (test_thread_flag(TIF_32BIT))
>> +       target_bm = (void*)((u64)log->dirty_bitmap >> 32);
>> +   else
>> +#endif
>> +   target_bm = log->dirty_bitmap;
>>     r = -EFAULT;
>> -   if (copy_to_user(log->dirty_bitmap, memslot->dirty_bitmap, n))
>> +   if (copy_to_user(target_bm, memslot->dirty_bitmap, n))
>>         goto out;
>>
>>     if (any)
>
> Ah, that's much better.  Plus a mental note not to put pointers in
> user-visible structures in the future.  This can serve as a
> reminder :)


Heh, yeah :). At times like this I see the benefits of little endian...

The new code is in git://csgraf.de/kvm branch ppc-v5.
(Don't expect to pull updates from that branch until I release it - I  
will push -f in there, breaking git pull.)


Alex