Re: [PATCH] qemu: Fix inject-nmi

2011-10-10 Thread Lai Jiangshan
On 09/26/2011 04:21 PM, Avi Kivity wrote:
 On 09/25/2011 08:22 PM, Jan Kiszka wrote:
 On 2011-09-25 16:07, Avi Kivity wrote:
   On 09/23/2011 12:31 PM, Lai Jiangshan wrote:
  Moreover: wrong indentation.
   
  You know that this won't work for qemu-kvm with in-kernel irqchip? 
  You
  may want to provide a patch for that tree, emulating the unavailable
  LINT1 injection via testing the APIC configuration and then raising an
  NMI as before if it is accepted.
   
 
   It works in my box but the NMI is not injected through the in-kernel
   irqchip,
   I will implement it as you suggested.
 
   Somewhat hacky; isn't it better to test LINT1 in the kernel (and
   redefine the KVM_NMI ioctl as toggle LINT1)?

 KVM_NMI is required for user space IRQ chip as well.
 
 We could define KVM_NMI as edging the core NMI input if !irqchip_in_kernel, 
 and toggling LINT1 otherwise.  Hardly nice though.
 
 The current KVM_NMI with irqchip_in_kernel is not meaningful, since it 
 doesn't obey the rules of any NMI source.
 
 Introducing some KVM_SET_LINT1 is an option though. But emulating it for
 the NMI button on older kernels sounds worthwhile nevertheless.

 
 Perhaps this is the best option to avoid confusion.
 

(add cc: seab...@seabios.org)

Hi, All,

While implementing KVM_SET_LINT1, I found that many places in the
qemu-kvm code needed to change, and the result became nasty.

And, as Avi said, KVM_NMI with irqchip_in_kernel is not meaningful,
so KVM_NMI would no longer be used when KVM_SET_LINT1 && irqchip_in_kernel;
it would be dead.

So we now redefine KVM_NMI with a more appropriate meaning: when
irqchip_in_kernel, it is the kernel's (KVM's) responsibility to emulate the
NMI injection and trigger LINT1. When !irqchip_in_kernel, it is
userspace's responsibility.

This results in a more realistic emulation and simpler code,
it does not need a new ioctl interface,
and it makes use of the existing KVM_NMI.
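
To make the proposed split concrete, here is a small user-space model (not kernel code). The names toy_vcpu, toy_lint1_deliver and toy_kvm_nmi are made up for illustration; the LVT bit positions follow the local APIC register layout (mask at bit 16, delivery mode in bits 10:8, NMI = 100b). The sketch assumes a LINT1 edge only reaches the core when the entry is unmasked and programmed for NMI delivery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LVT_MASKED    (1u << 16)  /* LVT mask bit */
#define LVT_MODE_MASK (7u << 8)   /* delivery mode field, bits 10:8 */
#define LVT_MODE_NMI  (4u << 8)   /* delivery mode 100b: NMI */

/* Hypothetical stand-in for a vCPU; not a KVM structure. */
struct toy_vcpu {
    uint32_t lvt_lint1;    /* guest-programmed LVT LINT1 entry */
    unsigned nmi_pending;  /* NMIs that actually reached the core */
};

/* LINT1 edge: dropped if masked; in NMI mode it raises an NMI. */
static void toy_lint1_deliver(struct toy_vcpu *v)
{
    assert(v != NULL);
    if (v->lvt_lint1 & LVT_MASKED)
        return;
    if ((v->lvt_lint1 & LVT_MODE_MASK) == LVT_MODE_NMI)
        v->nmi_pending++;
}

/* Model of the redefined KVM_NMI: the irqchip owner decides the path. */
static void toy_kvm_nmi(struct toy_vcpu *v, bool irqchip_in_kernel)
{
    assert(v != NULL);
    if (irqchip_in_kernel)
        toy_lint1_deliver(v);   /* kernel emulates LINT1 */
    else
        v->nmi_pending++;       /* userspace already filtered via LINT1 */
}
```

With the in-kernel path, a CPU whose LINT1 is masked no longer sees the NMI, which is the behaviour kdump relies on.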

Thanks,
Lai
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] kernel/kvm: fix improper nmi emulation (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)

2011-10-10 Thread Lai Jiangshan
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is maskied in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, KVM_NMI ioctl is handled as follows.

- When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
  request of triggering LINT1 on the processor. LINT1 is emulated in
  in-kernel irqchip.

- When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
  request of injecting NMI to the processor. This assumes LINT1 is
  already emulated in userland.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Tested-by: Lai Jiangshan la...@cn.fujitsu.com
---
 arch/x86/kvm/irq.h   |    1 +
 arch/x86/kvm/lapic.c |    8 ++++++++
 arch/x86/kvm/x86.c   |   14 ++++----------
 3 files changed, 13 insertions(+), 10 deletions(-)

Index: linux/arch/x86/kvm/irq.h
===
--- linux.orig/arch/x86/kvm/irq.h
+++ linux/arch/x86/kvm/irq.h
@@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
Index: linux/arch/x86/kvm/lapic.c
===
--- linux.orig/arch/x86/kvm/lapic.c
+++ linux/arch/x86/kvm/lapic.c
@@ -1039,6 +1039,14 @@ void kvm_apic_nmi_wd_deliver(struct kvm_
 	kvm_apic_local_deliver(apic, APIC_LVT0);
 }
 
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	if (apic)
+		kvm_apic_local_deliver(apic, APIC_LVT1);
+}
+
 static struct kvm_timer_ops lapic_timer_ops = {
.is_periodic = lapic_is_periodic,
 };
Index: linux/arch/x86/kvm/x86.c
===
--- linux.orig/arch/x86/kvm/x86.c
+++ linux/arch/x86/kvm/x86.c
@@ -2729,13 +2729,6 @@ static int kvm_vcpu_ioctl_interrupt(stru
 	return 0;
 }
 
-static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
-{
-	kvm_inject_nmi(vcpu);
-
-	return 0;
-}
-
 static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
   struct kvm_tpr_access_ctl *tac)
 {
@@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 		break;
 	}
 	case KVM_NMI: {
-		r = kvm_vcpu_ioctl_nmi(vcpu);
-		if (r)
-			goto out;
+		if (irqchip_in_kernel(vcpu->kvm))
+			kvm_apic_lint1_deliver(vcpu);
+		else
+			kvm_inject_nmi(vcpu);
 		r = 0;
 		break;
 	}


[PATCH] qemu-kvm: fix improper nmi emulation (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)

2011-10-10 Thread Lai Jiangshan
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is maskied in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, inject-nmi request is handled as follows.

- When in-kernel irqchip is disabled, inject LINT1 instead of NMI
  interrupt.
- When in-kernel irqchip is enabled, send nmi event to kernel as the
  current code does. LINT1 should be emulated in kernel.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Tested-by: Lai Jiangshan la...@cn.fujitsu.com
---
 hw/apic.c |   16 ++++++++++++++++
 hw/apic.h |    1 +
 monitor.c |    5 ++---
 3 files changed, 19 insertions(+), 3 deletions(-)

Index: qemu-kvm/hw/apic.c
===
--- qemu-kvm.orig/hw/apic.c
+++ qemu-kvm/hw/apic.c
@@ -205,6 +205,22 @@ void apic_deliver_pic_intr(DeviceState *
 }
 }
 
+void apic_deliver_nmi(CPUState *env)
+{
+    APICState *apic;
+
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+	return;
+    }
+
+    apic = DO_UPCAST(APICState, busdev.qdev, env->apic_state);
+    if (!apic)
+        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+    else
+        apic_local_deliver(apic, APIC_LVT_LINT1);
+}
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
 int __i, __j, __mask;\
Index: qemu-kvm/hw/apic.h
===
--- qemu-kvm.orig/hw/apic.h
+++ qemu-kvm/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint
  uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(CPUState *env);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
Index: qemu-kvm/monitor.c
===
--- qemu-kvm.orig/monitor.c
+++ qemu-kvm/monitor.c
@@ -2615,9 +2615,8 @@ static int do_inject_nmi(Monitor *mon, c
 {
 CPUState *env;
 
-    for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
-    }
+    for (env = first_cpu; env != NULL; env = env->next_cpu)
+	apic_deliver_nmi(env);
 
 return 0;
 }


[PATCH 1/2] seabios: Add Local APIC NMI Structure to ACPI MADT (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)

2011-10-10 Thread Lai Jiangshan
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

ACPI NMI Structure describes LINT pin (LINT0 or LINT1) information to
which NMI is connected, and it is needed by OS to initialize local APIC.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Reviewed-by: Lai Jiangshan la...@cn.fujitsu.com
---
 src/acpi.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

Index: seabios/src/acpi.c
===
--- seabios.orig/src/acpi.c
+++ seabios/src/acpi.c
@@ -134,6 +134,14 @@ struct madt_intsrcovr {
 u16 flags;
 } PACKED;
 
+struct madt_local_nmi {
+    ACPI_SUB_HEADER_DEF
+    u8  processor_id;           /* ACPI processor id */
+    u16 flags;                  /* MPS INTI flags */
+    u8  lint;                   /* Local APIC LINT# */
+} PACKED;
+
+
 /*
  * ACPI 2.0 Generic Address Space definition.
  */
@@ -288,7 +296,9 @@ build_madt(void)
 int madt_size = (sizeof(struct multiple_apic_table)
  + sizeof(struct madt_processor_apic) * MaxCountCPUs
  + sizeof(struct madt_io_apic)
- + sizeof(struct madt_intsrcovr) * 16);
+ + sizeof(struct madt_intsrcovr) * 16
+ + sizeof(struct madt_local_nmi));
+
 struct multiple_apic_table *madt = malloc_high(madt_size);
 if (!madt) {
 warn_noalloc();
@@ -340,7 +350,15 @@ build_madt(void)
 intsrcovr++;
 }
 
-    build_header((void*)madt, APIC_SIGNATURE, (void*)intsrcovr - (void*)madt, 1);
+    struct madt_local_nmi *local_nmi = (void*)intsrcovr;
+    local_nmi->type         = APIC_LOCAL_NMI;
+    local_nmi->length       = sizeof(*local_nmi);
+    local_nmi->processor_id = 0xff; /* all processors */
+    local_nmi->flags        = 0;
+    local_nmi->lint         = 1; /* LINT1 */
+    local_nmi++;
+
+    build_header((void*)madt, APIC_SIGNATURE, (void*)local_nmi - (void*)madt, 1);
 return madt;
 }
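
For reference, the entry appended above is six bytes long in the MADT. The following standalone sketch shows that layout outside of SeaBIOS; local_nmi_entry and fill_local_nmi are hypothetical names, the type value 4 is the ACPI "Local APIC NMI" entry type, and __attribute__((packed)) assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MADT_TYPE_LOCAL_NMI 4   /* ACPI MADT entry type: Local APIC NMI */

/* Same field order as madt_local_nmi in the patch, header expanded. */
struct local_nmi_entry {
    uint8_t  type;          /* entry type */
    uint8_t  length;        /* entry length in bytes */
    uint8_t  processor_id;  /* ACPI processor id, 0xff = all processors */
    uint16_t flags;         /* MPS INTI flags, 0 = conforms to bus */
    uint8_t  lint;          /* local APIC LINT# the NMI is wired to */
} __attribute__((packed));

/* Write one "NMI on LINT1 of every CPU" entry; returns bytes written. */
static size_t fill_local_nmi(void *buf, size_t avail)
{
    struct local_nmi_entry e;

    assert(buf != NULL);
    if (avail < sizeof(e))
        return 0;
    e.type = MADT_TYPE_LOCAL_NMI;
    e.length = sizeof(e);
    e.processor_id = 0xff;
    e.flags = 0;
    e.lint = 1;
    memcpy(buf, &e, sizeof(e));
    return sizeof(e);
}
```

The returned size is what the table length has to grow by, which is why the patch adds sizeof(struct madt_local_nmi) to madt_size.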
 


[PATCH 2/2] seabios: fix mptable nmi entry (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)

2011-10-10 Thread Lai Jiangshan
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

In the current seabios MP table description, NMI is connected only to
BSP's LINT1. But usually NMI is connected to all the CPUs' LINT1 as
indicated in MP specification. This patch changes seabios MP table to
describe NMI is connected to all the CPUs' LINT1.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Reviewed-by: Lai Jiangshan la...@cn.fujitsu.com
---
 src/mptable.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: seabios/src/mptable.c
===
--- seabios.orig/src/mptable.c
+++ seabios/src/mptable.c
@@ -169,7 +169,7 @@ mptable_init(void)
     intsrc->irqflag = 0; /* PO, EL default */
     intsrc->srcbus = isabusid; /* ISA */
     intsrc->srcbusirq = 0;
-    intsrc->dstapic = 0; /* BSP == APIC #0 */
+    intsrc->dstapic = 0xff; /* to all local APICs */
     intsrc->dstirq = 1; /* LINTIN1 */
     intsrc++;
 entrycount += intsrc - intsrcs;


Re: [PATCH] kernel/kvm: fix improper nmi emulation

2011-10-10 Thread Jan Kiszka
On 2011-10-10 08:06, Lai Jiangshan wrote:
 From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
 
 Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
 button event happens. This doesn't properly emulate real hardware on
 which NMI button event triggers LINT1. Because of this, NMI is sent to
 the processor even when LINT1 is maskied in LVT. For example, this
 causes the problem that kdump initiated by NMI sometimes doesn't work
 on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
 
 With this patch, KVM_NMI ioctl is handled as follows.
 
 - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
   request of triggering LINT1 on the processor. LINT1 is emulated in
   in-kernel irqchip.
 
 - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
   request of injecting NMI to the processor. This assumes LINT1 is
   already emulated in userland.
 
 Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
 Tested-by: Lai Jiangshan la...@cn.fujitsu.com
 ---
  arch/x86/kvm/irq.h   |    1 +
  arch/x86/kvm/lapic.c |    8 ++++++++
  arch/x86/kvm/x86.c   |   14 ++++----------
  3 files changed, 13 insertions(+), 10 deletions(-)
 
 Index: linux/arch/x86/kvm/irq.h
 ===
 --- linux.orig/arch/x86/kvm/irq.h
 +++ linux/arch/x86/kvm/irq.h
 @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state
  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
  void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
 +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
  void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
 Index: linux/arch/x86/kvm/lapic.c
 ===
 --- linux.orig/arch/x86/kvm/lapic.c
 +++ linux/arch/x86/kvm/lapic.c
 @@ -1039,6 +1039,14 @@ void kvm_apic_nmi_wd_deliver(struct kvm_
   kvm_apic_local_deliver(apic, APIC_LVT0);
  }
  
 +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
 +{
 + struct kvm_lapic *apic = vcpu->arch.apic;
 +
 + if (apic)

WARN_ON(!apic)? Looks like that case would be a kernel bug.

 + kvm_apic_local_deliver(apic, APIC_LVT1);
 +}
 +
  static struct kvm_timer_ops lapic_timer_ops = {
   .is_periodic = lapic_is_periodic,
  };
 Index: linux/arch/x86/kvm/x86.c
 ===
 --- linux.orig/arch/x86/kvm/x86.c
 +++ linux/arch/x86/kvm/x86.c
 @@ -2729,13 +2729,6 @@ static int kvm_vcpu_ioctl_interrupt(stru
   return 0;
  }
  
 -static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
 -{
 - kvm_inject_nmi(vcpu);
 -
 - return 0;
 -}
 -
  static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
  struct kvm_tpr_access_ctl *tac)
  {
 @@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
   break;
   }
   case KVM_NMI: {
 - r = kvm_vcpu_ioctl_nmi(vcpu);
 - if (r)
 - goto out;
 + if (irqchip_in_kernel(vcpu->kvm))
 + kvm_apic_lint1_deliver(vcpu);
 + else
 + kvm_inject_nmi(vcpu);
   r = 0;
   break;
   }

Looks OK otherwise.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [RFC PATCH 5/7] [hyper-v] hyper-v helper functions

2011-10-10 Thread Vadim Rozenfeld
On Sun, 2011-10-09 at 21:01 +0200, Alon Levy wrote:
 On Sun, Oct 09, 2011 at 08:52:53PM +0200, Vadim Rozenfeld wrote:
  ---
   hyperv.c |   44 
   hyperv.h |7 +++
   2 files changed, 51 insertions(+), 0 deletions(-)
  
  diff --git a/hyperv.c b/hyperv.c
  index a17f879..57915b9 100644
  --- a/hyperv.c
  +++ b/hyperv.c
  @@ -3,6 +3,10 @@
   #include "qemu-option.h"
   #include "qemu-config.h"
   
  +static int hyperv_apic;
  +static int hyperv_wd;
  +static int hyperv_spinlock_attempts = HYPERV_SPINLOCK_NEVER_RETRY;
  +
   void hyperv_init(void)
   {
   QemuOpts *opts = QTAILQ_FIRST(qemu_hyperv_opts.head);
  @@ -10,6 +14,46 @@ void hyperv_init(void)
   if (!opts) {
   return;
   }
  +
  +hyperv_spinlock_attempts = qemu_opt_get_number(opts, "spinlock",
  +   HYPERV_SPINLOCK_NEVER_RETRY
  +  );
  +hyperv_wd = qemu_opt_get_bool(opts, "wd", 0);
  +hyperv_apic = qemu_opt_get_bool(opts, "vapic", 0);
  +
  +}
  +
  +int hyperv_enabled(void)
  +{
  +return hyperv_hypercall_available() | hyperv_relaxed_timing();
 Shouldn't this be a logical or?
Sure, thanks.
 
  +}
  +
  +int hyperv_hypercall_available(void)
  +{
  +if (hyperv_apic ||
  +(hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY)) {
  +  return 1;
  +}
  +return 0;
  +}
  +
  +int hyperv_relaxed_timing(void)
  +{
  +return !hyperv_wd;
  +}
  +
  +int hyperv_apic_recommended(void)
  +{
  +#ifdef KVM_CAP_IRQCHIP
  +return hyperv_apic;
  +#else
  +return 0;
  +#endif
  +}
  +
  +int hyperv_spinlock_retries(void)
  +{
  +return hyperv_spinlock_attempts;
   }
   
   static void hyperv_initialize(void)
  diff --git a/hyperv.h b/hyperv.h
  index eaf974a..27d2e6e 100644
  --- a/hyperv.h
  +++ b/hyperv.h
  @@ -6,7 +6,14 @@
   
   #include <asm/hyperv.h>
   
  +#define HYPERV_SPINLOCK_NEVER_RETRY 0x
  +
   void hyperv_init(void);
  +int hyperv_enabled(void);
  +int hyperv_hypercall_available(void);
  +int hyperv_relaxed_timing(void);
  +int hyperv_apic_recommended(void);
  +int hyperv_spinlock_retries(void);
   
   #endif /* QEMU_HW_HYPERV_H */
   
  -- 
  1.7.4.4
  


Re: [PATCH] qemu-kvm: fix improper nmi emulation

2011-10-10 Thread Jan Kiszka
On 2011-10-10 08:06, Lai Jiangshan wrote:
 From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
 
 Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
 button event happens. This doesn't properly emulate real hardware on
 which NMI button event triggers LINT1. Because of this, NMI is sent to
 the processor even when LINT1 is maskied in LVT. For example, this
 causes the problem that kdump initiated by NMI sometimes doesn't work
 on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
 
 With this patch, inject-nmi request is handled as follows.
 
 - When in-kernel irqchip is disabled, inject LINT1 instead of NMI
   interrupt.
 - When in-kernel irqchip is enabled, send nmi event to kernel as the
   current code does. LINT1 should be emulated in kernel.
 
 Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
 Tested-by: Lai Jiangshan la...@cn.fujitsu.com

This is targeting uq/master?

Please make sure your patch passes checkpatch.pl

 ---
  hw/apic.c |   16 ++++++++++++++++
  hw/apic.h |    1 +
  monitor.c |    5 ++---
  3 files changed, 19 insertions(+), 3 deletions(-)
 
 Index: qemu-kvm/hw/apic.c
 ===
 --- qemu-kvm.orig/hw/apic.c
 +++ qemu-kvm/hw/apic.c
 @@ -205,6 +205,22 @@ void apic_deliver_pic_intr(DeviceState *
  }
  }
  
 +void apic_deliver_nmi(CPUState *env)
 +{
 +APICState *apic;
 +
 +if (kvm_enabled() && kvm_irqchip_in_kernel()) {
 +cpu_interrupt(env, CPU_INTERRUPT_NMI);
 + return;
 +}
 +
 +apic = DO_UPCAST(APICState, busdev.qdev, env->apic_state);
 +if (!apic)
 +cpu_interrupt(env, CPU_INTERRUPT_NMI);

Testing for !apic and handling the non-APIC case here looks a bit
strange. Let's move the !env->apic_state test to the caller to make it
consistent with other APIC services.

The KVM case should be a separate qemu-kvm patch on top for now. (We may
implement calls into APIC models differently when pushing in-kernel
irqchip support upstream.)

Jan





Re: [RFC PATCH 0/7] Initial support for Microsoft Hyper-V

2011-10-10 Thread Jan Kiszka
On 2011-10-09 20:52, Vadim Rozenfeld wrote:
 Enable some basic Hyper-V enlightenment functionalities,
 including relaxed timing, spinlock, and virtual APIC. 

This targets uq/master, correct? Then you should CC qemu-devel on the
next round.

I think this series could also be distributed over 3 or 4 patches
without losing bisectability. And please spend a bit of time on commit logs.

Jan





Re: [RFC PATCH 1/7] [hyper-v] Add hyper-v parameters block.

2011-10-10 Thread Jan Kiszka
On 2011-10-09 20:52, Vadim Rozenfeld wrote:
 ---
  qemu-options.hx |   23 +++
  vl.c|2 ++
  2 files changed, 25 insertions(+), 0 deletions(-)
 
 diff --git a/qemu-options.hx b/qemu-options.hx
 index 3a13533..9f60059 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -2483,6 +2483,29 @@ DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
      "allocate MEGABYTES for kvm mmu shadowing\n",
      QEMU_ARCH_I386)
  
 +DEF("hyperv", HAS_ARG, QEMU_OPTION_hyperv,
 +    "-hyperv [vapic=on|off][,spinlock=retries][,wd=on|off]\n"
 +    "enable Hyper-V Enlightenment\n",
 +    QEMU_ARCH_ALL)

These are CPU features, so -cpu +/-hv_vapic,+/-hv_spinlock etc. looks
more appropriate than a new command line parameter.

BTW, documentation and maybe also option processing should make clear
that this is limited to KVM mode for now.

Jan





Re: [RFC PATCH 3/7] [hyper-v] make Hyper-V option configurable.

2011-10-10 Thread Jan Kiszka
On 2011-10-09 20:52, Vadim Rozenfeld wrote:
 ---
  Makefile.target |1 +
  configure   |   11 +++
  2 files changed, 12 insertions(+), 0 deletions(-)
 
 diff --git a/Makefile.target b/Makefile.target
 index f84d8cb..3581480 100644
 --- a/Makefile.target
 +++ b/Makefile.target
 @@ -199,6 +199,7 @@ obj-$(CONFIG_VHOST_NET) += vhost.o
  obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/virtio-9p-device.o
  obj-$(CONFIG_KVM) += kvm.o kvm-all.o
  obj-$(CONFIG_NO_KVM) += kvm-stub.o
 +obj-$(CONFIG_HYPERV) += hyperv.o
  obj-y += memory.o
  LIBS+=-lz
  
 diff --git a/configure b/configure
 index 94c7d31..f5ecfd7 100755
 --- a/configure
 +++ b/configure
 @@ -150,6 +150,7 @@ debug=no
  strip_opt=yes
  bigendian=no
  mingw32=no
 +hyperv=no
  EXESUF=
  prefix=/usr/local
  mandir=\${prefix}/share/man
 @@ -762,6 +763,10 @@ for opt do
;;
--enable-vhost-net) vhost_net=yes
;;
 +  --disable-hyperv) hyperv=no
 +  ;;
 +  --enable-hyperv) hyperv=yes
 +  ;;
--disable-opengl) opengl=no
;;
--enable-opengl) opengl=yes
 @@ -1062,6 +1067,8 @@ echo   --enable-docsenable documentation 
 build
  echo   --disable-docs   disable documentation build
  echo   --disable-vhost-net  disable vhost-net acceleration support
  echo   --enable-vhost-net   enable vhost-net acceleration support
 +echo   --enable-hyperv  enable Hyper-V support
 +echo   --disable-hyperv disable Hyper-V support
  echo   --enable-trace-backend=B Set trace backend
  echoAvailable backends: 
 $($source_path/scripts/tracetool --list-backends)
  echo   --with-trace-file=NAME   Full PATH,NAME of file to store traces
 @@ -2737,6 +2744,7 @@ echo madvise   $madvise
  echo posix_madvise $posix_madvise
  echo uuid support  $uuid
  echo vhost-net support $vhost_net
 +echo Hyper-V support   $hyperv
  echo Trace backend $trace_backend
  echo Trace output file $trace_file-pid
  echo spice support $spice
 @@ -3424,6 +3432,9 @@ case $target_arch2 in
if test $kvm_cap_device_assignment = yes ; then
  echo CONFIG_KVM_DEVICE_ASSIGNMENT=y  $config_target_mak
fi
 +  if test $hyperv = yes ; then
 +echo CONFIG_HYPERV=y  $config_target_mak
 +  fi
  fi
  esac
  if test $target_bigendian = yes ; then

Why would I want to --disable-hyperv? It rather looks like we could
live perfectly well with this feature built by default. That would also
allow dropping the nasty #ifdefs from the code.

Jan





Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Ohad Ben-Cohen
On Sun, Oct 2, 2011 at 5:58 PM, Ohad Ben-Cohen o...@wizery.com wrote:
 Ok, fair enough. I've revised the patches and attached the main one
 below; please tell me if it looks ok, and then I'll resubmit the
 entire patch set.

Ping ?


Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks

2011-10-10 Thread Ingo Molnar

* Jeremy Fitzhardinge jer...@goop.org wrote:

 On 10/06/2011 10:40 AM, Jeremy Fitzhardinge wrote:
  However, it looks like locked xadd also has better performance:  on
  my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower
  than locked xadd, so that pretty much settles it unless you think
  there'd be a dramatic difference on an AMD system.
 
 Konrad measures add+mfence is about 65% slower on AMD Phenom as well.

xadd also results in smaller/tighter code, right?
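
For readers not following the ticketlock series: a ticket lock hands out tickets with an atomic fetch-and-add, which compilers for x86 typically emit as a locked xadd when the old value is used. Below is a minimal sketch with C11 atomics. It is not the kernel's implementation, and the names ticketlock, ticket_lock and ticket_unlock are assumptions for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical ticket lock: 'next' is the dispenser, 'owner' the display. */
struct ticketlock {
    atomic_uint next;   /* next ticket to hand out */
    atomic_uint owner;  /* ticket currently being served */
};

/* Take a ticket with one fetch-and-add, then spin until it is served. */
static unsigned ticket_lock(struct ticketlock *l)
{
    unsigned me;

    assert(l != NULL);
    me = atomic_fetch_add_explicit(&l->next, 1u, memory_order_relaxed);
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;  /* busy-wait; a real lock would pause or yield here */
    return me;
}

/* Serve the next ticket; the release ordering publishes the critical section. */
static void ticket_unlock(struct ticketlock *l)
{
    assert(l != NULL);
    atomic_fetch_add_explicit(&l->owner, 1u, memory_order_release);
}
```

The single read-modify-write per operation is what makes the xadd form compact compared with a separate add followed by a fence.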

Thanks,

Ingo


Re: kernel BUG at include/linux/kvm_host.h:603!

2011-10-10 Thread Alexander Graf
Hi Jörg,

On 07.10.2011, at 23:10, Jörg Sommer wrote:

 Hi,
 
 I've got this backtrace:
 
 [130902.709711] [ cut here ]
 [130902.709747] kernel BUG at include/linux/kvm_host.h:603!

Ouch. This means that preemption is broken in KVM for PPC. To quickly get 
things working on your side, please recompile your kernel with 
CONFIG_PREEMPT_NONE. I'll take a look at fixing it for real ASAP.


Alex



Re: [RFC PATCH 0/7] Initial support for Microsoft Hyper-V

2011-10-10 Thread Vadim Rozenfeld
On Mon, 2011-10-10 at 08:53 +0200, Jan Kiszka wrote:
 On 2011-10-09 20:52, Vadim Rozenfeld wrote:
  Enable some basic Hyper-V enlightenment functionalities,
  including relaxed timing, spinlock, and virtual APIC. 
 
 This targets uq/master, correct? Then you should CC qemu-devel on the
 next round.
 
 I think this series could also be distributed over 3 or 4 patches
 without losing bisectability. And please spend a bit of time on commit logs.
 
OK.
 Jan
 




Re: [RFC PATCH 1/7] [hyper-v] Add hyper-v parameters block.

2011-10-10 Thread Vadim Rozenfeld
On Mon, 2011-10-10 at 08:54 +0200, Jan Kiszka wrote:
 On 2011-10-09 20:52, Vadim Rozenfeld wrote:
  ---
   qemu-options.hx |   23 +++
   vl.c|2 ++
   2 files changed, 25 insertions(+), 0 deletions(-)
  
  diff --git a/qemu-options.hx b/qemu-options.hx
  index 3a13533..9f60059 100644
  --- a/qemu-options.hx
  +++ b/qemu-options.hx
  @@ -2483,6 +2483,29 @@ DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
       "allocate MEGABYTES for kvm mmu shadowing\n",
       QEMU_ARCH_I386)
   
  +DEF("hyperv", HAS_ARG, QEMU_OPTION_hyperv,
  +    "-hyperv [vapic=on|off][,spinlock=retries][,wd=on|off]\n"
  +    "enable Hyper-V Enlightenment\n",
  +    QEMU_ARCH_ALL)
 
 These are CPU features, so -cpu +/-hv_vapic,+/-hv_spinlock etc. looks
 more appropriate than a new command line parameter.
 
I would like to keep the Hyper-V settings apart from CPU features for a
very simple reason: if Hyper-V VMBus support is added one day, it won't
be a CPU-only feature anymore.
 BTW, documentation and maybe also option processing should make clear
 that this is limited to KVM mode for now.
 
Will add it.
Vadim
 Jan
 




Re: [PATCH] qemu-kvm: Deprecate drive parameter boot=on|off

2011-10-10 Thread Avi Kivity

On 10/08/2011 09:46 AM, Jan Kiszka wrote:

We do not want to maintain this option forever. It will be removed after
a grace period of a few releases. So warn the user that this option has
no effect and will become invalid soon.



Thanks, applied.

--
error compiling committee.c: too many arguments to function



Re: [PATCH] qemu-kvm: fix improper nmi emulation

2011-10-10 Thread Andreas Färber

Am 10.10.2011 08:49, schrieb Jan Kiszka:

On 2011-10-10 08:06, Lai Jiangshan wrote:

From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is maskied in LVT. For example, this

[...]

This is targeting uq/master?

Please make sure your patch passes checkpatch.pl


While at it: masked?

Andreas


Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Roedel, Joerg
Hi Ohad,

sorry, I was on vacation last week and had no time to look into this.

On Sun, Oct 02, 2011 at 11:58:12AM -0400, Ohad Ben-Cohen wrote:
  drivers/iommu/iommu.c  |  138 ---
  drivers/iommu/omap-iovmm.c |   12 +---
  include/linux/iommu.h  |6 +-
  virt/kvm/iommu.c   |4 +-
  4 files changed, 137 insertions(+), 23 deletions(-)
 
 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index a7b0862..f23563f 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -16,6 +16,8 @@
   * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
   */
 
 +#define pr_fmt(fmt)    "%s: " fmt, __func__
 +
  #include <linux/kernel.h>
  #include <linux/bug.h>
  #include <linux/types.h>
 @@ -23,15 +25,54 @@
  #include <linux/slab.h>
  #include <linux/errno.h>
  #include <linux/iommu.h>
 +#include <linux/bitmap.h>

Is this still required?

 
  static struct iommu_ops *iommu_ops;
 
 +/* bitmap of supported page sizes */
 +static unsigned long iommu_pgsize_bitmap;
 +
 +/* size of the smallest supported page (in bytes) */
 +static unsigned int iommu_min_pagesz;
 +
 +/**
 + * register_iommu() - register an IOMMU hardware
 + * @ops: iommu handlers
 + * @pgsize_bitmap: bitmap of page sizes supported by the hardware
 + *
 + * Note: this is a temporary function, which will be removed once
 + * all IOMMU drivers are converted. The only reason it exists is to
 + * allow splitting the pgsizes changes to several patches in order to ease
 + * the review.
 + */
 +void register_iommu_pgsize(struct iommu_ops *ops, unsigned long pgsize_bitmap)
 +{
 +	if (iommu_ops || iommu_pgsize_bitmap || !pgsize_bitmap)
 +		BUG();
 +
 +	iommu_ops = ops;
 +	iommu_pgsize_bitmap = pgsize_bitmap;
 +
 +	/* find out the minimum page size only once */
 +	iommu_min_pagesz = 1 << __ffs(pgsize_bitmap);
 +}

Hmm, I thought a little bit about that and came to the conclusion it
might be best to just keep the page-sizes as a part of the iommu_ops
structure. So there is no need to extend the register_iommu interface.

Also, the bus_set_iommu interface is now in the -next branch. Would be
good if you rebase the patches to that interface.

You can find the current iommu tree with these changes at

git://git.8bytes.org/scm/iommu.git

 @@ -115,26 +156,103 @@ int iommu_domain_has_cap(struct iommu_domain *domain,
  EXPORT_SYMBOL_GPL(iommu_domain_has_cap);
 
  int iommu_map(struct iommu_domain *domain, unsigned long iova,
 - phys_addr_t paddr, int gfp_order, int prot)
 + phys_addr_t paddr, size_t size, int prot)
  {
 -   size_t size;
 +   int ret = 0;
 +
 +   /*
 +* both the virtual address and the physical one, as well as
 +* the size of the mapping, must be aligned (at least) to the
 +* size of the smallest page supported by the hardware
 +*/
 +   if (!IS_ALIGNED(iova | paddr | size, iommu_min_pagesz)) {
 +   pr_err(unaligned: iova 0x%lx pa 0x%lx size 0x%lx min_pagesz 
 +   0x%x\n, iova, (unsigned long)paddr,
 +   (unsigned long)size, iommu_min_pagesz);
 +   return -EINVAL;
 +   }
 
 -   size = 0x1000UL  gfp_order;
 +   pr_debug(map: iova 0x%lx pa 0x%lx size 0x%lx\n, iova,
 +   (unsigned long)paddr, (unsigned long)size);
 
 -   BUG_ON(!IS_ALIGNED(iova | paddr, size));
 +   while (size) {
 +   unsigned long pgsize, addr_merge = iova | paddr;
 +   unsigned int pgsize_idx;
 
 -   return iommu_ops-map(domain, iova, paddr, gfp_order, prot);
 +   /* Max page size that still fits into 'size' */
 +   pgsize_idx = __fls(size);
 +
 +   /* need to consider alignment requirements ? */
 +   if (likely(addr_merge)) {
 +   /* Max page size allowed by both iova and paddr */
 +   unsigned int align_pgsize_idx = __ffs(addr_merge);
 +
 +   pgsize_idx = min(pgsize_idx, align_pgsize_idx);
 +   }
 +
 +   /* build a mask of acceptable page sizes */
 +   pgsize = (1UL << (pgsize_idx + 1)) - 1;
 +
 +   /* throw away page sizes not supported by the hardware */
 +   pgsize &= iommu_pgsize_bitmap;

I think we need some care here and check pgsize for 0. A BUG_ON should
do.

 +
 +   /* pick the biggest page */
 +   pgsize_idx = __fls(pgsize);
 +   pgsize = 1UL << pgsize_idx;
 +
 +   /* convert index to page order */
 +   pgsize_idx -= PAGE_SHIFT;
 +
 +   pr_debug("mapping: iova 0x%lx pa 0x%lx order %u\n", iova,
 +   (unsigned long)paddr, pgsize_idx);
 +
 +   ret = iommu_ops->map(domain, iova, paddr, pgsize_idx, prot);
 +   if (ret)
 +   break;
 

Re: [RFC PATCH 1/7] [hyper-v] Add hyper-v parameters block.

2011-10-10 Thread Jan Kiszka
On 2011-10-10 11:40, Vadim Rozenfeld wrote:
 On Mon, 2011-10-10 at 08:54 +0200, Jan Kiszka wrote:
 On 2011-10-09 20:52, Vadim Rozenfeld wrote:
 ---
  qemu-options.hx |   23 +++
  vl.c|2 ++
  2 files changed, 25 insertions(+), 0 deletions(-)

 diff --git a/qemu-options.hx b/qemu-options.hx
 index 3a13533..9f60059 100644
 --- a/qemu-options.hx
 +++ b/qemu-options.hx
 @@ -2483,6 +2483,29 @@ DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
      "allocate MEGABYTES for kvm mmu shadowing\n",
      QEMU_ARCH_I386)
  
 +DEF("hyperv", HAS_ARG, QEMU_OPTION_hyperv,
 +    "-hyperv [vapic=on|off][,spinlock=retries][,wd=on|off]\n"
 +    "enable Hyper-V Enlightenment\n",
 +    QEMU_ARCH_ALL)

 These are CPU features, so -cpu +/-hv_vapic,+/-hv_spinlock etc. looks
 more appropriate than a new command line parameter.

 I would like to keep hyper-v settings apart from cpu features for a very
 simple reason: if hyper-v VMBus support will be added one day, it won't
 be a CPU only feature anymore.   

Then that feature would be controlled by adding the corresponding
device. There is no need for -hyperv.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] kernel/kvm: fix improper nmi emulation

2011-10-10 Thread Avi Kivity

On 10/10/2011 08:06 AM, Lai Jiangshan wrote:

From: Kenji Kaneshige <kaneshige.ke...@jp.fujitsu.com>

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is masked in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, KVM_NMI ioctl is handled as follows.

- When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
   request of triggering LINT1 on the processor. LINT1 is emulated in
   in-kernel irqchip.

- When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
   request of injecting NMI to the processor. This assumes LINT1 is
   already emulated in userland.
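
The two cases above can be modelled in a few lines of Python. This is only a sketch, not the kernel code: the Vcpu class and deliver_kvm_nmi() are hypothetical names, and the model ignores every LINT1 delivery mode other than NMI. The register layout (mask bit 16, delivery mode in bits 8-10, NMI = 100b) is the usual x86 local APIC LVT format.

```python
from dataclasses import dataclass

APIC_LVT_MASKED = 1 << 16   # LVT mask bit
APIC_MODE_MASK = 0x7 << 8   # delivery mode field, bits 8-10
APIC_MODE_NMI = 0x4 << 8    # delivery mode 100b = NMI


@dataclass
class Vcpu:
    """Hypothetical stand-in for the per-vCPU state touched by KVM_NMI."""
    lvt_lint1: int            # value of the LVT LINT1 register
    nmi_pending: bool = False


def lint1_accepts_nmi(lvt):
    """LINT1 forwards an NMI only if it is unmasked and programmed for NMI."""
    return not (lvt & APIC_LVT_MASKED) and (lvt & APIC_MODE_MASK) == APIC_MODE_NMI


def deliver_kvm_nmi(vcpu, irqchip_in_kernel):
    """Model of the KVM_NMI handling described in the commit message."""
    if irqchip_in_kernel:
        # The in-kernel APIC decides: a masked LINT1 swallows the event.
        if lint1_accepts_nmi(vcpu.lvt_lint1):
            vcpu.nmi_pending = True
    else:
        # Userspace has already emulated LINT1, so inject unconditionally.
        vcpu.nmi_pending = True
    return vcpu.nmi_pending
```

With an in-kernel irqchip, a vCPU whose LINT1 is masked (the kdump case for CPUs other than CPU0) no longer sees the NMI, while a vCPU with LINT1 unmasked and set to NMI delivery still does.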


Please add a KVM_NMI section to Documentation/virtual/kvm/api.txt.



-static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
-{
-   kvm_inject_nmi(vcpu);
-
-   return 0;
-}
-
  static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
   struct kvm_tpr_access_ctl *tac)
  {
@@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
break;
}
case KVM_NMI: {
-   r = kvm_vcpu_ioctl_nmi(vcpu);
-   if (r)
-   goto out;
+   if (irqchip_in_kernel(vcpu->kvm))
+   kvm_apic_lint1_deliver(vcpu);
+   else
+   kvm_inject_nmi(vcpu);
r = 0;
break;
}


Why did you drop kvm_vcpu_ioctl_nmi()?

Please add (and document) a KVM_CAP flag that lets userspace know the 
new behaviour is supported.


--
error compiling committee.c: too many arguments to function



Re: [PATCH 1/4] [kvm-autotest] cgroup-kvm: add_*_drive / rm_drive

2011-10-10 Thread Jiri Zupka
This is a useful function. It could go in kvm utils.

- Original Message -
 * functions for adding and removal of drive to vm using host-file or
host-scsi_debug device.
 
 Signed-off-by: Lukas Doktor <ldok...@redhat.com>
 ---
  client/tests/kvm/tests/cgroup.py |  125
  -
  1 files changed, 108 insertions(+), 17 deletions(-)
 
 diff --git a/client/tests/kvm/tests/cgroup.py
 b/client/tests/kvm/tests/cgroup.py
 index b9a10ea..d6418b5 100644
 --- a/client/tests/kvm/tests/cgroup.py
 +++ b/client/tests/kvm/tests/cgroup.py
 @@ -17,6 +17,108 @@ def run_cgroup(test, params, env):
  vms = None
  tests = None
  
 +    # Func
 +    def get_device_driver():
 +        """
 +        Discovers the used block device driver {ide, scsi, virtio_blk}
 +        @return: Used block device driver {ide, scsi, virtio}
 +        """
 +        if test.tagged_testname.count('virtio_blk'):
 +            return "virtio"
 +        elif test.tagged_testname.count('scsi'):
 +            return "scsi"
 +        else:
 +            return "ide"
 +
 +
 +    def add_file_drive(vm, driver=get_device_driver(), host_file=None):
 +        """
 +        Hot-add a drive based on file to a vm
 +        @param vm: Desired VM
 +        @param driver: which driver should be used (default: same as in test)
 +        @param host_file: Which file on host is the image (default: create new)
 +        @return: Tuple(ret_file, device)
 +                    ret_file: created file handler (None if not created)
 +                    device: PCI id of the virtual disk
 +        """
 +        if not host_file:
 +            host_file = tempfile.NamedTemporaryFile(prefix="cgroup-disk-",
 +                                                    suffix=".iso")
 +            utils.system("dd if=/dev/zero of=%s bs=1M count=8 &>/dev/null"
 +                         % (host_file.name))
 +            ret_file = host_file
 +        else:
 +            ret_file = None
 +
 +        out = vm.monitor.cmd("pci_add auto storage file=%s,if=%s,snapshot=off,"
 +                             "cache=off" % (host_file.name, driver))
 +        dev = re.search(r'OK domain (\d+), bus (\d+), slot (\d+), function \d+',
 +                        out)
 +        if not dev:
 +            raise error.TestFail("Can't add device(%s, %s, %s): %s" % (vm,
 +                                 host_file.name, driver, out))
 +        device = "%s:%s:%s" % dev.groups()
 +        return (ret_file, device)
 +
 +
 +    def add_scsi_drive(vm, driver=get_device_driver(), host_file=None):
 +        """
 +        Hot-add a drive based on scsi_debug device to a vm
 +        @param vm: Desired VM
 +        @param driver: which driver should be used (default: same as in test)
 +        @param host_file: Which dev on host is the image (default: create new)
 +        @return: Tuple(ret_file, device)
 +                    ret_file: string of the created dev (None if not created)
 +                    device: PCI id of the virtual disk
 +        """
 +        if not host_file:
 +            if utils.system_output("lsmod | grep scsi_debug -c") == 0:
 +                utils.system("modprobe scsi_debug dev_size_mb=8 add_host=0")
 +            utils.system("echo 1 > /sys/bus/pseudo/drivers/scsi_debug/add_host")
 +            host_file = utils.system_output("ls /dev/sd* | tail -n 1")
 +            # Enable idling in scsi_debug drive
 +            utils.system("echo 1 > /sys/block/%s/queue/rotational" % host_file)
 +            ret_file = host_file
 +        else:
 +            # Don't remove this device during cleanup
 +            # Reenable idling in scsi_debug drive (in case it's not)
 +            utils.system("echo 1 > /sys/block/%s/queue/rotational" % host_file)
 +            ret_file = None
 +
 +        out = vm.monitor.cmd("pci_add auto storage file=%s,if=%s,snapshot=off,"
 +                             "cache=off" % (host_file, driver))
 +        dev = re.search(r'OK domain (\d+), bus (\d+), slot (\d+), function \d+',
 +                        out)
 +        if not dev:
 +            raise error.TestFail("Can't add device(%s, %s, %s): %s" % (vm,
 +                                 host_file, driver, out))
 +        device = "%s:%s:%s" % dev.groups()
 +        return (ret_file, device)
 +
 +
 +    def rm_drive(vm, host_file, device):
 +        """
 +        Remove drive from vm and device on disk
 +        ! beware to remove scsi devices in reverse order !
 +        """
 +        vm.monitor.cmd("pci_del %s" % device)
 +
 +        if isinstance(host_file, file):     # file
 +            host_file.close()
 +        elif isinstance(host_file, str):    # scsi device
 +            utils.system("echo -1 > /sys/bus/pseudo/drivers/scsi_debug/add_host")
 +        else:       # custom file, do nothing
 +            pass
 +
 +    def get_all_pids(ppid):
 +        """
 +        Get all PIDs of children/threads of parent ppid
 +        param ppid: parent PID
 +        return: list of PIDs

Re: [PATCH] apic: test tsc deadline timer

2011-10-10 Thread Avi Kivity

On 10/09/2011 05:32 PM, Liu, Jinsong wrote:

Updated test case for kvm tsc deadline timer 
https://github.com/avikivity/kvm-unit-tests, as attached.



Applied, thanks.

--
error compiling committee.c: too many arguments to function



Re: [PATCH] Update README example

2011-10-10 Thread Avi Kivity

On 10/09/2011 06:02 PM, Liu, Jinsong wrote:

Subject: [PATCH] Update README example




Thanks, applied.

--
error compiling committee.c: too many arguments to function



[PATCH] virtio-9p: fix QEMU build break

2011-10-10 Thread Zhi Yong Wu
The QEMU build breaks due to the redefinition of struct file_handle. My 
qemu.git/HEAD is 8acbc9b21d757a6be4f8492e547b8159703a0547

Below is the log:
[root@f15 qemu]# make
  CC    qapi-generated/qga-qapi-types.o
  LINK  qemu-ga
  CC    libhw64/9pfs/virtio-9p-handle.o
/home/zwu/work/virt/qemu/hw/9pfs/virtio-9p-handle.c:31:8: error: redefinition of struct file_handle
/usr/include/bits/fcntl.h:254:8: note: originally defined here
make[1]: *** [9pfs/virtio-9p-handle.o] Error 1
make: *** [subdir-libhw64] Error 2

[root@f15 qemu]# rpm -qf /usr/include/bits/fcntl.h
glibc-headers-2.13.90-9.x86_64

Signed-off-by: Zhi Yong Wu <wu...@linux.vnet.ibm.com>
---
 hw/9pfs/virtio-9p-handle.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
index 5c8b5ed..5b3a867 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/virtio-9p-handle.c
@@ -27,7 +27,7 @@ struct handle_data {
 int handle_bytes;
 };
 
-#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 14
+#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 13
 struct file_handle {
 unsigned int handle_bytes;
 int handle_type;
-- 
1.7.6
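
The effect of moving the bound from 14 to 13 can be checked with a small model of the preprocessor condition. This is a sketch only: defines_fallback_struct() is a hypothetical helper, and it assumes the glibc-headers-2.13.90 package from the log reports __GLIBC__ 2 and __GLIBC_MINOR__ 13.

```python
def defines_fallback_struct(glibc_major, glibc_minor, minor_limit):
    """Mirror of: #if __GLIBC__ <= 2 && __GLIBC_MINOR__ < minor_limit"""
    return glibc_major <= 2 and glibc_minor < minor_limit


# Assumed values for glibc-headers-2.13.90: major 2, minor 13.
before_patch = defines_fallback_struct(2, 13, 14)  # QEMU defines its own struct
after_patch = defines_fallback_struct(2, 13, 13)   # QEMU relies on the glibc one

print("before:", before_patch, "after:", after_patch)
```

Under that assumption the old condition is true on this host, so QEMU's private struct file_handle collides with the one in /usr/include/bits/fcntl.h; with the new bound the private definition is skipped, while older glibc (minor 12 and below) still gets it.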



Re: [PATCH][uq/master] kvm: Add top-like kvm statistics script

2011-10-10 Thread Avi Kivity

On 10/07/2011 09:37 AM, Jan Kiszka wrote:

Taken from original qemu-kvm/kvm/kvm_stat.




Applied, thanks.

--
error compiling committee.c: too many arguments to function



Re: [PATCH][uq/master] kvm: Add tool for querying VMX capabilities

2011-10-10 Thread Avi Kivity

On 10/07/2011 09:37 AM, Jan Kiszka wrote:

Taken from original qemu-kvm/kvm/scripts/vmxcap.




Applied, thanks.

--
error compiling committee.c: too many arguments to function



Re: [PATCH v2 0/9] perf support for x86 guest/host-only bits

2011-10-10 Thread Roedel, Joerg
Hi Gleb,

On Wed, Oct 05, 2011 at 08:01:15AM -0400, Gleb Natapov wrote:
 This patch series consists of Joerg series named perf support for amd
 guest/host-only bits v2 [1] rebased to 3.1.0-rc7 and in addition,
 support for intel cpus for the same functionality.
 
 [1] https://lkml.org/lkml/2011/6/17/171
 
 Changelog:
  v1->v2
   - move perf_guest_switch_msr array to perf code.
   - small cosmetic changes.
 
 Gleb Natapov (4):
   perf, intel: Use GO/HO bits in perf-ctr
   KVM, VMX: add support for switching of PERF_GLOBAL_CTRL
   KVM, VMX: Add support for guest/host-only profiling
   KVM, VMX: Check for automatic switch msr table overflow.
 
 Joerg Roedel (5):
   perf, core: Introduce attrs to count in either host or guest mode
   perf, amd: Use GO/HO bits in perf-ctr
   perf, tools: Add support for guest/host-only profiling
   perf, tools: Fix copypaste error in perf-kvm option description
   perf, tools: Do guest-only counting in perf-kvm by default
 
  arch/x86/include/asm/perf_event.h  |   15 
  arch/x86/kernel/cpu/perf_event.c   |   14 
  arch/x86/kernel/cpu/perf_event_amd.c   |   13 +++
  arch/x86/kernel/cpu/perf_event_intel.c |   90 +-
  arch/x86/kvm/vmx.c |  131 
 +---
  include/linux/perf_event.h |5 +-
  tools/perf/builtin-kvm.c   |5 +-
  tools/perf/util/event.c|8 ++
  tools/perf/util/event.h|2 +
  tools/perf/util/evlist.c   |5 +-
  tools/perf/util/parse-events.c |   15 +++-
  11 files changed, 282 insertions(+), 21 deletions(-)

Many thanks for picking this up :)

Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



Re: [PATCH 1/4] [kvm-autotest] cgroup-kvm: add_*_drive / rm_drive

2011-10-10 Thread Lukáš Doktor
I thought about that. But pci_add is not very stable, and it's not 
supported in QMP (as far as I read), with a note that this approach is 
buggy and should be rewritten completely. So I placed it here to let it 
develop, and then I can move it into utils.


Regards,
Lukáš

On 10.10.2011 12:26, Jiri Zupka wrote:

This is useful function. This function can be in kvm utils.


Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks

2011-10-10 Thread Stephan Diestelhorst
On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote:
 On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote:
  On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote:
  Which certainly should *work*, but from a conceptual standpoint, isn't
  it just *much* nicer to say we actually know *exactly* what the upper
  bits were.
  Well, we really do NOT want atomicity here. What we really rather want
  is sequentiality: free the lock, make the update visible, and THEN
  check if someone has gone sleeping on it.
 
  Atomicity only conveniently enforces that the three do not happen in a
  different order (with the store becoming visible after the checking
  load).
 
  This does not have to be atomic, since spurious wakeups are not a
  problem, in particular not with the FIFO-ness of ticket locks.
 
  For that the fence, additional atomic etc. would be IMHO much cleaner
  than the crazy overflow logic.
 
 All things being equal I'd prefer lock-xadd just because it's easier to
 analyze the concurrency for, crazy overflow tests or no.  But if
 add+mfence turned out to be a performance win, then that would obviously
 tip the scales.
 
 However, it looks like locked xadd also has better performance:  on
 my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower
 than locked xadd, so that pretty much settles it unless you think
 there'd be a dramatic difference on an AMD system.

Indeed, the fences are usually slower than locked RMWs, in particular,
if you do not need to add an instruction. I originally missed that
amazing stunt the GCC pulled off with replacing the branch with carry
flag magic. It seems that two twisted minds have found each other
here :)

One of my concerns was adding a branch in here... so that is settled,
and if everybody else feels like this is easier to reason about...
go ahead :) (I'll keep my itch to myself then.)

Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelho...@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH
Einsteinring 24
85609 Aschheim
Germany

Geschaeftsfuehrer: Alberto Bozzo;
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632, WEEE-Reg-Nr: DE 12919551 



KVM call agenda for October 11th

2011-10-10 Thread Juan Quintela

Hi

Please send in any agenda items you are interested in covering.

Thanks, Juan.




Re: [kvm] Re: tcpdump locks up kvm host for a while.

2011-10-10 Thread Avi Kivity

On 10/05/2011 10:29 PM, Robin Lee Powell wrote:


  #
  # (For a higher level overview, try: perf report --sort comm,dso)
  #

  How helpful is that?  -_-

  I'm guessing I need --guestkallsyms= ; since they're all the same
  kernel I thought it'd figure it out.  I'll redo.

OK, here's a better version.

# Events: 46K cycles
#
# Overhead  Command   Shared Object            Symbol
# ........  ........  .......................  ......
#
 74.81%  qemu-kvm  [unknown]                [u] 0x7fbdffd4c18a


This is in userspace, so it seems the guest wasn't completely stuck.

Try 'top -b' inside the guest to record what happens, let's see what 
processes this is and go from there.



 25.14%  qemu-kvm  [guest.kernel.kallsyms]  [g] 0x82f0


This doesn't resolve, please make sure the kernel-debuginfo package is 
installed in the guest and use the guestmount option.  (or you can 
install it in the host, I think)



--
error compiling committee.c: too many arguments to function



Re: [PATCH 2/2] pci-assign: Fix MSI-X registration

2011-10-10 Thread Avi Kivity

On 09/22/2011 12:04 PM, Jan Kiszka wrote:

   goto out;

  +if (!kvm_check_extension(kvm_state, KVM_CAP_ASSIGN_DEV_IRQ) &&
  +(dev->cap.available & ASSIGNED_DEVICE_CAP_MSIX ||
  + dev->cap.available & ASSIGNED_DEVICE_CAP_MSI ||
  + assigned_dev_pci_read_byte(pci_dev, PCI_INTERRUPT_PIN) != 0)) {
  +goto out;
  +}
  +

That's not equivalent as it needlessly prevents IRQ support in the
absence of KVM_CAP_ASSIGN_DEV_IRQ.

Let's just fix the core issue and replace the test for
KVM_CAP_DEVICE_MSIX with a test call of KVM_ASSIGN_SET_MSIX_NR, passing
in a NULL struct. If it returns -EFAULT, the IOCTL is known and MSIX is
supported.
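
The probing idea can be sketched as follows. This is a Python model, not qemu-kvm code: msix_supported() and the probe_ioctl callable are hypothetical, with probe_ioctl(None) standing in for issuing KVM_ASSIGN_SET_MSIX_NR with a NULL argument. Which errno an unknown ioctl produces is an assumption here; the sketch only relies on it being something other than EFAULT.

```python
import errno


def msix_supported(probe_ioctl):
    """Return True if the probe call fails with EFAULT.

    EFAULT means the kernel recognised the ioctl and got as far as
    dereferencing the (deliberately NULL) argument.
    """
    try:
        probe_ioctl(None)
    except OSError as exc:
        return exc.errno == errno.EFAULT
    # A NULL argument is not expected to succeed; treat it conservatively.
    return False


def fake_known_ioctl(arg):
    """Stand-in for a kernel that implements the ioctl."""
    raise OSError(errno.EFAULT, "Bad address")


def fake_unknown_ioctl(arg):
    """Stand-in for a kernel that does not know the ioctl (errno assumed)."""
    raise OSError(errno.ENOTTY, "Inappropriate ioctl for device")
```

The appeal of this approach is that it needs no new capability bit; the cost is that it depends on the error path of the ioctl staying stable across kernels.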



Or just add KVM_CAP_DEVICE_MSIX to the kernel and backport it where needed?

--
error compiling committee.c: too many arguments to function



Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Ohad Ben-Cohen
[ -bouncing hiroshi.d...@nokia.com, +not-bouncing hd...@nvidia.com :
hi Hiroshi :) ]

Hi Joerg,

On Mon, Oct 10, 2011 at 11:47 AM, Roedel, Joerg <joerg.roe...@amd.com> wrote:
 sorry, I was on vacation last week and had no time to look into this.

Sure thing, thanks for replying!

 +#include <linux/bitmap.h>

 Is this still required?

Nope, removed, thanks.

 Hmm, I thought a little bit about that and came to the conclusion it
 might be best to just keep the page-sizes as a part of the iommu_ops
 structure. So there is no need to extend the register_iommu interface.

Sure. That was one of my initial alternatives, but I decided against
it at that time. I'll bring it back - it will help with the
bus_set_iommu rebasing.

 Also, the bus_set_iommu interface is now in the -next branch. Would be
 good if you rebase the patches to that interface.

Sure. It's a little tricky though: which branch do I base this on ?
Are you ok with me basing this on your 'next' branch ? My current
stack depends at least on three branches of yours, so that would be
helpful for me (and fewer merge conflicts for you, I guess :).

 I think we need some care here and check pgsize for 0. A BUG_ON should
 do.

I can add it if you prefer, but I don't think it can really happen:
basically, it means that we chose a too small and unsupported page
bit, which can't happen as long as we check for IS_ALIGNED(iova |
paddr | size, iommu_min_pagesz) in the beginning of iommu_map.
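
To illustrate why the mask cannot come out empty once the alignment check has passed, here is a small Python model of the loop. It is a sketch, not the kernel code: split_mapping(), fls(), ffs() and the OMAP_LIKE_PGSIZES bitmap (4K, 64K, 1M, 16M) are names and values assumed for the example.

```python
OMAP_LIKE_PGSIZES = 0x1000 | 0x10000 | 0x100000 | 0x1000000  # assumed: 4K, 64K, 1M, 16M


def fls(x):
    """Index of the most significant set bit (x must be non-zero)."""
    return x.bit_length() - 1


def ffs(x):
    """Index of the least significant set bit (x must be non-zero)."""
    return (x & -x).bit_length() - 1


def split_mapping(iova, paddr, size, pgsize_bitmap=OMAP_LIKE_PGSIZES):
    """Return the (iova, paddr, page_size) chunks the loop would hand to the driver."""
    min_pagesz = 1 << ffs(pgsize_bitmap)
    if (iova | paddr | size) & (min_pagesz - 1):
        raise ValueError("iova, paddr and size must be aligned to 0x%x" % min_pagesz)

    chunks = []
    while size:
        idx = fls(size)                       # biggest page that still fits into size
        addr_merge = iova | paddr
        if addr_merge:
            idx = min(idx, ffs(addr_merge))   # biggest page both addresses allow
        candidates = ((1 << (idx + 1)) - 1) & pgsize_bitmap
        if not candidates:                    # unreachable after the alignment check
            raise RuntimeError("no supported page size")
        pgsize = 1 << fls(candidates)         # pick the biggest page
        chunks.append((iova, paddr, pgsize))
        iova += pgsize
        paddr += pgsize
        size -= pgsize
    return chunks
```

Because size and both addresses are multiples of the smallest supported page, idx never drops below that page's bit, so the mask always contains at least the smallest page size. For instance, split_mapping(0, 0, 0x111000) produces one 1M chunk, one 64K chunk and one 4K chunk.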

 +               unmapped_order = iommu_ops->unmap(domain, iova, order);

 I think we should make sure that we call iommu_ops->unmap with the same
 parameters as iommu_ops->map. Otherwise we still need some page-size
 complexity in the iommu-drivers.

Ok, let's discuss the semantics of ->unmap().

There isn't clear documentation of that API (we should probably add
some kernel docs after we nail it down now), but judging from the
existing users (mainly kvm) and drivers, it seems that iommu_map() and
iommu_unmap() aren't symmetric: users rely on unmap() to return the
actual size that was unmapped. IOMMU drivers, in turn, should check
which page is mapped on 'iova', unmap it, and return its size.

This way iommu_unmap() becomes very simple: it just iterates through
the region, relying on iommu_ops->unmap() to return the sizes that
were actually unmapped (very similar to how amd's iommu_unmap_page
works today). This also means that iommu_ops->unmap() doesn't really
need a size/order argument and we can remove it (after all drivers
fully migrate..).
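
A rough model of that asymmetric unmap, in Python. It is only a sketch of the semantics discussed here: the page table is a plain dict from iova to page size, and driver_unmap() and iommu_unmap() are hypothetical stand-ins for the driver callback and the core function.

```python
def driver_unmap(page_table, iova):
    """Stand-in for the driver callback: unmap the page at iova, return its size."""
    return page_table.pop(iova, 0)


def iommu_unmap(page_table, iova, size):
    """Walk the region and let the driver report how much each step unmapped."""
    unmapped = 0
    while unmapped < size:
        page = driver_unmap(page_table, iova)
        if not page:
            break                 # nothing mapped at this address
        iova += page
        unmapped += page
    return unmapped
```

Note that the return value can exceed the requested size: asking to unmap 4K inside a 1M page removes the whole 1M page and reports 1M, which is exactly the behaviour callers would have to cope with under these semantics.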

The other approach which you suggest means symmetric iommu_map() and
iommu_unmap(). It means adding a 'paddr' parameter to iommu_unmap(),
which is easy, but maybe more concerning is the limitation that it
incurs: users will now have to call iommu_unmap() exactly as they
called iommu_map() beforehand. Not sure how well this will fly with
the existing users (kvm ?) and whether we really want to enforce this
(it doesn't mean drivers need to deal with page-size complexity. They
are required to unmap a single page at a time, and iommu_unmap() will
do the work for them).

Another discussion:

I think we better change iommu_ops->map() to directly take a 'size'
(in bytes) instead of an 'order' (of pages). Most (all?) drivers just
immediately do 'size = 0x1000UL << gfp_order', so this whole size ->
order -> size back and forth seems redundant.

 When we pass the size now it makes sense to also return the
 unmapped-size instead of the order.

Sure.

Thanks for your review,
Ohad.


Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Ohad Ben-Cohen
On Mon, Oct 10, 2011 at 2:52 PM, KyongHo Cho <pullip@samsung.com> wrote:
 Don't we need to unmap all intermediate mappings if iommu_map() fails?

Good idea, I'll add it.
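
One possible shape for that rollback, sketched in Python. Nothing here is from the patch: map_chunks() and the driver_map/driver_unmap callables are hypothetical, and the chunk list is assumed to come from the page-size splitting step.

```python
def map_chunks(chunks, driver_map, driver_unmap):
    """Map each (iova, paddr, page_size) chunk; undo everything on failure.

    driver_map returns 0 on success or a negative error code.
    """
    done = []
    for iova, paddr, pgsize in chunks:
        err = driver_map(iova, paddr, pgsize)
        if err:
            # Roll back the chunks mapped so far, newest first.
            for done_iova, done_size in reversed(done):
                driver_unmap(done_iova, done_size)
            return err
        done.append((iova, pgsize))
    return 0
```

The caller then sees either a fully established mapping or none at all, instead of a partially mapped region it would have to clean up itself.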

Thanks!
Ohad.


Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks

2011-10-10 Thread Stephan Diestelhorst
On Monday 10 October 2011, 07:00:50 Stephan Diestelhorst wrote:
 On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote:
  On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote:
   On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote:
   Which certainly should *work*, but from a conceptual standpoint, isn't
   it just *much* nicer to say we actually know *exactly* what the upper
   bits were.
   Well, we really do NOT want atomicity here. What we really rather want
   is sequentiality: free the lock, make the update visible, and THEN
   check if someone has gone sleeping on it.
  
   Atomicity only conveniently enforces that the three do not happen in a
   different order (with the store becoming visible after the checking
   load).
  
   This does not have to be atomic, since spurious wakeups are not a
   problem, in particular not with the FIFO-ness of ticket locks.
  
   For that the fence, additional atomic etc. would be IMHO much cleaner
   than the crazy overflow logic.
  
  All things being equal I'd prefer lock-xadd just because it's easier to
  analyze the concurrency for, crazy overflow tests or no.  But if
  add+mfence turned out to be a performance win, then that would obviously
  tip the scales.
  
  However, it looks like locked xadd also has better performance:  on
  my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower
  than locked xadd, so that pretty much settles it unless you think
  there'd be a dramatic difference on an AMD system.
 
 Indeed, the fences are usually slower than locked RMWs, in particular,
 if you do not need to add an instruction. I originally missed that
 amazing stunt the GCC pulled off with replacing the branch with carry
 flag magic. It seems that two twisted minds have found each other
 here :)
 
 One of my concerns was adding a branch in here... so that is settled,
 and if everybody else feels like this is easier to reason about...
 go ahead :) (I'll keep my itch to myself then.)

Just that I can't... if performance is a concern, adding the LOCK
prefix to the addb outperforms the xadd significantly:

With mean over 100 runs... this comes out as follows
(on my Phenom II)

locked-add   0.648500 s   80%
add-rmwtos   0.707700 s   88%
locked-xadd  0.807600 s  100%
add-barrier  1.27 s  157%

With huge read contention added in (as cheaply as possible):
locked-add.openmp  0.640700 s  84%
add-rmwtos.openmp  0.658400 s  86%
locked-xadd.openmp 0.763800 s 100%

And the numbers for write contention are crazy, but also feature the
locked-add version:
locked-add.openmp  0.571400 s  71%
add-rmwtos.openmp  0.699900 s  87%
locked-xadd.openmp 0.800200 s 100%
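
For reference, the percentage column is each mean run time relative to locked-xadd in the same table. A minimal Python check of that arithmetic, using only the timings quoted above (relative_percent() is a helper name made up for the example):

```python
def relative_percent(seconds, baseline_seconds):
    """Run time as a rounded percentage of the baseline run time."""
    return round(100 * seconds / baseline_seconds)


# Mean run times in seconds from the uncontended table.
uncontended = {
    "locked-add": 0.6485,
    "add-rmwtos": 0.7077,
    "locked-xadd": 0.8076,
    "add-barrier": 1.27,
}

for name, secs in uncontended.items():
    print("%-12s %3d%%" % (name, relative_percent(secs, uncontended["locked-xadd"])))
```

The same calculation reproduces the contended tables, for example 0.5714 s against 0.8002 s gives 71% for locked-add under write contention.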

Stephan
-- 
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelho...@amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH
Einsteinring 24
85609 Aschheim
Germany

Geschaeftsfuehrer: Alberto Bozzo;
Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen
Registergericht Muenchen, HRB Nr. 43632, WEEE-Reg-Nr: DE 12919551

#include <stdio.h>

struct {
	unsigned char flag;
	unsigned char val;
} l;

int main(int argc, char **argv)
{
	int i;

	{
		{
			for (i = 0; i < 1; i++) {
				l.val += 2;
				asm volatile("lock or $0x0,(%%rsp)" : : : "memory");
				if (l.flag)
					break;
				asm volatile("" : : : "memory");
			}
			l.flag = 1;
		}
	}
	return 0;
}

#include <stdio.h>

struct {
	unsigned char flag;
	unsigned char val;
} l;

int main(int argc, char **argv)
{
	int i;

#   pragma omp sections
	{
#   pragma omp section
		{
			for (i = 0; i < 1; i++) {
				l.val += 2;
				asm volatile("lock or $0x0,(%%rsp)" : : : "memory");
				if (l.flag)
					break;
				asm volatile("" : : : "memory");
			}
			l.flag = 1;
		}
#   pragma omp section
		while(!l.flag)
			asm volatile("":::"memory");
			//asm volatile("lock orb $0x0, %0"::"m"(l.flag):"memory");
	}
	return 0;
}

#include <stdio.h>

struct {
	unsigned char flag;
	unsigned char val;
} l;

int main(int argc, char **argv)
{
	int i;
	{
		{
			for (i = 0; i < 1; i++) {
				asm volatile("lock addb %1, %0":"+m"(l.val):"r"((char)2):"memory");
				if (l.flag)
					break;
				asm volatile("" : : : "memory");
			}
			l.flag = 1;
		}
	}
	return 0;
}

#include <stdio.h>

union {
	struct {
		unsigned char val;
		unsigned char flag;
	};
	unsigned short lock;
} l = { 0,0 };

int main(int argc, char **argv)
{
	int i;
#   pragma omp sections
	{
#   pragma omp section
		{
			for (i = 0; i < 1; i++) {
				unsigned short inc = 2;
				if (l.val >= (0x100 - 2))
					inc += -1 << 8;
				asm volatile("lock; xadd %1,%0" : "+m" (l.lock), "+r" (inc) : );
				if (inc & 0x100)
					break;
				asm volatile("" : : : "memory");
			}
			l.flag = 1;
		}
#   pragma omp section
		while(!l.flag)
			asm volatile("":::"memory");
			//asm volatile("lock orb $0x0, %0"::"m"(l.flag):"memory");
	}
	return 0;
}

#include <stdio.h>

struct {
	unsigned char flag;
	unsigned char val;
} l;

int main(int argc, char **argv)
{
	int i;
#   pragma omp sections 
	{
#   pragma omp section
		

Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Anthony Liguori

On 10/03/2011 03:55 PM, Marcelo Tosatti wrote:
> The following changes since commit d11cf8cc80d946dfc9a23597cd9a0bb1c487cfa7:
>
>    etrax-dma: Remove bogus if statement (2011-10-03 10:20:13 +0200)
>
> are available in the git repository at:
>    git://github.com/avikivity/qemu.git uq/master

Pulled.  Thanks.

Are y'all planning on moving your repo back to kernel.org or sticking with
github?

Regards,

Anthony Liguori

> Liu, Jinsong (1):
>    kvm: support TSC deadline MSR
>
>   target-i386/cpu.h |4 +++-
>   target-i386/kvm.c |   14 ++
>   target-i386/machine.c |1 +
>   3 files changed, 18 insertions(+), 1 deletions(-)






Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Avi Kivity

On 10/10/2011 04:41 PM, Anthony Liguori wrote:
> On 10/03/2011 03:55 PM, Marcelo Tosatti wrote:
>> The following changes since commit
>> d11cf8cc80d946dfc9a23597cd9a0bb1c487cfa7:
>>
>>    etrax-dma: Remove bogus if statement (2011-10-03 10:20:13 +0200)
>>
>> are available in the git repository at:
>>    git://github.com/avikivity/qemu.git uq/master
>
> Pulled.  Thanks.

Um, this had a comment about it regarding s/version bump/subsection/

> Are y'all planning on moving your repo back to kernel.org or sticking
> with github?

We'll move back to kernel.org as soon as we sort out the keys.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR

2011-10-10 Thread Anthony Liguori

On 10/04/2011 05:20 PM, Marcelo Tosatti wrote:
> On Tue, Oct 04, 2011 at 07:53:42PM +0200, Avi Kivity wrote:
>> On 10/03/2011 10:55 PM, Marcelo Tosatti wrote:
>>> From: Liu, Jinsong <jinsong@intel.com>
>>>
>>> KVM add emulation of lapic tsc deadline timer for guest.
>>> This patch is co-operation work at qemu side.
>>>
>>> -#define CPU_SAVE_VERSION 12
>>> +#define CPU_SAVE_VERSION 13
>>
>> Unfortunate.  Can't we use subsections?
>
> Yes, i'll look into it tomorrow.

Subsections are still broken at the moment although Juan has some patches.
Bumping the version is the safe thing to do.

Regards,

Anthony Liguori









Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Anthony Liguori

On 10/10/2011 09:48 AM, Avi Kivity wrote:
> On 10/10/2011 04:41 PM, Anthony Liguori wrote:
>> On 10/03/2011 03:55 PM, Marcelo Tosatti wrote:
>>> The following changes since commit d11cf8cc80d946dfc9a23597cd9a0bb1c487cfa7:
>>>
>>> etrax-dma: Remove bogus if statement (2011-10-03 10:20:13 +0200)
>>>
>>> are available in the git repository at:
>>> git://github.com/avikivity/qemu.git uq/master
>>
>> Pulled. Thanks.
>
> Um, this had a comment about it regarding s/version bump/subsection/

Hrm, sorry about that.  In the future, it would be helpful to explicitly
withdraw a PULL request.

Do you want me to revert?  FWIW, I think bumping the version is the right thing
to do.

Regards,

Anthony Liguori

>> Are y'all planning on moving your repo back to kernel.org or sticking with
>> github?
>
> We'll move back to kernel.org as soon as we sort out the keys.





Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR

2011-10-10 Thread Avi Kivity

On 10/10/2011 04:54 PM, Anthony Liguori wrote:
> On 10/04/2011 05:20 PM, Marcelo Tosatti wrote:
>> On Tue, Oct 04, 2011 at 07:53:42PM +0200, Avi Kivity wrote:
>>> On 10/03/2011 10:55 PM, Marcelo Tosatti wrote:
>>>> From: Liu, Jinsong <jinsong@intel.com>
>>>>
>>>> KVM add emulation of lapic tsc deadline timer for guest.
>>>> This patch is co-operation work at qemu side.
>>>>
>>>> -#define CPU_SAVE_VERSION 12
>>>> +#define CPU_SAVE_VERSION 13
>>>
>>> Unfortunate.  Can't we use subsections?
>>
>> Yes, i'll look into it tomorrow.
>
> Subsections are still broken at the moment although Juan has some
> patches. Bumping the version is the safe thing to do.

It's irreversible: once we release a version with a bumped ID we can't
go back.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Avi Kivity

On 10/10/2011 04:55 PM, Anthony Liguori wrote:
> Hrm, sorry about that.  In the future, it would be helpful to
> explicitly withdraw a PULL request.
>
> Do you want me to revert?

We'll send the revert together with the new patch.

>   FWIW, I think bumping the version is the right thing to do.

Why?

--
error compiling committee.c: too many arguments to function



[PATCH 0/2] Debian preseed support

2011-10-10 Thread Lucas Meneghel Rodrigues
This patchset adds support for Debian preseed files

http://wiki.debian.org/DebianInstaller/Preseed

Comes with Ubuntu server 11.04 support. Later, more
patches adding Debian and other Ubuntu server variants
will be added.

This patchset was also sent as a pull request

https://github.com/autotest/autotest/pull/34

Please review and comment.

Lucas Meneghel Rodrigues (2):
  KVM test: Introduce debian preseed unattended file support
  KVM test: guest-os.cfg: Introduce Ubuntu 11.04 server variant

 client/tests/kvm/guest-os.cfg.sample |   34 ++
 client/tests/kvm/tests/unattended_install.py |   31 
 client/tests/kvm/unattended/Ubuntu-11-04.preseed |   42 ++
 3 files changed, 100 insertions(+), 7 deletions(-)
 create mode 100644 client/tests/kvm/unattended/Ubuntu-11-04.preseed

-- 
1.7.6.4



[PATCH 1/2] KVM test: Introduce debian preseed unattended file support

2011-10-10 Thread Lucas Meneghel Rodrigues
Add support for the Debian preseed

http://wiki.debian.org/DebianInstaller/Preseed

unattended install file format. To get a fully automated
d-i install, we use the initrd preseed method (adding a
preseed.cfg file at the root of the initrd filesystem).
Tested with Ubuntu server 11.04; other Debian and
Debian-based OS variants will be added in later patches.

Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
---
 client/tests/kvm/tests/unattended_install.py |   31 ++
 1 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unattended_install.py 
b/client/tests/kvm/tests/unattended_install.py
index b1d23f6..f3f5268 100644
--- a/client/tests/kvm/tests/unattended_install.py
+++ b/client/tests/kvm/tests/unattended_install.py
@@ -407,6 +407,34 @@ class UnattendedInstallConfig(object):
 doc.writexml(fp)
 
 
+    def preseed_initrd(self):
+        """
+        Puts a preseed file inside a gz compressed initrd file.
+
+        Debian and Ubuntu use preseed as the OEM install mechanism. The only
+        way to get fully automated setup without resorting to kernel params
+        is to add a preseed.cfg file at the root of the initrd image.
+        """
+        logging.debug("Remastering initrd.gz file with preseed file")
+        dest_fname = 'preseed.cfg'
+        remaster_path = os.path.join(self.image_path, "initrd_remaster")
+        os.makedirs(remaster_path)
+
+        os.chdir(remaster_path)
+        utils.run("gzip -d < ../%s | cpio --extract --make-directories "
+                  "--no-absolute-filenames" % os.path.basename(self.initrd))
+        utils.run("cp %s %s" % (self.unattended_file, dest_fname))
+        utils.run("find . | cpio -H newc --create | gzip -9 > ../%s" %
+                  os.path.basename(self.initrd))
+        os.chdir(self.image_path)
+        utils.run("rm -rf initrd_remaster")
+        contents = open(self.unattended_file).read()
+
+        logging.debug("Unattended install contents:")
+        for line in contents.splitlines():
+            logging.debug(line)
+
+
 def setup_boot_disk(self):
 if self.unattended_file.endswith('.sif'):
 dest_fname = 'winnt.sif'
@@ -492,6 +520,9 @@ class UnattendedInstallConfig(object):
 (self.cdrom_cd1_mount, self.boot_path,
  os.path.basename(self.initrd), self.initrd))
 utils.run(initrd_fetch_cmd)
+            if self.unattended_file.endswith('.preseed'):
+                self.preseed_initrd()
+
 finally:
 cleanup(self.cdrom_cd1_mount)
 
-- 
1.7.6.4



[PATCH 2/2] KVM test: guest-os.cfg: Introduce Ubuntu 11.04 server variant

2011-10-10 Thread Lucas Meneghel Rodrigues
Add an Ubuntu 11.04 server variant, with unattended install set.
With this, it's possible to install the latest Ubuntu server
(as of the time of this patch). A preseed file is included.

Signed-off-by: Lucas Meneghel Rodrigues <l...@redhat.com>
---
 client/tests/kvm/guest-os.cfg.sample |   34 ++
 client/tests/kvm/unattended/Ubuntu-11-04.preseed |   42 ++
 2 files changed, 69 insertions(+), 7 deletions(-)
 create mode 100644 client/tests/kvm/unattended/Ubuntu-11-04.preseed

diff --git a/client/tests/kvm/guest-os.cfg.sample 
b/client/tests/kvm/guest-os.cfg.sample
index 17d6114..f7d5a98 100644
--- a/client/tests/kvm/guest-os.cfg.sample
+++ b/client/tests/kvm/guest-os.cfg.sample
@@ -906,29 +906,35 @@ variants:
 md5sum_cd1 = d2e10420f3689faa49a004b60fb396b7
 md5sum_1m_cd1 = f7f67b5da46923a9f01da8a2b6909654
 
-- @Ubuntu:
+- Ubuntu:
 shell_prompt = ^root@.*[\#\$]\s*$
+password = 12345678
+image_name = ubuntu
+unattended_install:
+kernel = linux
+initrd = initrd
+wait_no_ack = yes
 
 variants:
-- Ubuntu-6.10-32:
+- 6.10-32:
 only install
-image_name = ubuntu-6.10-32
+image_name += -6.10-32
 steps = steps/Ubuntu-6.10-32.steps
 cdrom_cd1 = isos/linux/ubuntu-6.10-desktop-i386.iso
 md5sum_cd1 = 17fb825641571ce5888a718329efd016
 md5sum_1m_cd1 = 7531d0a84e7451d17c5d976f1c3f8509
 
-- Ubuntu-8.04-32:
+- 8.04-32:
 skip = yes
-image_name = ubuntu-8.04-32
+image_name += -8.04-32
 install:
 steps = steps/Ubuntu-8.04-32.steps
 cdrom_cd1 = 
isos/linux/ubuntu-8.04.1-desktop-i386.iso
 setup:
 steps = steps/Ubuntu-8.04-32-setupssh.steps
 
-- Ubuntu-8.10-server-32:
-image_name = ubuntu-8.10-server-32
+- 8.10-server-32:
+image_name += -8.10-server-32
 install:
 steps = steps/Ubuntu-8.10-server-32.steps
 cdrom_cd1 = isos/linux/ubuntu-8.10-server-i386.iso
@@ -937,6 +943,20 @@ variants:
 setup:
 steps = steps/Ubuntu-8.10-server-32-gcc.steps
 
+- 11.04-server-64:
+image_name += -11.04-server-64
+unattended_install:
+extra_params +=  --append 'console=ttyS0,115200 
console=tty0'
+kernel = images/ubuntu-server-11-04-64/vmlinuz
+initrd = images/ubuntu-server-11-04-64/initrd.gz
+boot_path = install
+unattended_install.cdrom:
+unattended_file = unattended/Ubuntu-11-04.preseed
+cdrom_cd1 = 
isos/linux/ubuntu-11.04-server-amd64.iso
+md5sum_cd1 = 355ca2417522cb4a77e0295bf45c5cd5
+md5sum_1m_cd1 = 65b1514744bf99e88f6228e9b6f152a8
+
+
 - DSL-4.2.5:
 no setup dbench bonnie linux_s3
 image_name = dsl-4.2.5
diff --git a/client/tests/kvm/unattended/Ubuntu-11-04.preseed 
b/client/tests/kvm/unattended/Ubuntu-11-04.preseed
new file mode 100644
index 000..b4bec84
--- /dev/null
+++ b/client/tests/kvm/unattended/Ubuntu-11-04.preseed
@@ -0,0 +1,42 @@
+debconf debconf/priority string critical
+unknown debconf/priority string critical
+d-i debconf/priority string critical
+d-i debian-installer/locale string en_US
+d-i console-tools/archs select at
+d-i console-keymaps-at/keymap select us
+
+d-i netcfg/choose_interface select auto
+d-i netcfg/get_hostname string unassigned-hostname
+d-i netcfg/get_domain string unassigned-domain
+d-i netcfg/wireless_wep string
+
+d-i clock-setup/utc boolean true
+d-i time/zone string US/Eastern
+
+d-i partman-auto/method string regular
+d-i partman-auto/choose_recipe select home
+d-i partman/confirm_write_new_label boolean true
+d-i partman/choose_partition select finish
+d-i partman/confirm boolean true
+d-i partman/confirm_nooverwrite boolean true
+
+d-i passwd/root-login boolean true
+d-i passwd/make-user boolean false
+d-i passwd/root-password password 12345678
+d-i passwd/root-password-again password 12345678
+
+tasksel tasksel/first multiselect standard
+
+d-i pkgsel/include string openssh-server build-essential
+
+d-i 

Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR

2011-10-10 Thread Anthony Liguori

On 10/10/2011 09:58 AM, Avi Kivity wrote:
> On 10/10/2011 04:54 PM, Anthony Liguori wrote:
>> On 10/04/2011 05:20 PM, Marcelo Tosatti wrote:
>>> On Tue, Oct 04, 2011 at 07:53:42PM +0200, Avi Kivity wrote:
>>>> On 10/03/2011 10:55 PM, Marcelo Tosatti wrote:
>>>>> From: Liu, Jinsong <jinsong@intel.com>
>>>>>
>>>>> KVM add emulation of lapic tsc deadline timer for guest.
>>>>> This patch is co-operation work at qemu side.
>>>>>
>>>>> -#define CPU_SAVE_VERSION 12
>>>>> +#define CPU_SAVE_VERSION 13
>>>>
>>>> Unfortunate. Can't we use subsections?
>>>
>>> Yes, i'll look into it tomorrow.
>>
>> Subsections are still broken at the moment although Juan has some patches.
>> Bumping the version is the safe thing to do.
>
> It's irreversible: once we release a version with a bumped ID we can't go back.

But the question is whether we've bumped *any* versions of common devices since
0.15 because if so, it's moot here.  Once any device bumps a version id,
migration is incompatible.

Subsections are nice for stable branches, but they don't solve inter-version
compatibility.  Most importantly, subsections are broken today so until we
straighten things out there, we can't rely on them.

Regards,

Anthony Liguori



Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Anthony Liguori

On 10/10/2011 10:00 AM, Avi Kivity wrote:
> On 10/10/2011 04:55 PM, Anthony Liguori wrote:
>> Hrm, sorry about that. In the future, it would be helpful to explicitly
>> withdraw a PULL request.
>>
>> Do you want me to revert?
>
> We'll send the revert together with the new patch.
>
>> FWIW, I think bumping the version is the right thing to do.
>
> Why?

Because we still haven't fixed subsections.  Juan's patches help but they can
still result in sending bad data to an older QEMU that won't handle it properly.

We need to figure out a proper fix for subsections, either Paolo's protocol
change or moving subsections out to a detected section.

Regards,

Anthony Liguori







Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Avi Kivity

On 10/10/2011 05:12 PM, Anthony Liguori wrote:
> On 10/10/2011 10:00 AM, Avi Kivity wrote:
>> On 10/10/2011 04:55 PM, Anthony Liguori wrote:
>>> Hrm, sorry about that. In the future, it would be helpful to explicitly
>>> withdraw a PULL request.
>>>
>>> Do you want me to revert?
>>
>> We'll send the revert together with the new patch.
>>
>>> FWIW, I think bumping the version is the right thing to do.
>>
>> Why?
>
> Because we still haven't fixed subsections.  Juan's patches help but
> they can still result in sending bad data to an older QEMU that won't
> handle it properly.
>
> We need to figure out a proper fix for subsections, either Paolo's
> protocol change or moving subsections out to a detected section.

Ok.  So bumping the version is only right if we don't fix subsections.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR

2011-10-10 Thread Avi Kivity

On 10/10/2011 05:10 PM, Anthony Liguori wrote:
>> It's irreversible: once we release a version with a bumped ID we
>> can't go back.
>
> But the question is whether we've bumped *any* versions of common
> devices since 0.15 because if so, it's moot here.

What's the answer?  And if the answer is we did, why did we?

> Once any device bumps a version id, migration is incompatible.
>
> Subsections are nice for stable branches, but they don't solve
> inter-version compatibility.

Why not?  Though I agree it's a long shot to get it to work, since there
are so many changes and it's easy to get any one of them wrong.

You had an idea once to fingerprint the migration format and compare it
across versions?  I assume the almighty visitor can make this very simple?

>   Most importantly, subsections are broken today so until we
> straighten things out there, we can't rely on them.

Let's not throw a permanent baby out with the temporary bath water.

--
error compiling committee.c: too many arguments to function



Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Roedel, Joerg
Hi Ohad,

On Mon, Oct 10, 2011 at 09:59:22AM -0400, Ohad Ben-Cohen wrote:
> > Also, the bus_set_iommu interface is now in the -next branch. Would be
> > good if you rebase the patches to that interface.
>
> Sure. It's a little tricky though: which branch do I base this on ?
> Are you ok with me basing this on your 'next' branch ? My current
> stack depends at least on three branches of yours, so that would be
> helpful for me (and less merging conflicts for you I guess :).

The master branch is best to base your patches on for generic work. For
more specific things like omap-only changes you can use the topic
branches.

> > I think we need some care here and check pgsize for 0. A BUG_ON should
> > do.
>
> I can add it if you prefer, but I don't think it can really happen:
> basically, it means that we chose a too small and unsupported page
> bit, which can't happen as long as we check for IS_ALIGNED(iova |
> paddr | size, iommu_min_pagesz) in the beginning of iommu_map.

It can happen when there is a bug somewhere :) So a BUG_ON will yell
then and make debugging easier. An alternative is to use a WARN_ON and
let the map-call fail in this case.
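
The check under discussion can be sketched in userspace as follows. This is only an illustration, not the code from the patch: the supported-size bitmap (4 KiB, 2 MiB, 1 GiB) and the names toy_pick_pgsize and toy_map_range are assumptions made up for the example. It follows the WARN_ON-style alternative, where the map call fails if no page size fits; without the check, a page size of 0 would leave the loop spinning forever because the remaining size never shrinks.

```c
#include <stdio.h>

/* Hypothetical set of supported page sizes: 4 KiB, 2 MiB and 1 GiB. */
#define TOY_PGSIZE_BITMAP ((1UL << 12) | (1UL << 21) | (1UL << 30))

/*
 * Return the largest supported page size that is no bigger than 'size'
 * and to which both 'iova' and 'paddr' are aligned; 0 if there is none.
 */
static unsigned long toy_pick_pgsize(unsigned long bitmap, unsigned long iova,
                                     unsigned long paddr, unsigned long size)
{
    unsigned long addr_merge = iova | paddr;
    int bit;

    for (bit = (int)(sizeof(unsigned long) * 8) - 1; bit >= 0; bit--) {
        unsigned long pgsize = 1UL << bit;

        if (!(bitmap & pgsize))
            continue;               /* not supported by the "hardware" */
        if (pgsize > size)
            continue;               /* would map more than requested */
        if (addr_merge & (pgsize - 1))
            continue;               /* iova or paddr not aligned */
        return pgsize;
    }
    return 0;
}

/*
 * Split [iova, iova + size) into supported pages. Returns the number of
 * pages used, or -1 if a page size of 0 is ever selected.
 */
static int toy_map_range(unsigned long bitmap, unsigned long iova,
                         unsigned long paddr, unsigned long size)
{
    int pages = 0;

    while (size) {
        unsigned long pgsize = toy_pick_pgsize(bitmap, iova, paddr, size);

        if (pgsize == 0) {
            /* Kernel code would put the WARN_ON()/BUG_ON() here. */
            fprintf(stderr, "no usable page size at iova 0x%lx\n", iova);
            return -1;
        }
        iova += pgsize;
        paddr += pgsize;
        size -= pgsize;
        pages++;
    }
    return pages;
}
```

With this bitmap, a 2 MiB + 4 KiB region at address 0 is split into one 2 MiB page and one 4 KiB page, while a 0x800-byte request fails because no supported page is that small.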

> Ok, let's discuss the semantics of ->unmap().
>
> There isn't a clear documentation of that API (we should probably add
> some kernel docs after we nail it down now), but judging from the
> existing users (mainly kvm) and drivers, it seems that iommu_map() and
> iommu_unmap() aren't symmetric: users rely on unmap() to return the
> actual size that was unmapped. IOMMU drivers, in turn, should check
> which page is mapped on 'iova', unmap it, and return its size.

Right, currently the map/unmap calls are not symmetric. But I think they
should be, to get clean semantics. Without this requirement, and with
multiple page sizes in use, the iommu code may have to unmap more address
space than requested. The user doesn't know what will be unmapped so it
has to make sure that no DMA is happening while unmap runs.

When we require the calls to be symmetric we can give a guarantee that
only the requested region is unmapped and allow DMA to the untouched
part of the address-space while unmap() is running.

So when the call sites do not follow this restriction we should convert
them mid-term.

> This way iommu_unmap() becomes very simple: it just iterates through
> the region, relying on iommu_ops->unmap() to return the sizes that
> were actually unmapped (very similar to how amd's iommu_unmap_page
> works today). This also means that iommu_ops->unmap() doesn't really
> need a size/order argument and we can remove it (after all drivers
> fully migrate..).

Yes, something like that. Probably the iommu_ops->unmap function should
be turned into an unmap_page function call which only takes an iova and
no size parameter. The iommu driver unmaps the page pointing to that
iova and returns the size of the page unmapped. This still allows the
simple implementation for the unmap-call.

This change is no requirement for this patch-set, but if we agree on it
this patch-set should keep that direction in mind.
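
The unmap_page idea can be sketched as below. This is a toy userspace model under stated assumptions, not kernel code: struct toy_domain, toy_unmap_page and toy_iommu_unmap are hypothetical names, and the "page table" only records the size of the mapping that starts at each 4 KiB slot, so the callback handles only an iova that is the start of a mapping.

```c
#include <stddef.h>

#define TOY_NSLOTS     16
#define TOY_PAGE_SHIFT 12

/*
 * Toy "page table": start_size[i] is the size in bytes of the mapping
 * that starts at slot i, or 0 if no mapping starts there.
 */
struct toy_domain {
    size_t start_size[TOY_NSLOTS];
};

/* Driver side: unmap the page starting at 'iova' and return its size. */
static size_t toy_unmap_page(struct toy_domain *d, unsigned long iova)
{
    unsigned long slot = iova >> TOY_PAGE_SHIFT;
    size_t size;

    if (slot >= TOY_NSLOTS)
        return 0;
    size = d->start_size[slot];
    d->start_size[slot] = 0;
    return size;
}

/*
 * Generic side: walk the region and trust the size each callback
 * returns. The return value is what was really unmapped, which may be
 * more than 'size' when a large page covers the requested range.
 */
static size_t toy_iommu_unmap(struct toy_domain *d, unsigned long iova,
                              size_t size)
{
    size_t unmapped = 0;

    while (unmapped < size) {
        size_t page = toy_unmap_page(d, iova);

        if (!page)
            break;          /* nothing mapped here, stop */
        iova += page;
        unmapped += page;
    }
    return unmapped;
}
```

If a 16 KiB page is mapped at 0 and the caller asks to unmap only 4 KiB, the whole 16 KiB goes away and the return value reports it. That is the asymmetric case in which no guarantee can be given about the untouched part of the address space.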

> The other approach which you suggest means symmetric iommu_map() and
> iommu_unmap(). It means adding a 'paddr' parameter to iommu_unmap(),
> which is easy, but maybe more concerning is the limitation that it
> incurs: users will now have to call iommu_unmap() exactly as they
> called iommu_map() beforehand. Not sure how well this will fly with
> the existing users (kvm ?) and whether we really want to enforce this
> (it doesn't mean drivers need to deal with page-size complexity. they
> are required to unmap a single page at a time, and iommu_unmap() will
> do the work for them).

It will work with KVM, that is no problem. We don't need to really
enforce the calls to be symmetric. But we can define that we only give
the guarantee about what will be unmapped when the calls are symmetric.
For example:

iommu_map(  0, 0x10);
iommu_unmap(0, 0x10); /* Guarantee that it will only unmap
 the range 0-0x10 */

whereas:

iommu_map(  0, 0x10);
iommu_unmap(0,   0x1000); /* Guarantees that 0-0x1000 is
 unmapped, but other undefined parts
 of the address space may be
 unmapped too, up to the whole
 address space */

The alternative is that we implement page-splitting in the iommu_unmap
function. But that introduces complexity I am not sure we really need.
KVM for example just unmaps the whole address-space on destruction. For
the generic dma_ops this is also not required because the dma_map*
functions already have the requirement to be symmetric.

>
> Another discussion:
>
> I think we better change iommu_ops->map() to directly take a 'size'
> (in bytes) instead of an 'order' (of pages). Most (all?) drivers just
> immediately do 'size = 0x1000UL << gfp_order', so this whole size ->
> order -> size back and forth seems redundant.
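
The size -> order -> size round trip mentioned above can be sketched like this. The helpers toy_get_order and toy_order_to_size are hypothetical stand-ins written for the example (a 4 KiB base page is assumed); they are not the kernel's own implementation.

```c
#include <stddef.h>

#define TOY_PAGE_SHIFT 12

/*
 * Smallest order such that (4 KiB << order) >= size.
 * 'size' must be non-zero.
 */
static int toy_get_order(size_t size)
{
    size_t pages = (size - 1) >> TOY_PAGE_SHIFT;
    int order = 0;

    while (pages) {
        order++;
        pages >>= 1;
    }
    return order;
}

/* What a driver does first thing: turn the order back into bytes. */
static size_t toy_order_to_size(int order)
{
    return (size_t)0x1000 << order;
}
```

Besides being redundant, the round trip is lossy for sizes that are not a power-of-two number of pages: 0x3000 bytes becomes order 2 and comes back as 0x4000 bytes. Passing the size in bytes avoids both the conversion and the rounding.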


Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Anthony Liguori

On 10/10/2011 10:24 AM, Avi Kivity wrote:
> On 10/10/2011 05:12 PM, Anthony Liguori wrote:
>> On 10/10/2011 10:00 AM, Avi Kivity wrote:
>>> On 10/10/2011 04:55 PM, Anthony Liguori wrote:
>>>> Hrm, sorry about that. In the future, it would be helpful to explicitly
>>>> withdraw a PULL request.
>>>>
>>>> Do you want me to revert?
>>>
>>> We'll send the revert together with the new patch.
>>>
>>>> FWIW, I think bumping the version is the right thing to do.
>>>
>>> Why?
>>
>> Because we still haven't fixed subsections. Juan's patches help but they can
>> still result in sending bad data to an older QEMU that won't handle it
>> properly.
>>
>> We need to figure out a proper fix for subsections, either Paolo's protocol
>> change or moving subsections out to a detected section.
>
> Ok. So bumping the version is only right if we don't fix subsections.

If we bump *any* version from 0.15 -> 1.0, then there's no point at all in
having a subsection.  If we break compatibility by using Paolo's new protocol,
or doing subsections as sections, then there's no point in making it a
subsection either.

Regards,

Anthony Liguori







Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue

2011-10-10 Thread Avi Kivity

On 10/10/2011 05:36 PM, Anthony Liguori wrote:
>> Ok. So bumping the version is only right if we don't fix subsections.
>
> If we bump *any* version from 0.15 -> 1.0, then there's no point at
> all in having a subsection.

Did we bump versions of relevant devices?

> If we break compatibility by using Paolo's new protocol, or doing
> subsections as sections, then there's no point in making it a
> subsection either.

These can be worked around.  For example if you migrate 0.15 to 1.0 you
start the destination with -old-subsection-format.  Even if you don't,
since subsections are rarely present, migration will succeed.
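
The trade-off argued over in this thread can be put into a toy model. This is a sketch under stated assumptions, not QEMU's migration code: struct toy_stream, toy_can_load, toy_send_bumped and toy_send_subsection are hypothetical names, and the model shows subsections behaving as intended, so it does not capture the breakage described earlier in the thread.

```c
#include <stdbool.h>

#define TOY_CPU_VERSION_OLD 12
#define TOY_CPU_VERSION_NEW 13

/* What the sender put on the wire for the CPU section. */
struct toy_stream {
    int  version_id;      /* section version written by the sender */
    bool has_subsection;  /* optional subsection appended? */
};

/*
 * Receiver side: a section loads only if its version is not newer than
 * the receiver's, and any subsection present is one the receiver knows.
 */
static bool toy_can_load(int receiver_max_version,
                         bool receiver_knows_subsection,
                         const struct toy_stream *s)
{
    if (s->version_id > receiver_max_version)
        return false;
    if (s->has_subsection && !receiver_knows_subsection)
        return false;
    return true;
}

/* Sender that bumped the version: every stream carries the new ID. */
static struct toy_stream toy_send_bumped(void)
{
    struct toy_stream s = { TOY_CPU_VERSION_NEW, false };
    return s;
}

/*
 * Sender that uses a subsection: the version stays the same and the
 * extra state is sent only when the guest actually uses the feature.
 */
static struct toy_stream toy_send_subsection(bool feature_in_use)
{
    struct toy_stream s = { TOY_CPU_VERSION_OLD, feature_in_use };
    return s;
}
```

In this model an old receiver rejects every stream from the version-bumping sender, but accepts streams from the subsection sender as long as the feature is unused. That is the sense in which migration succeeds when subsections are rarely present.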


--
error compiling committee.c: too many arguments to function



[PATCH] Fix include dependency for mmu_notifier

2011-10-10 Thread Eric B Munson
The kvm_host struct can include an mmu_notifier struct but mmu_notifier.h is
not included directly.

Signed-off-by: Eric B Munson <emun...@mgebm.net>
---
 include/linux/kvm_host.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index eabb21a..d2c79a9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -14,6 +14,7 @@
 #include <linux/signal.h>
 #include <linux/sched.h>
 #include <linux/mm.h>
+#include <linux/mmu_notifier.h>
 #include <linux/preempt.h>
 #include <linux/msi.h>
 #include <linux/slab.h>
-- 
1.7.4.1



Re: [PATCH] virtio-9p: fix QEMU build break

2011-10-10 Thread Aneesh Kumar K.V
On Mon, 10 Oct 2011 18:30:28 +0800, Zhi Yong Wu <wu...@linux.vnet.ibm.com> wrote:
> qemu build break due to the redefinition of struct file_handle. My
> qemu.git/HEAD is 8acbc9b21d757a6be4f8492e547b8159703a0547
>
> Below is the log:
> [root@f15 qemu]# make
>   CC    qapi-generated/qga-qapi-types.o
>   LINK  qemu-ga
>   CC    libhw64/9pfs/virtio-9p-handle.o
> /home/zwu/work/virt/qemu/hw/9pfs/virtio-9p-handle.c:31:8: error: redefinition of struct file_handle
> /usr/include/bits/fcntl.h:254:8: note: originally defined here
> make[1]: *** [9pfs/virtio-9p-handle.o] Error 1
> make: *** [subdir-libhw64] Error 2
>
> [root@f15 qemu]# rpm -qf /usr/include/bits/fcntl.h
> glibc-headers-2.13.90-9.x86_64

Is this a backported glibc? On my Ubuntu system glibc 2.13 doesn't
provide struct file_handle. I also checked the glibc repo at
http://repo.or.cz/w/glibc.git. The commit introducing struct file_handle
is

$ git describe --contains 158648c0bdda281e252a27c0200dd0ea6f4e0215
glibc-2.14~200

> Signed-off-by: Zhi Yong Wu <wu...@linux.vnet.ibm.com>
> ---
>  hw/9pfs/virtio-9p-handle.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
> index 5c8b5ed..5b3a867 100644
> --- a/hw/9pfs/virtio-9p-handle.c
> +++ b/hw/9pfs/virtio-9p-handle.c
> @@ -27,7 +27,7 @@ struct handle_data {
>      int handle_bytes;
>  };
>
> -#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 14
> +#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 13
>  struct file_handle {
>          unsigned int handle_bytes;
>          int handle_type;
> --

-aneesh


Re: [PATCH] virtio-9p: fix QEMU build break

2011-10-10 Thread Aneesh Kumar K.V
On Mon, 10 Oct 2011 22:05:21 +0530, Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com> wrote:
> On Mon, 10 Oct 2011 18:30:28 +0800, Zhi Yong Wu <wu...@linux.vnet.ibm.com> wrote:
> > qemu build break due to the redefinition of struct file_handle. My
> > qemu.git/HEAD is 8acbc9b21d757a6be4f8492e547b8159703a0547
> >
> > Below is the log:
> > [root@f15 qemu]# make
> >   CC    qapi-generated/qga-qapi-types.o
> >   LINK  qemu-ga
> >   CC    libhw64/9pfs/virtio-9p-handle.o
> > /home/zwu/work/virt/qemu/hw/9pfs/virtio-9p-handle.c:31:8: error: redefinition of struct file_handle
> > /usr/include/bits/fcntl.h:254:8: note: originally defined here
> > make[1]: *** [9pfs/virtio-9p-handle.o] Error 1
> > make: *** [subdir-libhw64] Error 2
> >
> > [root@f15 qemu]# rpm -qf /usr/include/bits/fcntl.h
> > glibc-headers-2.13.90-9.x86_64
>
> Is this a backported glibc? On my Ubuntu system glibc 2.13 doesn't
> provide struct file_handle. I also checked the glibc repo at
> http://repo.or.cz/w/glibc.git. The commit introducing struct file_handle
> is
>
> $ git describe --contains 158648c0bdda281e252a27c0200dd0ea6f4e0215
> glibc-2.14~200

How about the patch below? It means that the handle driver will only work
with the latest glibc. Even if I have the latest kernel, with an older glibc
the handle fs driver backend will be disabled.
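
Why the version guard misfires can be shown with a small sketch. The function names are hypothetical, and the sketch assumes (as the build log above suggests) that the glibc 2.13.90 snapshot still reports a minor version of 13 while its headers already define struct file_handle.

```c
#include <stdbool.h>

/*
 * The existing guard in virtio-9p-handle.c: provide a private
 * struct file_handle when the macros say "glibc older than 2.14".
 */
static bool toy_guard_defines_fallback(int glibc_major, int glibc_minor)
{
    return glibc_major <= 2 && glibc_minor < 14;
}

/*
 * The build breaks when the private definition is emitted although the
 * system header already provides the struct.
 */
static bool toy_redefinition(int glibc_major, int glibc_minor,
                             bool header_has_struct)
{
    return header_has_struct &&
           toy_guard_defines_fallback(glibc_major, glibc_minor);
}
```

A release 2.13 without the struct and a release 2.14 with it both build, but a 2.13.x snapshot that already ships the struct does not. Probing for the feature at configure time sidesteps the question of which version number a given header set reports.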

diff --git a/configure b/configure
index 24b8df4..0216c53 100755
--- a/configure
+++ b/configure
@@ -2551,6 +2551,18 @@ EOF
 fi
 
 ##
+# check if we have open_by_handle_at
+
+open_by_handle_at=no
+cat > $TMPC << EOF
+#include <fcntl.h>
+int main(void) { struct file_handle *fh; open_by_handle_at(0, fh, 0); }
+EOF
+if compile_prog "" "" ; then
+    open_by_handle_at=yes
+fi
+
+##
 # End of CC checks
 # After here, no more $cc or $ld runs
 
@@ -3029,6 +3041,10 @@ if test "$ucontext_coroutine" = "yes" ; then
   echo "CONFIG_UCONTEXT_COROUTINE=y" >> $config_host_mak
 fi
 
+if test "$open_by_handle_at" = "yes" ; then
+  echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak
+fi
+
 # USB host support
 case $usb in
 linux)
diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
index 68e1d9b..bd73d31 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/virtio-9p-handle.c
@@ -30,13 +30,24 @@ struct handle_data {
     int handle_bytes;
 };
 
-#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 14
+#ifdef CONFIG_OPEN_BY_HANDLE
+static inline int name_to_handle(int dirfd, const char *name,
+                                 struct file_handle *fh, int *mnt_id, int flags)
+{
+    return name_to_handle_at(dirfd, name, fh, mnt_id, flags);
+}
+
+static inline int open_by_handle(int mountfd, const char *fh, int flags)
+{
+    return open_by_handle_at(mountfd, fh, flags);
+}
+#else
+
 struct file_handle {
-        unsigned int handle_bytes;
-        int handle_type;
-        unsigned char handle[0];
+    unsigned int handle_bytes;
+    int handle_type;
+    unsigned char handle[0];
 };
-#endif
 
 #ifndef AT_EMPTY_PATH
 #define AT_EMPTY_PATH   0x1000  /* Allow empty relative pathname */
@@ -45,28 +56,6 @@ struct file_handle {
 #define O_PATH    01000
 #endif
 
-#ifndef __NR_name_to_handle_at
-#if defined(__i386__)
-#define __NR_name_to_handle_at  341
-#define __NR_open_by_handle_at  342
-#elif defined(__x86_64__)
-#define __NR_name_to_handle_at  303
-#define __NR_open_by_handle_at  304
-#endif
-#endif
-
-#ifdef __NR_name_to_handle_at
-static inline int name_to_handle(int dirfd, const char *name,
-                                 struct file_handle *fh, int *mnt_id, int flags)
-{
-    return syscall(__NR_name_to_handle_at, dirfd, name, fh, mnt_id, flags);
-}
-
-static inline int open_by_handle(int mountfd, const char *fh, int flags)
-{
-    return syscall(__NR_open_by_handle_at, mountfd, fh, flags);
-}
-#else
 static inline int name_to_handle(int dirfd, const char *name,
                                  struct file_handle *fh, int *mnt_id, int flags)
 {


Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Ohad Ben-Cohen
Hi Joerg,

On Mon, Oct 10, 2011 at 5:36 PM, Roedel, Joerg joerg.roe...@amd.com wrote:
 The master branch is best to base your patches on for generic work.

Oh, great. thanks.

 It can happen when there is a bug somewhere :)

Hmm, bug ? ;)

Ok, I'll add a BUG_ON :)

 Yes, somthing like that. Probably the iommu_ops-unmap function should
 be turned into a unmap_page function call which only takes an iova and
 no size parameter. The iommu-driver unmaps the page pointing to that
 iova and returns the size of the page unmapped. This still allows the
 simple implementation for the unmap-call.

Yes, exactly. It will take some time to migrate all drivers (today we
have 4 drivers, each of which implements slightly different
->unmap() semantics), but at least let's not accept any new driver
that doesn't adhere to this, otherwise it's going to be even harder
for the API to evolve.

 This change is no requirement for this patch-set, but if we agree on it
 this patch-set should keep that direction in mind.

Definitely, thanks.

 We don't need to really
 enforce the calls to be symmetric. But we can define that we only give
 the guarantee about what will be unmapped when the calls are symmetric.

Sounds good to me. I'll add this to the kernel doc patch (which I'll
submit after this patch set materializes), and when/if we move to
symmetric only, we will update it.

 The alternative is that we implement page-splitting in the iommu_unmap
 function. But that introduces complexity I am not sure we really need.

Yeah, me neither.

 Yes, this get_order thing should be changed to size long-term.

Good. That should be a simple change, I'll do it after this patch set.

Thanks,
Ohad.
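
For illustration only, the unmap_page idea discussed in this message could look roughly like the user-space sketch below. The names unmap_page_fn, toy_unmap_page and sketch_unmap are hypothetical and are not taken from any posted patch; the toy driver simply pretends that a fixed layout of 4 KiB and 2 MiB pages is mapped.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical driver callback: unmap the single page that starts at
 * 'iova' and return its size in bytes, or 0 if nothing is mapped there.
 */
typedef size_t (*unmap_page_fn)(unsigned long iova);

/*
 * Toy "driver" (an assumption for this sketch): [0, 0x200000) is mapped
 * with 4 KiB pages and [0x200000, 0x400000) with one 2 MiB page.
 */
static size_t toy_unmap_page(unsigned long iova)
{
    if (iova < 0x200000UL)
        return 0x1000;
    if (iova < 0x400000UL)
        return 0x200000;
    return 0;
}

/*
 * Core-side loop: call unmap_page until at least 'size' bytes are
 * covered. Returns the number of bytes actually unmapped. This can
 * exceed 'size' when the caller did not unmap symmetrically and the
 * last page is larger than the remainder.
 */
static size_t sketch_unmap(unmap_page_fn unmap_page,
                           unsigned long iova, size_t size)
{
    size_t unmapped = 0;

    assert(unmap_page != NULL);

    while (unmapped < size) {
        size_t pg = unmap_page(iova + unmapped);

        if (pg == 0)    /* nothing mapped here, stop early */
            break;
        unmapped += pg;
    }
    return unmapped;
}
```

With this toy layout, sketch_unmap(toy_unmap_page, 0x1FF000, 0x2000) tears down one 4 KiB page and then the whole 2 MiB page, which is the kind of case where a guarantee can only be given for symmetric calls.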


Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks

2011-10-10 Thread Jeremy Fitzhardinge
On 10/10/2011 07:01 AM, Stephan Diestelhorst wrote:
 On Monday 10 October 2011, 07:00:50 Stephan Diestelhorst wrote:
 On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote:
 On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote:
 On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote:
 Which certainly should *work*, but from a conceptual standpoint, isn't
 it just *much* nicer to say we actually know *exactly* what the upper
 bits were.
 Well, we really do NOT want atomicity here. What we really rather want
 is sequentiality: free the lock, make the update visible, and THEN
 check if someone has gone sleeping on it.

 Atomicity only conveniently enforces that the three do not happen in a
 different order (with the store becoming visible after the checking
 load).

 This does not have to be atomic, since spurious wakeups are not a
 problem, in particular not with the FIFO-ness of ticket locks.

 For that the fence, additional atomic etc. would be IMHO much cleaner
 than the crazy overflow logic.
 All things being equal I'd prefer lock-xadd just because it's easier to
 analyze the concurrency for, crazy overflow tests or no.  But if
 add+mfence turned out to be a performance win, then that would obviously
 tip the scales.

 However, it looks like locked xadd also has better performance:  on
 my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower
 than locked xadd, so that pretty much settles it unless you think
 there'd be a dramatic difference on an AMD system.
 Indeed, the fences are usually slower than locked RMWs, in particular,
 if you do not need to add an instruction. I originally missed that
 amazing stunt the GCC pulled off with replacing the branch with carry
 flag magic. It seems that two twisted minds have found each other
 here :)

 One of my concerns was adding a branch in here... so that is settled,
 and if everybody else feels like this is easier to reason about...
 go ahead :) (I'll keep my itch to myself then.)
 Just that I can't... if performance is a concern, adding the LOCK
 prefix to the addb outperforms the xadd significantly:

Hm, yes.  So using the lock prefix on add instead of the mfence?  Hm.

J


Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks

2011-10-10 Thread Jeremy Fitzhardinge
On 10/10/2011 12:32 AM, Ingo Molnar wrote:
 * Jeremy Fitzhardinge jer...@goop.org wrote:

 On 10/06/2011 10:40 AM, Jeremy Fitzhardinge wrote:
 However, it looks like locked xadd also has better performance:  on
 my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower
 than locked xadd, so that pretty much settles it unless you think
 there'd be a dramatic difference on an AMD system.
 Konrad measures add+mfence is about 65% slower on AMD Phenom as well.
 xadd also results in smaller/tighter code, right?

Not particularly, mostly because of the overflow-into-the-high-part
compensation.  But it's only a couple of extra instructions, and no
conditionals, so I don't think it would have any concrete effect.

But, as Stephan points out, perhaps locked add is preferable to locked
xadd, since it also has the same barrier as mfence but has
(significantly!) better performance than either mfence or locked xadd...

J
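
As a rough illustration of the two unlock orderings compared in this thread, here is a minimal C11 sketch. It is not the code from the patch series: struct ticketlock, the waiters flag and both function names are assumptions made up for the example. On x86 a seq_cst fence is typically emitted as mfence (or an equivalent locked instruction), and a seq_cst fetch-add whose result is unused is typically emitted as a LOCK-prefixed add, but that depends on the compiler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock layout for this sketch only. */
struct ticketlock {
    _Atomic uint8_t head;     /* ticket currently being served */
    _Atomic uint8_t waiters;  /* nonzero if a waiter went to sleep */
};

/*
 * Variant A: plain increment of head, then a full fence, then the
 * check. Only the lock holder writes head here, so the increment
 * itself need not be an atomic read-modify-write; the fence keeps the
 * store ahead of the following load.
 */
static bool unlock_add_fence(struct ticketlock *l)
{
    uint8_t h;

    assert(l != NULL);
    h = atomic_load_explicit(&l->head, memory_order_relaxed);
    atomic_store_explicit(&l->head, (uint8_t)(h + 1), memory_order_release);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&l->waiters, memory_order_relaxed) != 0;
}

/*
 * Variant B: a locked read-modify-write releases the lock and acts as
 * the barrier at the same time, so no separate fence is needed before
 * checking for sleepers.
 */
static bool unlock_locked_add(struct ticketlock *l)
{
    assert(l != NULL);
    atomic_fetch_add_explicit(&l->head, 1, memory_order_seq_cst);
    return atomic_load_explicit(&l->waiters, memory_order_seq_cst) != 0;
}
```

Both variants return true when the caller should go on to wake a sleeping waiter; a spurious wakeup is harmless, which is why the sequence only needs ordering and not atomicity across all three steps.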


Fwd: qemudParsePCIDeviceStrs warning

2011-10-10 Thread Krisztián Bánhidy
Hi,
I have some strange entries in syslog, about which I couldn't really find
any information on Google. One forum said to disable AppArmor, but I
don't have it running on this system:
root@muramasa:~# grep libvirt /var/log/syslog
Oct 10 12:03:55 muramasa libvirtd: 12:03:55.308: warning :
qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu
probably failed
Oct 10 12:03:55 muramasa libvirtd: 12:03:55.905: warning :
qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu
probably failed
Oct 10 12:03:56 muramasa libvirtd: 12:03:56.545: warning :
qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu
probably failed
Oct 10 12:03:57 muramasa libvirtd: 12:03:57.292: warning :
qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu
probably failed
Oct 10 12:03:57 muramasa libvirtd: 12:03:57.928: warning :
qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu
probably failed

some information about hw:
CPU: AMD Phenom(tm) II X6 1090T Processor
kvm version: QEMU PC emulator version 0.12.5 (qemu-kvm-0.12.5)
on ubuntu squeeze: 2.6.32-5-amd64
arch: x86-64
6 guests with debian squeeze 64bit

guests start after a reboot so not sure why I see these in syslog.
In messages I see one extra line but I am guessing this is from reboot:

Oct 10 10:47:14 muramasa libvirtd: 10:47:14.135: warning :
qemudDispatchSignalEvent:396 : Shutting down on signal 15

Thanks for the info in advance.
Krisztian


Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Ohad Ben-Cohen
 On Mon, Oct 10, 2011 at 5:36 PM, Roedel, Joerg joerg.roe...@amd.com wrote:
 The master branch is best to base your patches on for generic work.

It looks like the master branch is missing something like this:

From acb316aa4bcaf383e8cb1580e30c8635e0a34369 Mon Sep 17 00:00:00 2001
From: Ohad Ben-Cohen o...@wizery.com
Date: Mon, 10 Oct 2011 23:55:51 +0200
Subject: [PATCH] iommu/core: fix build issue

Fix this:

drivers/iommu/iommu.c: In function 'iommu_commit':
drivers/iommu/iommu.c:291: error: 'iommu_ops' undeclared (first use in
this function)

Signed-off-by: Ohad Ben-Cohen o...@wizery.com
---
 drivers/iommu/iommu.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 909b0d2..a5131f1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -288,7 +288,7 @@ EXPORT_SYMBOL_GPL(iommu_unmap);

 void iommu_commit(struct iommu_domain *domain)
 {
-   if (iommu_ops->commit)
-       iommu_ops->commit(domain);
+   if (domain->ops->commit)
+       domain->ops->commit(domain);
 }
 EXPORT_SYMBOL_GPL(iommu_commit);
-- 
1.7.4.1


Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware

2011-10-10 Thread Ohad Ben-Cohen
Hi Joerg,

On Mon, Oct 10, 2011 at 5:36 PM, Roedel, Joerg joerg.roe...@amd.com wrote:
 The master branch is best to base your patches on for generic work.

Done. I've revised the patches and attached the main one
below; please tell me if it looks ok, and then I'll resubmit the
entire patch set.

Thanks,
Ohad.

commit bf1d730b5f4f7631becfcd4be52693d85bfea36b
Author: Ohad Ben-Cohen o...@wizery.com
Date:   Mon Oct 10 23:50:55 2011 +0200

iommu/core: split mapping to page sizes as supported by the hardware

When mapping a memory region, split it to page sizes as supported
by the iommu hardware. Always prefer bigger pages, when possible,
in order to reduce the TLB pressure.

The logic to do that is now added to the IOMMU core, so neither the iommu
drivers themselves nor users of the IOMMU API have to duplicate it.

This allows a more lenient granularity of mappings; traditionally the
IOMMU API took 'order' (of a page) as a mapping size, and directly let
the low level iommu drivers handle the mapping, but now that the IOMMU
core can split arbitrary memory regions into pages, we can remove this
limitation, so users don't have to split those regions by themselves.

Currently the supported page sizes are advertised once and they then
remain static. That works well for OMAP and MSM but it would probably
not fly well with intel's hardware, where the page size capabilities
seem to have the potential to be different between several DMA
remapping devices.

register_iommu() currently sets a default pgsize behavior, so we can convert
the IOMMU drivers in subsequent patches, and after all the drivers
are converted, register_iommu will be changed (and the temporary
default settings will be removed).

Mainline users of the IOMMU API (kvm and omap-iovmm) are adapted
to send the mapping size in bytes instead of in page order.

Many thanks to Joerg Roedel joerg.roe...@amd.com for significant review!

Signed-off-by: Ohad Ben-Cohen o...@wizery.com
Cc: David Brown dav...@codeaurora.org
Cc: David Woodhouse dw...@infradead.org
Cc: Joerg Roedel joerg.roe...@amd.com
Cc: Stepan Moskovchenko step...@codeaurora.org
Cc: KyongHo Cho pullip@samsung.com
Cc: Hiroshi DOYU hd...@nvidia.com
Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
Cc: kvm@vger.kernel.org

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 73778b7..909b0d2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -16,6 +16,8 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */

+#define pr_fmt(fmt)    "%s: " fmt, __func__
+
 #include <linux/device.h>
 #include <linux/kernel.h>
 #include <linux/bug.h>
@@ -47,6 +49,19 @@ int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops)
    if (bus->iommu_ops != NULL)
        return -EBUSY;

+   /*
+* Set the default pgsize values, which retain the existing
+* IOMMU API behavior: drivers will be called to map
+* regions that are sized/aligned to order of 4KiB pages.
+*
+* This will be removed once all drivers are migrated.
+*/
+   if (!ops->pgsize_bitmap)
+       ops->pgsize_bitmap = ~0xFFFUL;
+
+   /* find out the minimum page size only once */
+   ops->min_pagesz = 1 << __ffs(ops->pgsize_bitmap);
+
    bus->iommu_ops = ops;

/* Do IOMMU specific setup for this bus-type */
@@ -157,33 +172,117 @@ int iommu_domain_has_cap(struct iommu_domain *domain,
 EXPORT_SYMBOL_GPL(iommu_domain_has_cap);

 int iommu_map(struct iommu_domain *domain, unsigned long iova,
- phys_addr_t paddr, int gfp_order, int prot)
+ phys_addr_t paddr, int size, int prot)
 {
-   size_t size;
+   unsigned long orig_iova = iova;
+   int ret = 0, orig_size = size;

    if (unlikely(domain->ops->map == NULL))
        return -ENODEV;

-   size = PAGE_SIZE << gfp_order;
+   /*
+    * both the virtual address and the physical one, as well as
+    * the size of the mapping, must be aligned (at least) to the
+    * size of the smallest page supported by the hardware
+    */
+   if (!IS_ALIGNED(iova | paddr | size, domain->ops->min_pagesz)) {
+       pr_err("unaligned: iova 0x%lx pa 0x%lx size 0x%x min_pagesz "
+           "0x%x\n", iova, (unsigned long)paddr,
+           size, domain->ops->min_pagesz);
+       return -EINVAL;
+   }
+
+   pr_debug("map: iova 0x%lx pa 0x%lx size 0x%x\n", iova,
+           (unsigned long)paddr, size);
+
+   while (size) {
+   unsigned long pgsize, addr_merge = iova | paddr;
+   unsigned int pgsize_idx;
+
+   /* Max page size that still fits into 'size' */
+   pgsize_idx = __fls(size);
+
+   /* need to consider alignment requirements ? */
+   if 

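The diff above is cut off in this archive, so what follows is only a stand-alone sketch of the selection rule the commit message describes: prefer the biggest supported page that fits into the remaining size and matches the alignment of both addresses. It is not the patch's code. pick_pgsize, split_region, lowest_bit and highest_bit are hypothetical user-space helpers, with the last two standing in for the kernel's __ffs and __fls.

```c
#include <assert.h>
#include <limits.h>

/* Index of the lowest set bit; x must be nonzero. */
static unsigned int lowest_bit(unsigned long x)
{
    unsigned int i = 0;

    assert(x != 0);
    while (!(x & 1UL)) {
        x >>= 1;
        i++;
    }
    return i;
}

/* Index of the highest set bit; x must be nonzero. */
static unsigned int highest_bit(unsigned long x)
{
    unsigned int i = 0;

    assert(x != 0);
    while (x >>= 1)
        i++;
    return i;
}

/*
 * Pick the largest page size from 'pgsize_bitmap' that fits into 'size'
 * and to which both 'iova' and 'paddr' are aligned. Returns 0 if the
 * hardware supports no such page size.
 */
static unsigned long pick_pgsize(unsigned long pgsize_bitmap,
                                 unsigned long iova, unsigned long paddr,
                                 unsigned long size)
{
    unsigned long addr_merge = iova | paddr;
    unsigned long mask;
    unsigned int idx;

    assert(size != 0);

    /* largest power of two that still fits into 'size' */
    idx = highest_bit(size);

    /* cap it by the alignment shared by both addresses */
    if (addr_merge) {
        unsigned int align_idx = lowest_bit(addr_merge);

        if (align_idx < idx)
            idx = align_idx;
    }

    /* mask of every page size up to and including 1 << idx */
    if (idx + 1 >= sizeof(unsigned long) * CHAR_BIT)
        mask = ~0UL;
    else
        mask = (1UL << (idx + 1)) - 1;

    /* drop the sizes the hardware does not support */
    mask &= pgsize_bitmap;
    if (!mask)
        return 0;

    return 1UL << highest_bit(mask);
}

/* Walk a region page by page and return how many pages were needed. */
static unsigned int split_region(unsigned long pgsize_bitmap,
                                 unsigned long iova, unsigned long paddr,
                                 unsigned long size)
{
    unsigned int pages = 0;

    while (size) {
        unsigned long pg = pick_pgsize(pgsize_bitmap, iova, paddr, size);

        if (!pg)    /* region not aligned to any supported page size */
            break;
        iova += pg;
        paddr += pg;
        size -= pg;
        pages++;
    }
    return pages;
}
```

For example, with a made-up bitmap of 4 KiB, 64 KiB and 1 MiB pages (0x111000), a 0x110000-byte region whose addresses are both 1 MiB aligned is covered by one 1 MiB page followed by one 64 KiB page.
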
Re: kernel BUG at include/linux/kvm_host.h:603!

2011-10-10 Thread Alexander Graf
Hi Jörg,

On 07.10.2011, at 23:10, Jörg Sommer wrote:

 Hi,
 
 I've got this backtrace:
 
 [130902.709711] [ cut here ]
 [130902.709747] kernel BUG at include/linux/kvm_host.h:603!

Ouch. This means that preemption is broken in KVM for PPC. To quickly get 
things working on your side, please recompile your kernel with 
CONFIG_PREEMPT_NONE. I'll take a look at fixing it for real ASAP.


Alex

--
To unsubscribe from this list: send the line unsubscribe kvm-ppc in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html