Re: [PATCH] qemu: Fix inject-nmi
On 09/26/2011 04:21 PM, Avi Kivity wrote:
> On 09/25/2011 08:22 PM, Jan Kiszka wrote:
>> On 2011-09-25 16:07, Avi Kivity wrote:
>>> On 09/23/2011 12:31 PM, Lai Jiangshan wrote:
>>>>> Moreover: wrong indentation.
>>>>>
>>>>> You know that this won't work for qemu-kvm with in-kernel irqchip?
>>>>> You may want to provide a patch for that tree, emulating the
>>>>> unavailable LINT1 injection via testing the APIC configuration and
>>>>> then raising an NMI as before if it is accepted.
>>>>
>>>> It works in my box but the NMI is not injected through the in-kernel
>>>> irqchip, I will implement it as you suggested.
>>>
>>> Somewhat hacky; isn't it better to test LINT1 in the kernel (and
>>> redefine the KVM_NMI ioctl as toggle LINT1)?
>>
>> KVM_NMI is required for user space IRQ chip as well. We could define
>> KVM_NMI as edging the core NMI input if !irqchip_in_kernel, and
>> toggling LINT1 otherwise. Hardly nice though.
>
> The current KVM_NMI with irqchip_in_kernel is not meaningful, since it
> doesn't obey the rules of any NMI source.
>
>> Introducing some KVM_SET_LINT1 is an option though. But emulating it
>> for the NMI button on older kernels sounds worthwhile nevertheless.
>
> Perhaps this is the best option to avoid confusion.

(add cc: seab...@seabios.org)

Hi, All,

When I was implementing KVM_SET_LINT1, I found that many places in the
qemu-kvm code needed to be changed, and it became nasty. And as Avi
said, KVM_NMI with irqchip_in_kernel is not meaningful, so once
KVM_SET_LINT1 exists, KVM_NMI is no longer used when irqchip_in_kernel;
it is dead.

So instead we redefine KVM_NMI with a more proper meaning: when
irqchip_in_kernel, it is the kernel's (KVM's) responsibility to simulate
the NMI injection and set LINT1. When !irqchip_in_kernel, it is
userspace's responsibility.

This results in a more realistic simulation and in simpler code, it does
not need a new ioctl interface, and it makes use of the existing KVM_NMI.

Thanks,
Lai
--
To unsubscribe from this list: send the line unsubscribe kvm in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] kernel/kvm: fix improper nmi emulation (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is maskied in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, KVM_NMI ioctl is handled as follows.

- When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
  request of triggering LINT1 on the processor. LINT1 is emulated in
  in-kernel irqchip.

- When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
  request of injecting NMI to the processor. This assumes LINT1 is
  already emulated in userland.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Tested-by: Lai Jiangshan la...@cn.fujitsu.com
---
 arch/x86/kvm/irq.h   |    1 +
 arch/x86/kvm/lapic.c |    8 ++++++++
 arch/x86/kvm/x86.c   |   14 ++++----------
 3 files changed, 13 insertions(+), 10 deletions(-)

Index: linux/arch/x86/kvm/irq.h
===================================================================
--- linux.orig/arch/x86/kvm/irq.h
+++ linux/arch/x86/kvm/irq.h
@@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
 void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
 void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
 void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
Index: linux/arch/x86/kvm/lapic.c
===================================================================
--- linux.orig/arch/x86/kvm/lapic.c
+++ linux/arch/x86/kvm/lapic.c
@@ -1039,6 +1039,14 @@ void kvm_apic_nmi_wd_deliver(struct kvm_
 		kvm_apic_local_deliver(apic, APIC_LVT0);
 }
 
+void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
+{
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	if (apic)
+		kvm_apic_local_deliver(apic, APIC_LVT1);
+}
+
 static struct kvm_timer_ops lapic_timer_ops = {
 	.is_periodic = lapic_is_periodic,
 };
Index: linux/arch/x86/kvm/x86.c
===================================================================
--- linux.orig/arch/x86/kvm/x86.c
+++ linux/arch/x86/kvm/x86.c
@@ -2729,13 +2729,6 @@ static int kvm_vcpu_ioctl_interrupt(stru
 	return 0;
 }
 
-static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
-{
-	kvm_inject_nmi(vcpu);
-
-	return 0;
-}
-
 static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
 					   struct kvm_tpr_access_ctl *tac)
 {
@@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 		break;
 	}
 	case KVM_NMI: {
-		r = kvm_vcpu_ioctl_nmi(vcpu);
-		if (r)
-			goto out;
+		if (irqchip_in_kernel(vcpu->kvm))
+			kvm_apic_lint1_deliver(vcpu);
+		else
+			kvm_inject_nmi(vcpu);
 		r = 0;
 		break;
 	}
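To make the behavioural difference concrete, here is a minimal userspace sketch (not kernel code) of what "deliver through LINT1" means compared to a blind NMI injection. The `toy_vcpu` struct and the `toy_*` helpers are hypothetical names invented for illustration; only the LVT field layout (mask in bit 16, delivery mode in bits 10:8, NMI encoded as 100b) is taken from the local APIC architecture. Other delivery modes are simply not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* LVT register fields of the x86 local APIC. */
#define LVT_MASKED            (1u << 16)
#define LVT_DELIVERY_MODE(v)  (((v) >> 8) & 0x7u)
#define DELIVERY_MODE_NMI     0x4u

/* Hypothetical toy vCPU: only the LINT1 LVT entry and an NMI counter. */
struct toy_vcpu {
    uint32_t lvt_lint1;
    unsigned nmi_count;
};

/* Old behaviour: the NMI reaches the vCPU unconditionally. */
static void toy_inject_nmi(struct toy_vcpu *v)
{
    assert(v != NULL);
    v->nmi_count++;
}

/*
 * New behaviour: the event goes through the LINT1 LVT entry first.
 * Returns true if an NMI was actually delivered to the vCPU.
 */
static bool toy_lint1_deliver(struct toy_vcpu *v)
{
    assert(v != NULL);
    if (v->lvt_lint1 & LVT_MASKED)
        return false;                 /* masked: the event is dropped */
    if (LVT_DELIVERY_MODE(v->lvt_lint1) != DELIVERY_MODE_NMI)
        return false;                 /* other modes are not modelled here */
    toy_inject_nmi(v);
    return true;
}
```

With LINT1 unmasked on the boot CPU and masked on the others, only the boot CPU sees the NMI, which is the situation kdump relies on according to the commit message.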
[PATCH] qemu-kvm: fix improper nmi emulation (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
button event happens. This doesn't properly emulate real hardware on
which NMI button event triggers LINT1. Because of this, NMI is sent to
the processor even when LINT1 is maskied in LVT. For example, this
causes the problem that kdump initiated by NMI sometimes doesn't work
on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.

With this patch, inject-nmi request is handled as follows.

- When in-kernel irqchip is disabled, inject LINT1 instead of NMI
  interrupt.

- When in-kernel irqchip is enabled, send nmi event to kernel as the
  current code does. LINT1 should be emulated in kernel.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Tested-by: Lai Jiangshan la...@cn.fujitsu.com
---
 hw/apic.c |   16 ++++++++++++++++
 hw/apic.h |    1 +
 monitor.c |    5 ++---
 3 files changed, 19 insertions(+), 3 deletions(-)

Index: qemu-kvm/hw/apic.c
===================================================================
--- qemu-kvm.orig/hw/apic.c
+++ qemu-kvm/hw/apic.c
@@ -205,6 +205,22 @@ void apic_deliver_pic_intr(DeviceState *
     }
 }
 
+void apic_deliver_nmi(CPUState *env)
+{
+    APICState *apic;
+
+    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
+        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+	return;
+    }
+
+    apic = DO_UPCAST(APICState, busdev.qdev, env->apic_state);
+    if (!apic)
+        cpu_interrupt(env, CPU_INTERRUPT_NMI);
+    else
+        apic_local_deliver(apic, APIC_LVT_LINT1);
+}
+
 #define foreach_apic(apic, deliver_bitmask, code) \
 {\
     int __i, __j, __mask;\
Index: qemu-kvm/hw/apic.h
===================================================================
--- qemu-kvm.orig/hw/apic.h
+++ qemu-kvm/hw/apic.h
@@ -10,6 +10,7 @@ void apic_deliver_irq(uint8_t dest, uint
                              uint8_t trigger_mode);
 int apic_accept_pic_intr(DeviceState *s);
 void apic_deliver_pic_intr(DeviceState *s, int level);
+void apic_deliver_nmi(CPUState *env);
 int apic_get_interrupt(DeviceState *s);
 void apic_reset_irq_delivered(void);
 int apic_get_irq_delivered(void);
Index: qemu-kvm/monitor.c
===================================================================
--- qemu-kvm.orig/monitor.c
+++ qemu-kvm/monitor.c
@@ -2615,9 +2615,8 @@ static int do_inject_nmi(Monitor *mon, c
 {
     CPUState *env;
 
-    for (env = first_cpu; env != NULL; env = env->next_cpu) {
-        cpu_interrupt(env, CPU_INTERRUPT_NMI);
-    }
+    for (env = first_cpu; env != NULL; env = env->next_cpu)
+	apic_deliver_nmi(env);
 
     return 0;
 }
[PATCH 1/2] seabios: Add Local APIC NMI Structure to ACPI MADT (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

ACPI NMI Structure describes LINT pin (LINT0 or LINT1) information to
which NMI is connected, and it is needed by OS to initialize local APIC.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Reviewed-by: Lai Jiangshan la...@cn.fujitsu.com
---
 src/acpi.c |   22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

Index: seabios/src/acpi.c
===================================================================
--- seabios.orig/src/acpi.c
+++ seabios/src/acpi.c
@@ -134,6 +134,14 @@ struct madt_intsrcovr {
     u16 flags;
 } PACKED;
 
+struct madt_local_nmi {
+    ACPI_SUB_HEADER_DEF
+    u8  processor_id;           /* ACPI processor id */
+    u16 flags;                  /* MPS INTI flags */
+    u8  lint;                   /* Local APIC LINT# */
+} PACKED;
+
+
 /*
  * ACPI 2.0 Generic Address Space definition.
  */
@@ -288,7 +296,9 @@ build_madt(void)
     int madt_size = (sizeof(struct multiple_apic_table)
                      + sizeof(struct madt_processor_apic) * MaxCountCPUs
                      + sizeof(struct madt_io_apic)
-                     + sizeof(struct madt_intsrcovr) * 16);
+                     + sizeof(struct madt_intsrcovr) * 16
+                     + sizeof(struct madt_local_nmi));
+
     struct multiple_apic_table *madt = malloc_high(madt_size);
     if (!madt) {
         warn_noalloc();
@@ -340,7 +350,15 @@ build_madt(void)
         intsrcovr++;
     }
 
-    build_header((void*)madt, APIC_SIGNATURE, (void*)intsrcovr - (void*)madt, 1);
+    struct madt_local_nmi *local_nmi = (void*)intsrcovr;
+    local_nmi->type         = APIC_LOCAL_NMI;
+    local_nmi->length       = sizeof(*local_nmi);
+    local_nmi->processor_id = 0xff; /* all processors */
+    local_nmi->flags        = 0;
+    local_nmi->lint         = 1; /* LINT1 */
+    local_nmi++;
+
+    build_header((void*)madt, APIC_SIGNATURE, (void*)local_nmi - (void*)madt, 1);
 
     return madt;
 }
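As a rough cross-check of the layout this patch adds, the following standalone sketch spells out the two sub-header bytes explicitly and verifies the resulting 6-byte entry. It is not seabios code: the type value 4 is the MADT "Local APIC NMI" entry type as I understand the ACPI specification, and `MADT_TYPE_LOCAL_NMI`, `fill_local_nmi` and the use of `uint8_t`/`uint16_t` with a GCC-style packed attribute are assumptions standing in for `APIC_LOCAL_NMI`, the inline code in `build_madt()`, and seabios' own `u8`/`u16`/`PACKED`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value of the MADT "Local APIC NMI" entry type. */
#define MADT_TYPE_LOCAL_NMI 4

struct madt_local_nmi {
    uint8_t  type;          /* sub-header: entry type */
    uint8_t  length;        /* sub-header: entry length in bytes */
    uint8_t  processor_id;  /* ACPI processor id, 0xff = all processors */
    uint16_t flags;         /* MPS INTI flags, 0 = conforms to bus spec */
    uint8_t  lint;          /* local APIC LINT# the NMI is wired to */
} __attribute__((packed));

/* Fill one entry the same way the patch does for "all CPUs, LINT1". */
static void fill_local_nmi(struct madt_local_nmi *e)
{
    assert(e != NULL);
    e->type         = MADT_TYPE_LOCAL_NMI;
    e->length       = sizeof(*e);
    e->processor_id = 0xff;
    e->flags        = 0;
    e->lint         = 1;
}
```

Without the packed attribute the compiler would pad `flags` to a 2-byte boundary and the entry would no longer match the on-disk table format, which is why the struct in the patch carries `PACKED`.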
[PATCH 2/2] seabios: fix mptable nmi entry (was: Re: [Qemu-devel] [PATCH] qemu: Fix inject-nmi)
From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com

In the current seabios MP table description, NMI is connected only to
BSP's LINT1. But usually NMI is connected to all the CPUs' LINT1 as
indicated in MP specification. This patch changes seabios MP table to
describe NMI is connected to all the CPUs' LINT1.

Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
Reviewed-by: Lai Jiangshan la...@cn.fujitsu.com
---
 src/mptable.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: seabios/src/mptable.c
===================================================================
--- seabios.orig/src/mptable.c
+++ seabios/src/mptable.c
@@ -169,7 +169,7 @@ mptable_init(void)
     intsrc->irqflag = 0; /* PO, EL default */
     intsrc->srcbus = isabusid; /* ISA */
     intsrc->srcbusirq = 0;
-    intsrc->dstapic = 0; /* BSP == APIC #0 */
+    intsrc->dstapic = 0xff; /* to all local APICs */
     intsrc->dstirq = 1; /* LINTIN1 */
     intsrc++;
     entrycount += intsrc - intsrcs;
Re: [PATCH] kernel/kvm: fix improper nmi emulation
On 2011-10-10 08:06, Lai Jiangshan wrote:
> From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is maskied in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, KVM_NMI ioctl is handled as follows.
>
> - When in-kernel irqchip is enabled, KVM_NMI ioctl is handled as a
>   request of triggering LINT1 on the processor. LINT1 is emulated in
>   in-kernel irqchip.
>
> - When in-kernel irqchip is disabled, KVM_NMI ioctl is handled as a
>   request of injecting NMI to the processor. This assumes LINT1 is
>   already emulated in userland.
>
> Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
> Tested-by: Lai Jiangshan la...@cn.fujitsu.com
> ---
>  arch/x86/kvm/irq.h   |    1 +
>  arch/x86/kvm/lapic.c |    8 ++++++++
>  arch/x86/kvm/x86.c   |   14 ++++----------
>  3 files changed, 13 insertions(+), 10 deletions(-)
>
> Index: linux/arch/x86/kvm/irq.h
> ===================================================================
> --- linux.orig/arch/x86/kvm/irq.h
> +++ linux/arch/x86/kvm/irq.h
> @@ -95,6 +95,7 @@ void kvm_pic_reset(struct kvm_kpic_state
>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu);
>  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu);
>  void kvm_apic_nmi_wd_deliver(struct kvm_vcpu *vcpu);
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_apic_timer(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_pit_timer(struct kvm_vcpu *vcpu);
>  void __kvm_migrate_timers(struct kvm_vcpu *vcpu);
> Index: linux/arch/x86/kvm/lapic.c
> ===================================================================
> --- linux.orig/arch/x86/kvm/lapic.c
> +++ linux/arch/x86/kvm/lapic.c
> @@ -1039,6 +1039,14 @@ void kvm_apic_nmi_wd_deliver(struct kvm_
>  		kvm_apic_local_deliver(apic, APIC_LVT0);
>  }
>
> +void kvm_apic_lint1_deliver(struct kvm_vcpu *vcpu)
> +{
> +	struct kvm_lapic *apic = vcpu->arch.apic;
> +
> +	if (apic)

WARN_ON(!apic)? Looks like that case would be a kernel bug.

> +		kvm_apic_local_deliver(apic, APIC_LVT1);
> +}
> +
>  static struct kvm_timer_ops lapic_timer_ops = {
>  	.is_periodic = lapic_is_periodic,
>  };
> Index: linux/arch/x86/kvm/x86.c
> ===================================================================
> --- linux.orig/arch/x86/kvm/x86.c
> +++ linux/arch/x86/kvm/x86.c
> @@ -2729,13 +2729,6 @@ static int kvm_vcpu_ioctl_interrupt(stru
>  	return 0;
>  }
>
> -static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
> -{
> -	kvm_inject_nmi(vcpu);
> -
> -	return 0;
> -}
> -
>  static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
>  					   struct kvm_tpr_access_ctl *tac)
>  {
> @@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
>  		break;
>  	}
>  	case KVM_NMI: {
> -		r = kvm_vcpu_ioctl_nmi(vcpu);
> -		if (r)
> -			goto out;
> +		if (irqchip_in_kernel(vcpu->kvm))
> +			kvm_apic_lint1_deliver(vcpu);
> +		else
> +			kvm_inject_nmi(vcpu);
>  		r = 0;
>  		break;
>  	}

Looks OK otherwise.

Jan
Re: [RFC PATCH 5/7] [hyper-v] hyper-v helper functions
On Sun, 2011-10-09 at 21:01 +0200, Alon Levy wrote:
> On Sun, Oct 09, 2011 at 08:52:53PM +0200, Vadim Rozenfeld wrote:
>> ---
>>  hyperv.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
>>  hyperv.h |    7 +++++++
>>  2 files changed, 51 insertions(+), 0 deletions(-)
>>
>> diff --git a/hyperv.c b/hyperv.c
>> index a17f879..57915b9 100644
>> --- a/hyperv.c
>> +++ b/hyperv.c
>> @@ -3,6 +3,10 @@
>>  #include "qemu-option.h"
>>  #include "qemu-config.h"
>>
>> +static int hyperv_apic;
>> +static int hyperv_wd;
>> +static int hyperv_spinlock_attempts = HYPERV_SPINLOCK_NEVER_RETRY;
>> +
>>  void hyperv_init(void)
>>  {
>>      QemuOpts *opts = QTAILQ_FIRST(&qemu_hyperv_opts.head);
>> @@ -10,6 +14,46 @@ void hyperv_init(void)
>>      if (!opts) {
>>          return;
>>      }
>> +
>> +    hyperv_spinlock_attempts = qemu_opt_get_number(opts, "spinlock",
>> +                                                   HYPERV_SPINLOCK_NEVER_RETRY
>> +                                                  );
>> +    hyperv_wd = qemu_opt_get_bool(opts, "wd", 0);
>> +    hyperv_apic = qemu_opt_get_bool(opts, "vapic", 0);
>> +
>> +}
>> +
>> +int hyperv_enabled(void)
>> +{
>> +    return hyperv_hypercall_available() | hyperv_relaxed_timing();
>
> Shouldn't this be a logical or?

Sure, thanks.

>> +}
>> +
>> +int hyperv_hypercall_available(void)
>> +{
>> +    if (hyperv_apic ||
>> +        (hyperv_spinlock_attempts != HYPERV_SPINLOCK_NEVER_RETRY)) {
>> +      return 1;
>> +    }
>> +    return 0;
>> +}
>> +
>> +int hyperv_relaxed_timing(void)
>> +{
>> +    return !hyperv_wd;
>> +}
>> +
>> +int hyperv_apic_recommended(void)
>> +{
>> +#ifdef KVM_CAP_IRQCHIP
>> +    return hyperv_apic;
>> +#else
>> +    return 0;
>> +#endif
>> +}
>> +
>> +int hyperv_spinlock_retries(void)
>> +{
>> +    return hyperv_spinlock_attempts;
>>  }
>>
>>  static void hyperv_initialize(void)
>> diff --git a/hyperv.h b/hyperv.h
>> index eaf974a..27d2e6e 100644
>> --- a/hyperv.h
>> +++ b/hyperv.h
>> @@ -6,7 +6,14 @@
>>
>>  #include <asm/hyperv.h>
>>
>> +#define HYPERV_SPINLOCK_NEVER_RETRY 0x
>> +
>>  void hyperv_init(void);
>> +int hyperv_enabled(void);
>> +int hyperv_hypercall_available(void);
>> +int hyperv_relaxed_timing(void);
>> +int hyperv_apic_recommended(void);
>> +int hyperv_spinlock_retries(void);
>>
>>  #endif /* QEMU_HW_HYPERV_H */
>> --
>> 1.7.4.4
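For readers wondering why the review comment matters even though both operands are "boolean-ish" ints, here is a small standalone sketch of the difference. None of these functions exist in the patch; `lhs_true`, `rhs_counted` and the two `enabled_*` wrappers are made up for illustration. Bitwise `|` always evaluates both operands and combines their bits, while logical `||` short-circuits and normalises the result to 0 or 1.

```c
#include <assert.h>

/* Counts how often the right-hand operand gets evaluated. */
static int calls;

static int lhs_true(void)
{
    return 1;
}

static int rhs_counted(void)
{
    calls++;
    return 2;           /* "true", but not the value 1 */
}

/* Bitwise OR: both sides run, result is the OR of the raw values. */
static int enabled_bitwise(void)
{
    return lhs_true() | rhs_counted();
}

/* Logical OR: right side is skipped once the left side is non-zero. */
static int enabled_logical(void)
{
    return lhs_true() || rhs_counted();
}
```

In the patch both helpers only return 0 or 1 and have no side effects, so the two forms happen to behave the same when used as a truth value; the logical form states the intent and stays correct if a helper later returns some other non-zero value.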
Re: [PATCH] qemu-kvm: fix improper nmi emulation
On 2011-10-10 08:06, Lai Jiangshan wrote:
> From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>
> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
> button event happens. This doesn't properly emulate real hardware on
> which NMI button event triggers LINT1. Because of this, NMI is sent to
> the processor even when LINT1 is maskied in LVT. For example, this
> causes the problem that kdump initiated by NMI sometimes doesn't work
> on KVM, because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, inject-nmi request is handled as follows.
>
> - When in-kernel irqchip is disabled, inject LINT1 instead of NMI
>   interrupt.
>
> - When in-kernel irqchip is enabled, send nmi event to kernel as the
>   current code does. LINT1 should be emulated in kernel.
>
> Signed-off-by: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
> Tested-by: Lai Jiangshan la...@cn.fujitsu.com

This is targeting uq/master? Please make sure your patch passes
checkpatch.pl.

> ---
>  hw/apic.c |   16 ++++++++++++++++
>  hw/apic.h |    1 +
>  monitor.c |    5 ++---
>  3 files changed, 19 insertions(+), 3 deletions(-)
>
> Index: qemu-kvm/hw/apic.c
> ===================================================================
> --- qemu-kvm.orig/hw/apic.c
> +++ qemu-kvm/hw/apic.c
> @@ -205,6 +205,22 @@ void apic_deliver_pic_intr(DeviceState *
>      }
>  }
>
> +void apic_deliver_nmi(CPUState *env)
> +{
> +    APICState *apic;
> +
> +    if (kvm_enabled() && kvm_irqchip_in_kernel()) {
> +        cpu_interrupt(env, CPU_INTERRUPT_NMI);
> +	return;
> +    }
> +
> +    apic = DO_UPCAST(APICState, busdev.qdev, env->apic_state);
> +    if (!apic)
> +        cpu_interrupt(env, CPU_INTERRUPT_NMI);

Testing for !apic and handling the non-APIC case here looks a bit
strange. Let's move the !env->apic_state test to the caller to make it
consistent with other APIC services.

The KVM case should be a separate qemu-kvm patch on top for now. (We may
implement calls into APIC models differently when pushing in-kernel
irqchip support upstream.)

Jan
Re: [RFC PATCH 0/7] Initial support for Microsoft Hyper-V
On 2011-10-09 20:52, Vadim Rozenfeld wrote:
> Enable some basic Hyper-V enlightenment functionalities, including
> relaxed timing, spinlock, and virtual APIC.

This targets uq/master, correct? Then you should CC qemu-devel on the
next round.

I think this series could also be distributed over 3 or 4 patches
without losing bisectability. And please spend a bit of time on commit
logs.

Jan
Re: [RFC PATCH 1/7] [hyper-v] Add hyper-v parameters block.
On 2011-10-09 20:52, Vadim Rozenfeld wrote:
> ---
>  qemu-options.hx |   23 +++++++++++++++++++++++
>  vl.c            |    2 ++
>  2 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 3a13533..9f60059 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -2483,6 +2483,29 @@ DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
>      "allocate MEGABYTES for kvm mmu shadowing\n",
>      QEMU_ARCH_I386)
>
> +DEF("hyperv", HAS_ARG, QEMU_OPTION_hyperv,
> +    "-hyperv [vapic=on|off][,spinlock=retries][,wd=on|off]\n"
> +    "                enable Hyper-V Enlightenment\n",
> +    QEMU_ARCH_ALL)

These are CPU features, so -cpu +/-hv_vapic,+/-hv_spinlock etc. looks
more appropriate than a new command line parameter.

BTW, documentation and maybe also option processing should make clear
that this is limited to KVM mode for now.

Jan
Re: [RFC PATCH 3/7] [hyper-v] make Hyper-V option configurable.
On 2011-10-09 20:52, Vadim Rozenfeld wrote:
> ---
>  Makefile.target |    1 +
>  configure       |   11 +++++++++++
>  2 files changed, 12 insertions(+), 0 deletions(-)
>
> diff --git a/Makefile.target b/Makefile.target
> index f84d8cb..3581480 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -199,6 +199,7 @@ obj-$(CONFIG_VHOST_NET) += vhost.o
>  obj-$(CONFIG_REALLY_VIRTFS) += 9pfs/virtio-9p-device.o
>  obj-$(CONFIG_KVM) += kvm.o kvm-all.o
>  obj-$(CONFIG_NO_KVM) += kvm-stub.o
> +obj-$(CONFIG_HYPERV) += hyperv.o
>  obj-y += memory.o
>  LIBS+=-lz
>
> diff --git a/configure b/configure
> index 94c7d31..f5ecfd7 100755
> --- a/configure
> +++ b/configure
> @@ -150,6 +150,7 @@ debug="no"
>  strip_opt="yes"
>  bigendian="no"
>  mingw32="no"
> +hyperv="no"
>  EXESUF=""
>  prefix="/usr/local"
>  mandir="\${prefix}/share/man"
> @@ -762,6 +763,10 @@ for opt do
>    ;;
>    --enable-vhost-net) vhost_net="yes"
>    ;;
> +  --disable-hyperv) hyperv="no"
> +  ;;
> +  --enable-hyperv) hyperv="yes"
> +  ;;
>    --disable-opengl) opengl="no"
>    ;;
>    --enable-opengl) opengl="yes"
> @@ -1062,6 +1067,8 @@ echo "  --enable-docs            enable documentation build"
>  echo "  --disable-docs           disable documentation build"
>  echo "  --disable-vhost-net      disable vhost-net acceleration support"
>  echo "  --enable-vhost-net       enable vhost-net acceleration support"
> +echo "  --enable-hyperv          enable Hyper-V support"
> +echo "  --disable-hyperv         disable Hyper-V support"
>  echo "  --enable-trace-backend=B Set trace backend"
>  echo "                           Available backends:" $($source_path/scripts/tracetool --list-backends)
>  echo "  --with-trace-file=NAME   Full PATH,NAME of file to store traces"
> @@ -2737,6 +2744,7 @@ echo "madvise           $madvise"
>  echo "posix_madvise     $posix_madvise"
>  echo "uuid support      $uuid"
>  echo "vhost-net support $vhost_net"
> +echo "Hyper-V support   $hyperv"
>  echo "Trace backend     $trace_backend"
>  echo "Trace output file $trace_file-<pid>"
>  echo "spice support     $spice"
> @@ -3424,6 +3432,9 @@ case "$target_arch2" in
>        if test $kvm_cap_device_assignment = "yes" ; then
>          echo "CONFIG_KVM_DEVICE_ASSIGNMENT=y" >> $config_target_mak
>        fi
> +      if test $hyperv = "yes" ; then
> +        echo "CONFIG_HYPERV=y" >> $config_target_mak
> +      fi
>      fi
>  esac
>  if test "$target_bigendian" = "yes" ; then

Why do I want to --disable-hyperv? It rather looks like we could
perfectly live with this feature built by default. That would also allow
dropping the nasty #ifdefs from the code.

Jan
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
On Sun, Oct 2, 2011 at 5:58 PM, Ohad Ben-Cohen o...@wizery.com wrote:
> Ok, fair enough. I've revised the patches and attached the main one
> below; please tell me if it looks ok, and then I'll resubmit the
> entire patch set.

Ping?
Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks
* Jeremy Fitzhardinge jer...@goop.org wrote:

> On 10/06/2011 10:40 AM, Jeremy Fitzhardinge wrote:
>> However, it looks like locked xadd also has better performance: on
>> my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20%
>> slower than locked xadd, so that pretty much settles it unless you
>> think there'd be a dramatic difference on an AMD system.
>
> Konrad measures add+mfence is about 65% slower on AMD Phenom as well.

xadd also results in smaller/tighter code, right?

Thanks,

	Ingo
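For context, here is a generic ticket lock written with C11 atomics. It is only a sketch of the general technique, not the code from the paravirtualized ticketlock series, and it does not model the slow-path flag that the add+mfence versus locked xadd comparison above is about. The struct and function names are assumptions for illustration. The point of interest is the atomic fetch-and-add, which compilers commonly lower to a `lock xadd` instruction on x86.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct ticket_lock {
    atomic_uint next;    /* next ticket to hand out */
    atomic_uint owner;   /* ticket currently being served */
};

static void ticket_lock_init(struct ticket_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->next, 0);
    atomic_init(&l->owner, 0);
}

/* Take a ticket and spin until it is served; returns the ticket number. */
static unsigned ticket_lock_acquire(struct ticket_lock *l)
{
    /* Fetch-and-add hands out tickets atomically, in FIFO order. */
    unsigned me = atomic_fetch_add_explicit(&l->next, 1,
                                            memory_order_relaxed);

    while (atomic_load_explicit(&l->owner, memory_order_acquire) != me)
        ;   /* spin until our ticket comes up */
    return me;
}

/* Pass the lock on to the holder of the next ticket. */
static void ticket_lock_release(struct ticket_lock *l)
{
    unsigned cur = atomic_load_explicit(&l->owner, memory_order_relaxed);

    atomic_store_explicit(&l->owner, cur + 1, memory_order_release);
}
```

Only the lock holder writes `owner`, so the release can be a plain load followed by a release store in this simplified version.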
Re: kernel BUG at include/linux/kvm_host.h:603!
Hi Jörg,

On 07.10.2011, at 23:10, Jörg Sommer wrote:
> Hi,
>
> I've got this backtrace:
>
> [130902.709711] ------------[ cut here ]------------
> [130902.709747] kernel BUG at include/linux/kvm_host.h:603!

Ouch. This means that preemption is broken in KVM for PPC. To quickly
get things working on your side, please recompile your kernel with
CONFIG_PREEMPT_NONE.

I'll take a look at fixing it for real ASAP.

Alex
Re: [RFC PATCH 0/7] Initial support for Microsoft Hyper-V
On Mon, 2011-10-10 at 08:53 +0200, Jan Kiszka wrote:
> On 2011-10-09 20:52, Vadim Rozenfeld wrote:
>> Enable some basic Hyper-V enlightenment functionalities, including
>> relaxed timing, spinlock, and virtual APIC.
>
> This targets uq/master, correct? Then you should CC qemu-devel on the
> next round.
>
> I think this series could also be distributed over 3 or 4 patches
> without losing bisectability. And please spend a bit of time on commit
> logs.

OK.

> Jan
Re: [RFC PATCH 1/7] [hyper-v] Add hyper-v parameters block.
On Mon, 2011-10-10 at 08:54 +0200, Jan Kiszka wrote:
> On 2011-10-09 20:52, Vadim Rozenfeld wrote:
>> ---
>>  qemu-options.hx |   23 +++++++++++++++++++++++
>>  vl.c            |    2 ++
>>  2 files changed, 25 insertions(+), 0 deletions(-)
>>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 3a13533..9f60059 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -2483,6 +2483,29 @@ DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
>>      "allocate MEGABYTES for kvm mmu shadowing\n",
>>      QEMU_ARCH_I386)
>>
>> +DEF("hyperv", HAS_ARG, QEMU_OPTION_hyperv,
>> +    "-hyperv [vapic=on|off][,spinlock=retries][,wd=on|off]\n"
>> +    "                enable Hyper-V Enlightenment\n",
>> +    QEMU_ARCH_ALL)
>
> These are CPU features, so -cpu +/-hv_vapic,+/-hv_spinlock etc. looks
> more appropriate than a new command line parameter.

I would like to keep the Hyper-V settings apart from the CPU features
for a very simple reason: if Hyper-V VMBus support is added one day, it
won't be a CPU-only feature anymore.

> BTW, documentation and maybe also option processing should make clear
> that this is limited to KVM mode for now.

Will add it.

Vadim

> Jan
Re: [PATCH] qemu-kvm: Deprecate drive parameter boot=on|off
On 10/08/2011 09:46 AM, Jan Kiszka wrote:
> We do not want to maintain this option forever. It will be removed
> after a grace period of a few releases. So warn the user that this
> option has no effect and will become invalid soon.

Thanks, applied.

--
error compiling committee.c: too many arguments to function
Re: [PATCH] qemu-kvm: fix improper nmi emulation
On 10.10.2011 08:49, Jan Kiszka wrote:
> On 2011-10-10 08:06, Lai Jiangshan wrote:
>> From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>>
>> Currently, NMI interrupt is blindly sent to all the vCPUs when NMI
>> button event happens. This doesn't properly emulate real hardware on
>> which NMI button event triggers LINT1. Because of this, NMI is sent to
>> the processor even when LINT1 is maskied in LVT. For example, this
> [...]
>
> This is targeting uq/master? Please make sure your patch passes
> checkpatch.pl

While at it: masked?

Andreas
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
Hi Ohad,

sorry, I was on vacation last week and had no time to look into this.

On Sun, Oct 02, 2011 at 11:58:12AM -0400, Ohad Ben-Cohen wrote:
>  drivers/iommu/iommu.c      |  138 ---
>  drivers/iommu/omap-iovmm.c |   12 +---
>  include/linux/iommu.h      |    6 +-
>  virt/kvm/iommu.c           |    4 +-
>  4 files changed, 137 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index a7b0862..f23563f 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -16,6 +16,8 @@
>   * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>   */
>
> +#define pr_fmt(fmt)    "%s: " fmt, __func__
> +
>  #include <linux/kernel.h>
>  #include <linux/bug.h>
>  #include <linux/types.h>
> @@ -23,15 +25,54 @@
>  #include <linux/slab.h>
>  #include <linux/errno.h>
>  #include <linux/iommu.h>
> +#include <linux/bitmap.h>

Is this still required?

>  static struct iommu_ops *iommu_ops;
>
> +/* bitmap of supported page sizes */
> +static unsigned long iommu_pgsize_bitmap;
> +
> +/* size of the smallest supported page (in bytes) */
> +static unsigned int iommu_min_pagesz;
> +
> +/**
> + * register_iommu() - register an IOMMU hardware
> + * @ops: iommu handlers
> + * @pgsize_bitmap: bitmap of page sizes supported by the hardware
> + *
> + * Note: this is a temporary function, which will be removed once
> + * all IOMMU drivers are converted. The only reason it exists is to
> + * allow splitting the pgsizes changes to several patches in order to ease
> + * the review.
> + */
> +void register_iommu_pgsize(struct iommu_ops *ops, unsigned long pgsize_bitmap)
> +{
> +	if (iommu_ops || iommu_pgsize_bitmap || !pgsize_bitmap)
> +		BUG();
> +
> +	iommu_ops = ops;
> +	iommu_pgsize_bitmap = pgsize_bitmap;
> +
> +	/* find out the minimum page size only once */
> +	iommu_min_pagesz = 1 << __ffs(pgsize_bitmap);
> +}

Hmm, I thought a little bit about that and came to the conclusion it
might be best to just keep the page sizes as a part of the iommu_ops
structure. So there is no need to extend the register_iommu interface.

Also, the bus_set_iommu interface is now in the -next branch. Would be
good if you rebase the patches to that interface.

You can find the current iommu tree with these changes at

	git://git.8bytes.org/scm/iommu.git

> @@ -115,26 +156,103 @@ int iommu_domain_has_cap(struct iommu_domain *domain,
>  EXPORT_SYMBOL_GPL(iommu_domain_has_cap);
>
>  int iommu_map(struct iommu_domain *domain, unsigned long iova,
> -	      phys_addr_t paddr, int gfp_order, int prot)
> +	      phys_addr_t paddr, size_t size, int prot)
>  {
> -	size_t size;
> +	int ret = 0;
> +
> +	/*
> +	 * both the virtual address and the physical one, as well as
> +	 * the size of the mapping, must be aligned (at least) to the
> +	 * size of the smallest page supported by the hardware
> +	 */
> +	if (!IS_ALIGNED(iova | paddr | size, iommu_min_pagesz)) {
> +		pr_err("unaligned: iova 0x%lx pa 0x%lx size 0x%lx min_pagesz "
> +			"0x%x\n", iova, (unsigned long)paddr,
> +			(unsigned long)size, iommu_min_pagesz);
> +		return -EINVAL;
> +	}
>
> -	size = 0x1000UL << gfp_order;
> +	pr_debug("map: iova 0x%lx pa 0x%lx size 0x%lx\n", iova,
> +				(unsigned long)paddr, (unsigned long)size);
>
> -	BUG_ON(!IS_ALIGNED(iova | paddr, size));
> +	while (size) {
> +		unsigned long pgsize, addr_merge = iova | paddr;
> +		unsigned int pgsize_idx;
>
> -	return iommu_ops->map(domain, iova, paddr, gfp_order, prot);
> +		/* Max page size that still fits into 'size' */
> +		pgsize_idx = __fls(size);
> +
> +		/* need to consider alignment requirements ? */
> +		if (likely(addr_merge)) {
> +			/* Max page size allowed by both iova and paddr */
> +			unsigned int align_pgsize_idx = __ffs(addr_merge);
> +
> +			pgsize_idx = min(pgsize_idx, align_pgsize_idx);
> +		}
> +
> +		/* build a mask of acceptable page sizes */
> +		pgsize = (1UL << (pgsize_idx + 1)) - 1;
> +
> +		/* throw away page sizes not supported by the hardware */
> +		pgsize &= iommu_pgsize_bitmap;

I think we need some care here and check pgsize for 0. A BUG_ON should
do.

> +
> +		/* pick the biggest page */
> +		pgsize_idx = __fls(pgsize);
> +		pgsize = 1UL << pgsize_idx;
> +
> +		/* convert index to page order */
> +		pgsize_idx -= PAGE_SHIFT;
> +
> +		pr_debug("mapping: iova 0x%lx pa 0x%lx order %u\n", iova,
> +					(unsigned long)paddr, pgsize_idx);
> +
> +		ret = iommu_ops->map(domain, iova, paddr, pgsize_idx, prot);
> +		if (ret)
> +			break;
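The page-size selection in the quoted loop can be tried out in userspace with a small sketch like the one below. It follows the same steps (largest size that fits, limited by the alignment of `iova | paddr`, masked by the hardware bitmap) but is not the kernel code: `fls_idx`, `ffs_idx`, `pick_pgsize` and `map_range` are hypothetical names, the GCC/Clang builtins stand in for `__fls()`/`__ffs()`, and the "no usable page size" case from the review comment is reported by returning 0 or -1 instead of hitting a BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the highest / lowest set bit; x must be non-zero. */
static unsigned fls_idx(unsigned long x)
{
    assert(x != 0);
    return (unsigned)(8 * sizeof(x) - 1) - (unsigned)__builtin_clzl(x);
}

static unsigned ffs_idx(unsigned long x)
{
    assert(x != 0);
    return (unsigned)__builtin_ctzl(x);
}

/*
 * Pick the biggest supported page size that fits into 'size' and is
 * compatible with the alignment of both addresses. Returns 0 if the
 * hardware bitmap offers no such size.
 */
static unsigned long pick_pgsize(unsigned long pgsize_bitmap,
                                 unsigned long iova, unsigned long paddr,
                                 unsigned long size)
{
    unsigned long addr_merge = iova | paddr;
    unsigned long mask;
    unsigned idx;

    if (size == 0)
        return 0;

    idx = fls_idx(size);                    /* max size that still fits */
    if (addr_merge) {
        unsigned align_idx = ffs_idx(addr_merge);

        if (align_idx < idx)                /* alignment limits the size */
            idx = align_idx;
    }

    /* all sizes up to and including 1 << idx ... */
    if (idx + 1 >= 8 * sizeof(mask))
        mask = ~0UL;
    else
        mask = (1UL << (idx + 1)) - 1;
    mask &= pgsize_bitmap;                  /* ... that the hardware has */

    if (!mask)
        return 0;                           /* the case the review flags */
    return 1UL << fls_idx(mask);
}

/* Count the pages needed to cover a range, or -1 if it cannot be split. */
static int map_range(unsigned long pgsize_bitmap, unsigned long iova,
                     unsigned long paddr, unsigned long size)
{
    int pages = 0;

    while (size) {
        unsigned long pg = pick_pgsize(pgsize_bitmap, iova, paddr, size);

        if (!pg)
            return -1;
        iova += pg;
        paddr += pg;
        size -= pg;
        pages++;
    }
    return pages;
}
```

With a bitmap of 4 KiB, 64 KiB, 1 MiB and 16 MiB pages, a 1 MiB-aligned mapping of 1 MiB + 64 KiB is split into one 1 MiB page followed by one 64 KiB page, while an address that is only 2 KiB-aligned leaves an empty mask.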
Re: [RFC PATCH 1/7] [hyper-v] Add hyper-v parameters block.
On 2011-10-10 11:40, Vadim Rozenfeld wrote:
> On Mon, 2011-10-10 at 08:54 +0200, Jan Kiszka wrote:
>> On 2011-10-09 20:52, Vadim Rozenfeld wrote:
>>> ---
>>>  qemu-options.hx |   23 +++++++++++++++++++++++
>>>  vl.c            |    2 ++
>>>  2 files changed, 25 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>> index 3a13533..9f60059 100644
>>> --- a/qemu-options.hx
>>> +++ b/qemu-options.hx
>>> @@ -2483,6 +2483,29 @@ DEF("kvm-shadow-memory", HAS_ARG, QEMU_OPTION_kvm_shadow_memory,
>>>      "allocate MEGABYTES for kvm mmu shadowing\n",
>>>      QEMU_ARCH_I386)
>>>
>>> +DEF("hyperv", HAS_ARG, QEMU_OPTION_hyperv,
>>> +    "-hyperv [vapic=on|off][,spinlock=retries][,wd=on|off]\n"
>>> +    "                enable Hyper-V Enlightenment\n",
>>> +    QEMU_ARCH_ALL)
>>
>> These are CPU feature, so -cpu +/-hv_vapic,+/-hv_spinlock etc. looks
>> more appropriate than a new command line parameter.
>
> I would like to keep hyper-v settings apart from cpu features for a
> very simple reason: if hyper-v VMBus support will be added one day, it
> won't be a CPU only feature anymore.

Then that feature would be controlled by adding the corresponding
device. There is no need for -hyperv.

Jan

--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
Re: [PATCH] kernel/kvm: fix improper nmi emulation
On 10/10/2011 08:06 AM, Lai Jiangshan wrote:
> From: Kenji Kaneshige kaneshige.ke...@jp.fujitsu.com
>
> Currently, an NMI is blindly sent to all the vCPUs when an NMI button
> event happens. This doesn't properly emulate real hardware, on which an
> NMI button event triggers LINT1. Because of this, the NMI is sent to the
> processor even when LINT1 is masked in the LVT. For example, this causes
> the problem that kdump initiated by NMI sometimes doesn't work on KVM,
> because kdump assumes NMI is masked on CPUs other than CPU0.
>
> With this patch, the KVM_NMI ioctl is handled as follows.
>
> - When the in-kernel irqchip is enabled, the KVM_NMI ioctl is handled as
>   a request to trigger LINT1 on the processor. LINT1 is emulated in the
>   in-kernel irqchip.
>
> - When the in-kernel irqchip is disabled, the KVM_NMI ioctl is handled as
>   a request to inject an NMI into the processor. This assumes LINT1 is
>   already emulated in userland.

Please add a KVM_NMI section to Documentation/virtual/kvm/api.txt.

> -static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
> -{
> -    kvm_inject_nmi(vcpu);
> -
> -    return 0;
> -}
> -
>  static int vcpu_ioctl_tpr_access_reporting(struct kvm_vcpu *vcpu,
>                                             struct kvm_tpr_access_ctl *tac)
>  {
> @@ -3038,9 +3031,10 @@ long kvm_arch_vcpu_ioctl(struct file *fi
>          break;
>      }
>      case KVM_NMI: {
> -        r = kvm_vcpu_ioctl_nmi(vcpu);
> -        if (r)
> -            goto out;
> +        if (irqchip_in_kernel(vcpu->kvm))
> +            kvm_apic_lint1_deliver(vcpu);
> +        else
> +            kvm_inject_nmi(vcpu);
>          r = 0;
>          break;
>      }

Why did you drop kvm_vcpu_ioctl_nmi()?

Please add (and document) a KVM_CAP flag that lets userspace know the new behaviour is supported.

-- 
error compiling committee.c: too many arguments to function
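The behaviour described in the commit message can be illustrated with a small user-space model. This is a sketch, not the kernel code: the `VCpu` class and both functions are hypothetical, and only two properties of an LVT entry are modelled (the mask bit at bit 16 and the delivery mode in bits 10:8, where 100b means NMI). Other delivery modes are simply ignored here.

```python
LVT_MASKED = 1 << 16              # LVT mask bit
LVT_MODE_MASK = 0x7 << 8          # delivery mode field, bits 10:8
LVT_MODE_NMI = 0x4 << 8           # delivery mode 100b = NMI


class VCpu:
    """Hypothetical stand-in for a vCPU: one LVT LINT1 entry, one flag."""

    def __init__(self, lvt_lint1):
        self.lvt_lint1 = lvt_lint1
        self.nmi_pending = False


def lint1_deliver(vcpu):
    """Model of LINT1 delivery: honour the mask bit and the delivery mode."""
    lvt = vcpu.lvt_lint1
    if lvt & LVT_MASKED:
        return                      # masked: the event is dropped
    if (lvt & LVT_MODE_MASK) == LVT_MODE_NMI:
        vcpu.nmi_pending = True     # unmasked and programmed as NMI


def handle_kvm_nmi(vcpu, irqchip_in_kernel):
    """Model of the proposed KVM_NMI semantics."""
    if irqchip_in_kernel:
        lint1_deliver(vcpu)         # kernel emulates LINT1
    else:
        vcpu.nmi_pending = True     # userspace already emulated LINT1
```

In this model a CPU with LINT1 masked (as kdump expects on CPUs other than CPU0) no longer sees the NMI when the in-kernel irqchip is used, while the boot CPU with LINT1 unmasked still does.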
Re: [PATCH 1/4] [kvm-autotest] cgroup-kvm: add_*_drive / rm_drive
This is useful function. This function can be in kvm utils. - Original Message - * functions for adding and removal of drive to vm using host-file or host-scsi_debug device. Signed-off-by: Lukas Doktor ldok...@redhat.com --- client/tests/kvm/tests/cgroup.py | 125 - 1 files changed, 108 insertions(+), 17 deletions(-) diff --git a/client/tests/kvm/tests/cgroup.py b/client/tests/kvm/tests/cgroup.py index b9a10ea..d6418b5 100644 --- a/client/tests/kvm/tests/cgroup.py +++ b/client/tests/kvm/tests/cgroup.py @@ -17,6 +17,108 @@ def run_cgroup(test, params, env): vms = None tests = None +# Func +def get_device_driver(): + +Discovers the used block device driver {ide, scsi, virtio_blk} +@return: Used block device driver {ide, scsi, virtio} + +if test.tagged_testname.count('virtio_blk'): +return virtio +elif test.tagged_testname.count('scsi'): +return scsi +else: +return ide + + +def add_file_drive(vm, driver=get_device_driver(), host_file=None): + +Hot-add a drive based on file to a vm +@param vm: Desired VM +@param driver: which driver should be used (default: same as in test) +@param host_file: Which file on host is the image (default: create new) +@return: Tupple(ret_file, device) +ret_file: created file handler (None if not created) +device: PCI id of the virtual disk + +if not host_file: +host_file = tempfile.NamedTemporaryFile(prefix=cgroup-disk-, + suffix=.iso) +utils.system(dd if=/dev/zero of=%s bs=1M count=8 /dev/null + % (host_file.name)) +ret_file = host_file +else: +ret_file = None + +out = vm.monitor.cmd(pci_add auto storage file=%s,if=%s,snapshot=off, + cache=off % (host_file.name, driver)) +dev = re.search(r'OK domain (\d+), bus (\d+), slot (\d+), function \d+', +out) +if not dev: +raise error.TestFail(Can't add device(%s, %s, %s): %s % (vm, +host_file.name, driver, out)) +device = %s:%s:%s % dev.groups() +return (ret_file, device) + + +def add_scsi_drive(vm, driver=get_device_driver(), host_file=None): + +Hot-add a drive based on scsi_debug device to a vm 
+@param vm: Desired VM +@param driver: which driver should be used (default: same as in test) +@param host_file: Which dev on host is the image (default: create new) +@return: Tupple(ret_file, device) +ret_file: string of the created dev (None if not created) +device: PCI id of the virtual disk + +if not host_file: +if utils.system_output(lsmod | grep scsi_debug -c) == 0: +utils.system(modprobe scsi_debug dev_size_mb=8 add_host=0) +utils.system(echo 1 /sys/bus/pseudo/drivers/scsi_debug/add_host) +host_file = utils.system_output(ls /dev/sd* | tail -n 1) +# Enable idling in scsi_debug drive +utils.system(echo 1 /sys/block/%s/queue/rotational % host_file) +ret_file = host_file +else: +# Don't remove this device during cleanup +# Reenable idling in scsi_debug drive (in case it's not) +utils.system(echo 1 /sys/block/%s/queue/rotational % host_file) +ret_file = None + +out = vm.monitor.cmd(pci_add auto storage file=%s,if=%s,snapshot=off, + cache=off % (host_file, driver)) +dev = re.search(r'OK domain (\d+), bus (\d+), slot (\d+), function \d+', +out) +if not dev: +raise error.TestFail(Can't add device(%s, %s, %s): %s % (vm, +host_file, driver, out)) +device = %s:%s:%s % dev.groups() +return (ret_file, device) + + +def rm_drive(vm, host_file, device): + +Remove drive from vm and device on disk +! beware to remove scsi devices in reverse order ! + +vm.monitor.cmd(pci_del %s % device) + +if isinstance(host_file, file): # file +host_file.close() +elif isinstance(host_file, str):# scsi device +utils.system(echo -1 /sys/bus/pseudo/drivers/scsi_debug/add_host) +else:# custom file, do nothing +pass + +def get_all_pids(ppid): + +Get all PIDs of children/threads of parent ppid +param ppid: parent PID +return: list of PIDs
Re: [PATCH] apic: test tsc deadline timer
On 10/09/2011 05:32 PM, Liu, Jinsong wrote: Updated test case for kvm tsc deadline timer https://github.com/avikivity/kvm-unit-tests, as attached. Applied, thanks. -- error compiling committee.c: too many arguments to function
Re: [PATCH] Update README example
On 10/09/2011 06:02 PM, Liu, Jinsong wrote: Subject: [PATCH] Update README example Thanks, applied. -- error compiling committee.c: too many arguments to function
[PATCH] virtio-9p: fix QEMU build break
The qemu build breaks due to the redefinition of struct file_handle.

My qemu.git/HEAD is 8acbc9b21d757a6be4f8492e547b8159703a0547

Below is the log:

[root@f15 qemu]# make
  CC    qapi-generated/qga-qapi-types.o
  LINK  qemu-ga
  CC    libhw64/9pfs/virtio-9p-handle.o
/home/zwu/work/virt/qemu/hw/9pfs/virtio-9p-handle.c:31:8: error: redefinition of struct file_handle
/usr/include/bits/fcntl.h:254:8: note: originally defined here
make[1]: *** [9pfs/virtio-9p-handle.o] Error 1
make: *** [subdir-libhw64] Error 2

[root@f15 qemu]# rpm -qf /usr/include/bits/fcntl.h
glibc-headers-2.13.90-9.x86_64

Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com
---
 hw/9pfs/virtio-9p-handle.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c
index 5c8b5ed..5b3a867 100644
--- a/hw/9pfs/virtio-9p-handle.c
+++ b/hw/9pfs/virtio-9p-handle.c
@@ -27,7 +27,7 @@ struct handle_data {
     int handle_bytes;
 };

-#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 14
+#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 13
 struct file_handle {
     unsigned int handle_bytes;
     int handle_type;
-- 
1.7.6
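To see why the old guard triggers the redefinition on this host, the preprocessor condition can be evaluated by hand. The sketch below assumes the guard reads `__GLIBC__ <= 2 && __GLIBC_MINOR__ < N` (the comparison operators are missing from the archived text, so that form is an assumption), and `needs_local_file_handle` is a hypothetical helper name.

```python
def needs_local_file_handle(glibc_major, glibc_minor, minor_limit):
    """Evaluate `__GLIBC__ <= 2 && __GLIBC_MINOR__ < minor_limit`.

    True means qemu would define its own struct file_handle.
    """
    return glibc_major <= 2 and glibc_minor < minor_limit


# glibc-headers-2.13.90 reports major 2, minor 13.
old_guard = needs_local_file_handle(2, 13, 14)  # qemu defines the struct too
new_guard = needs_local_file_handle(2, 13, 13)  # qemu leaves it to glibc
```

Under that assumption the old limit of 14 makes qemu define the struct on top of the one in /usr/include/bits/fcntl.h, while the new limit of 13 skips the local definition on this glibc and keeps it for 2.12 and older.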
Re: [PATCH][uq/master] kvm: Add top-like kvm statistics script
On 10/07/2011 09:37 AM, Jan Kiszka wrote: Taken from original qemu-kvm/kvm/kvm_stat. Applied, thanks. -- error compiling committee.c: too many arguments to function
Re: [PATCH][uq/master] kvm: Add tool for querying VMX capabilities
On 10/07/2011 09:37 AM, Jan Kiszka wrote: Taken from original qemu-kvm/kvm/scripts/vmxcap. Applied, thanks. -- error compiling committee.c: too many arguments to function
Re: [PATCH v2 0/9] perf support for x86 guest/host-only bits
Hi Gleb, On Wed, Oct 05, 2011 at 08:01:15AM -0400, Gleb Natapov wrote: This patch series consists of Joerg series named perf support for amd guest/host-only bits v2 [1] rebased to 3.1.0-rc7 and in addition, support for intel cpus for the same functionality. [1] https://lkml.org/lkml/2011/6/17/171 Changelog: v1-v2 - move perf_guest_switch_msr array to perf code. - small cosmetic changes. Gleb Natapov (4): perf, intel: Use GO/HO bits in perf-ctr KVM, VMX: add support for switching of PERF_GLOBAL_CTRL KVM, VMX: Add support for guest/host-only profiling KVM, VMX: Check for automatic switch msr table overflow. Joerg Roedel (5): perf, core: Introduce attrs to count in either host or guest mode perf, amd: Use GO/HO bits in perf-ctr perf, tools: Add support for guest/host-only profiling perf, tools: Fix copypaste error in perf-kvm option description perf, tools: Do guest-only counting in perf-kvm by default arch/x86/include/asm/perf_event.h | 15 arch/x86/kernel/cpu/perf_event.c | 14 arch/x86/kernel/cpu/perf_event_amd.c | 13 +++ arch/x86/kernel/cpu/perf_event_intel.c | 90 +- arch/x86/kvm/vmx.c | 131 +--- include/linux/perf_event.h |5 +- tools/perf/builtin-kvm.c |5 +- tools/perf/util/event.c|8 ++ tools/perf/util/event.h|2 + tools/perf/util/evlist.c |5 +- tools/perf/util/parse-events.c | 15 +++- 11 files changed, 282 insertions(+), 21 deletions(-) Many thanks for picking this up :) Joerg -- AMD Operating System Research Center Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach General Managers: Alberto Bozzo, Andrew Bowd Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 1/4] [kvm-autotest] cgroup-kvm: add_*_drive / rm_drive
I thought about that. But pci_add is not much stable and it's not supported in QMP (as far as I read) with a note that this way is buggy and should be rewritten completely. So I placed it here to let it develop and then I can move it into utils. Regards, Lukáš Dne 10.10.2011 12:26, Jiri Zupka napsal(a): This is useful function. This function can be in kvm utils. - Original Message - * functions for adding and removal of drive to vm using host-file or host-scsi_debug device. Signed-off-by: Lukas Doktorldok...@redhat.com --- client/tests/kvm/tests/cgroup.py | 125 - 1 files changed, 108 insertions(+), 17 deletions(-) diff --git a/client/tests/kvm/tests/cgroup.py b/client/tests/kvm/tests/cgroup.py index b9a10ea..d6418b5 100644 --- a/client/tests/kvm/tests/cgroup.py +++ b/client/tests/kvm/tests/cgroup.py @@ -17,6 +17,108 @@ def run_cgroup(test, params, env): vms = None tests = None +# Func +def get_device_driver(): + +Discovers the used block device driver {ide, scsi, virtio_blk} +@return: Used block device driver {ide, scsi, virtio} + +if test.tagged_testname.count('virtio_blk'): +return virtio +elif test.tagged_testname.count('scsi'): +return scsi +else: +return ide + + +def add_file_drive(vm, driver=get_device_driver(), host_file=None): + +Hot-add a drive based on file to a vm +@param vm: Desired VM +@param driver: which driver should be used (default: same as in test) +@param host_file: Which file on host is the image (default: create new) +@return: Tupple(ret_file, device) +ret_file: created file handler (None if not created) +device: PCI id of the virtual disk + +if not host_file: +host_file = tempfile.NamedTemporaryFile(prefix=cgroup-disk-, + suffix=.iso) +utils.system(dd if=/dev/zero of=%s bs=1M count=8 /dev/null + % (host_file.name)) +ret_file = host_file +else: +ret_file = None + +out = vm.monitor.cmd(pci_add auto storage file=%s,if=%s,snapshot=off, + cache=off % (host_file.name, driver)) +dev = re.search(r'OK domain (\d+), bus (\d+), slot (\d+), function 
\d+', +out) +if not dev: +raise error.TestFail(Can't add device(%s, %s, %s): %s % (vm, +host_file.name, driver, out)) +device = %s:%s:%s % dev.groups() +return (ret_file, device) + + +def add_scsi_drive(vm, driver=get_device_driver(), host_file=None): + +Hot-add a drive based on scsi_debug device to a vm +@param vm: Desired VM +@param driver: which driver should be used (default: same as in test) +@param host_file: Which dev on host is the image (default: create new) +@return: Tupple(ret_file, device) +ret_file: string of the created dev (None if not created) +device: PCI id of the virtual disk + +if not host_file: +if utils.system_output(lsmod | grep scsi_debug -c) == 0: +utils.system(modprobe scsi_debug dev_size_mb=8 add_host=0) +utils.system(echo 1 /sys/bus/pseudo/drivers/scsi_debug/add_host) +host_file = utils.system_output(ls /dev/sd* | tail -n 1) +# Enable idling in scsi_debug drive +utils.system(echo 1 /sys/block/%s/queue/rotational % host_file) +ret_file = host_file +else: +# Don't remove this device during cleanup +# Reenable idling in scsi_debug drive (in case it's not) +utils.system(echo 1 /sys/block/%s/queue/rotational % host_file) +ret_file = None + +out = vm.monitor.cmd(pci_add auto storage file=%s,if=%s,snapshot=off, + cache=off % (host_file, driver)) +dev = re.search(r'OK domain (\d+), bus (\d+), slot (\d+), function \d+', +out) +if not dev: +raise error.TestFail(Can't add device(%s, %s, %s): %s % (vm, +host_file, driver, out)) +device = %s:%s:%s % dev.groups() +return (ret_file, device) + + +def rm_drive(vm, host_file, device): + +Remove drive from vm and device on disk +! beware to remove scsi devices in reverse order ! + +vm.monitor.cmd(pci_del %s % device) + +if isinstance(host_file, file): # file +host_file.close() +elif isinstance(host_file, str):# scsi device +utils.system(echo -1 /sys/bus/pseudo/drivers/scsi_debug/add_host) +else:# custom file, do nothing +
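Since the discussion is about how stable pci_add is, it may help to look at the one piece of the patch that depends on its output format: the regular expression that turns the monitor reply into a device id. The sketch below isolates that step. `parse_pci_add` is a hypothetical helper, and the sample reply string in the usage note is an assumption modelled on the pattern in the patch, not captured monitor output.

```python
import re

# Same pattern as in the quoted patch.
PCI_ADD_OK = re.compile(
    r'OK domain (\d+), bus (\d+), slot (\d+), function \d+')


def parse_pci_add(output):
    """Return 'domain:bus:slot' from a pci_add reply, or raise ValueError."""
    match = PCI_ADD_OK.search(output)
    if not match:
        raise ValueError("Can't parse pci_add output: %r" % output)
    # Three capture groups: domain, bus, slot (function is not captured).
    return "%s:%s:%s" % match.groups()
```

A reply such as "OK domain 0, bus 0, slot 5, function 0" yields "0:0:5", which is the form later handed to pci_del; any other reply raises instead of silently producing a bogus id.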
Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks
On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote: On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote: On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote: Which certainly should *work*, but from a conceptual standpoint, isn't it just *much* nicer to say we actually know *exactly* what the upper bits were. Well, we really do NOT want atomicity here. What we really rather want is sequentiality: free the lock, make the update visible, and THEN check if someone has gone sleeping on it. Atomicity only conveniently enforces that the three do not happen in a different order (with the store becoming visible after the checking load). This does not have to be atomic, since spurious wakeups are not a problem, in particular not with the FIFO-ness of ticket locks. For that the fence, additional atomic etc. would be IMHO much cleaner than the crazy overflow logic. All things being equal I'd prefer lock-xadd just because its easier to analyze the concurrency for, crazy overflow tests or no. But if add+mfence turned out to be a performance win, then that would obviously tip the scales. However, it looks like locked xadd is also has better performance: on my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower than locked xadd, so that pretty much settles it unless you think there'd be a dramatic difference on an AMD system. Indeed, the fences are usually slower than locked RMWs, in particular, if you do not need to add an instruction. I originally missed that amazing stunt the GCC pulled off with replacing the branch with carry flag magic. It seems that two twisted minds have found each other here :) One of my concerns was adding a branch in here... so that is settled, and if everybody else feels like this is easier to reason about... go ahead :) (I'll keep my itch to myself then.) Stephan -- Stephan Diestelhorst, AMD Operating System Research Center stephan.diestelho...@amd.com, Tel. 
+49 (0)351 448 356 719 Advanced Micro Devices GmbH Einsteinring 24 85609 Aschheim Germany Geschaeftsfuehrer: Alberto Bozzo; Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen Registergericht Muenchen, HRB Nr. 43632, WEEE-Reg-Nr: DE 12919551 -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
KVM call agenda for October 11th
Hi Please send in any agenda items you are interested in covering. Thanks, Juan.
Re: [kvm] Re: tcpdump locks up kvm host for a while.
On 10/05/2011 10:29 PM, Robin Lee Powell wrote:
> #
> # (For a higher level overview, try: perf report --sort comm,dso)
> #
>
> How helpful is that? -_- I'm guessing I need --guestkallsyms= ; since
> they're all the same kernel I thought it'd figure it out. I'll redo.
>
> OK, here's a better version.
>
> # Events: 46K cycles
> #
> # Overhead  Command   Shared Object  Symbol
> # ...  ...
> #
>     74.81%  qemu-kvm  [unknown]  [u] 0x7fbdffd4c18a

This is in userspace, so it seems the guest wasn't completely stuck. Try 'top -b' inside the guest to record what happens, let's see what processes this is and go from there.

>     25.14%  qemu-kvm  [guest.kernel.kallsyms]  [g] 0x82f0

This doesn't resolve, please make sure the kernel-debuginfo package is installed in the guest and use the guestmount option. (or you can install it in the host, I think)

-- 
error compiling committee.c: too many arguments to function
Re: [PATCH 2/2] pci-assign: Fix MSI-X registration
On 09/22/2011 12:04 PM, Jan Kiszka wrote:
>>          goto out;
>> +    if (!kvm_check_extension(kvm_state, KVM_CAP_ASSIGN_DEV_IRQ) &&
>> +        (dev->cap.available & ASSIGNED_DEVICE_CAP_MSIX ||
>> +         dev->cap.available & ASSIGNED_DEVICE_CAP_MSI ||
>> +         assigned_dev_pci_read_byte(pci_dev, PCI_INTERRUPT_PIN) != 0)) {
>> +        goto out;
>> +    }
>> +
>
> That's not equivalent as it needlessly prevents IRQ support in the
> absence of KVM_CAP_ASSIGN_DEV_IRQ.
>
> Let's just fix the core issue and replace the test for
> KVM_CAP_DEVICE_MSIX with a test call of KVM_ASSIGN_SET_MSIX_NR, passing
> in a NULL struct. If it returns -EFAULT, the IOCTL is known and MSIX is
> supported.

Or just add KVM_CAP_DEVICE_MSIX to the kernel and backport it where needed?

-- 
error compiling committee.c: too many arguments to function
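The probing idea quoted above (call the ioctl with a NULL argument and treat -EFAULT as proof that the kernel knows the ioctl) can be sketched without touching /dev/kvm. Everything here is an illustration: `msix_supported` is a hypothetical helper, and the ioctl is replaced by a caller-supplied callable that raises OSError, so no real KVM call is made.

```python
import errno


def msix_supported(set_msix_nr):
    """Probe for MSI-X support via a deliberately invalid call.

    set_msix_nr is a stand-in for issuing KVM_ASSIGN_SET_MSIX_NR; it is
    called with None (modelling a NULL pointer) and is expected to raise
    OSError carrying the errno the kernel returned.
    """
    try:
        set_msix_nr(None)
    except OSError as exc:
        # EFAULT: the kernel recognised the ioctl and tried to read the
        # argument. Any other errno is treated as "not known to be supported".
        return exc.errno == errno.EFAULT
    # Success with a NULL argument is unexpected; stay conservative.
    return False
```

The trade-off against a capability flag is visible in the sketch: the probe needs no kernel change, but it relies on the error code of an invalid call staying stable, which a dedicated KVM_CAP_DEVICE_MSIX check would not.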
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
[ -bouncing hiroshi.d...@nokia.com, +not-bouncing hd...@nvidia.com : hi Hiroshi :) ] Hi Joerg, On Mon, Oct 10, 2011 at 11:47 AM, Roedel, Joerg joerg.roe...@amd.com wrote: sorry, I was on vacation last week and had no time to look into this. Sure thing, thanks for replying! +#include <linux/bitmap.h> Is this still required? Nope, removed, thanks. Hmm, I thought a little bit about that and came to the conclusion it might be best to just keep the page-sizes as a part of the iommu_ops structure. So there is no need to extend the register_iommu interface. Sure. That was one of my initial alternatives, but I decided against it at that time. I'll bring it back - it will help with the bus_set_iommu rebasing. Also, the bus_set_iommu interface is now in the -next branch. Would be good if you rebase the patches to that interface. Sure. It's a little tricky though: which branch do I base this on ? Are you ok with me basing this on your 'next' branch ? My current stack depends at least on three branches of yours, so that would be helpful for me (and fewer merge conflicts for you I guess :). I think we need some care here and check pgsize for 0. A BUG_ON should do. I can add it if you prefer, but I don't think it can really happen: basically, it means that we chose a too small and unsupported page bit, which can't happen as long as we check for IS_ALIGNED(iova | paddr | size, iommu_min_pagesz) in the beginning of iommu_map. + unmapped_order = iommu_ops->unmap(domain, iova, order); I think we should make sure that we call iommu_ops->unmap with the same parameters as iommu_ops->map. Otherwise we still need some page-size complexity in the iommu-drivers. Ok, let's discuss the semantics of ->unmap(). 
There isn't clear documentation of that API (we should probably add some kernel docs after we nail it down now), but judging from the existing users (mainly kvm) and drivers, it seems that iommu_map() and iommu_unmap() aren't symmetric: users rely on unmap() to return the actual size that was unmapped. IOMMU drivers, in turn, should check which page is mapped on 'iova', unmap it, and return its size. This way iommu_unmap() becomes very simple: it just iterates through the region, relying on iommu_ops->unmap() to return the sizes that were actually unmapped (very similar to how amd's iommu_unmap_page works today). This also means that iommu_ops->unmap() doesn't really need a size/order argument and we can remove it (after all drivers fully migrate..). The other approach which you suggest means symmetric iommu_map() and iommu_unmap(). It means adding a 'paddr' parameter to iommu_unmap(), which is easy, but maybe more concerning is the limitation that it incurs: users will now have to call iommu_unmap() exactly as they called iommu_map() beforehand. Not sure how well this will fly with the existing users (kvm ?) and whether we really want to enforce this (it doesn't mean drivers need to deal with page-size complexity. they are required to unmap a single page at a time, and iommu_unmap() will do the work for them). Another discussion: I think we better change iommu_ops->map() to directly take a 'size' (in bytes) instead of an 'order' (of pages). Most (all?) drivers just immediately do 'size = 0x1000UL << gfp_order', so this whole size -> order -> size back and forth seems redundant. When we pass the size now it makes sense to also return the unmapped-size instead of the order. Sure. Thanks for your review, Ohad.
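The asymmetric semantics proposed in this mail (the driver unmaps whatever single page it finds at 'iova' and reports its size, and the core loop just keeps going) can be modelled in a few lines. This is a sketch under assumptions: the page table is a plain dict from start address to page size, and both function names are hypothetical stand-ins for iommu_ops->unmap() and iommu_unmap().

```python
def driver_unmap(page_table, iova):
    """Stand-in for a driver's unmap: drop the page mapped at iova.

    Returns the size of the page that was removed, or 0 if nothing
    is mapped at that address.
    """
    return page_table.pop(iova, 0)


def core_unmap(page_table, iova, size):
    """Stand-in for the core loop: walk the region page by page.

    The caller does not need to know which page sizes were used at
    map time; the loop relies on the size the driver reports back.
    """
    unmapped = 0
    while unmapped < size:
        pgsize = driver_unmap(page_table, iova + unmapped)
        if not pgsize:
            break               # hole in the region: stop early
        unmapped += pgsize
    return unmapped             # bytes actually unmapped
```

With this shape the caller may unmap a region that was mapped as 64K + 64K + 4K in one call and learn from the return value how much was really torn down, which is the property the existing users are said to rely on.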
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
On Mon, Oct 10, 2011 at 2:52 PM, KyongHo Cho pullip@samsung.com wrote:
> Don't we need to unmap all intermediate mappings if iommu_map() fails?

Good idea, I'll add it. Thanks!

Ohad.
Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks
On Monday 10 October 2011, 07:00:50 Stephan Diestelhorst wrote: On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote: On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote: On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote: Which certainly should *work*, but from a conceptual standpoint, isn't it just *much* nicer to say we actually know *exactly* what the upper bits were. Well, we really do NOT want atomicity here. What we really rather want is sequentiality: free the lock, make the update visible, and THEN check if someone has gone sleeping on it. Atomicity only conveniently enforces that the three do not happen in a different order (with the store becoming visible after the checking load). This does not have to be atomic, since spurious wakeups are not a problem, in particular not with the FIFO-ness of ticket locks. For that the fence, additional atomic etc. would be IMHO much cleaner than the crazy overflow logic. All things being equal I'd prefer lock-xadd just because its easier to analyze the concurrency for, crazy overflow tests or no. But if add+mfence turned out to be a performance win, then that would obviously tip the scales. However, it looks like locked xadd is also has better performance: on my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower than locked xadd, so that pretty much settles it unless you think there'd be a dramatic difference on an AMD system. Indeed, the fences are usually slower than locked RMWs, in particular, if you do not need to add an instruction. I originally missed that amazing stunt the GCC pulled off with replacing the branch with carry flag magic. It seems that two twisted minds have found each other here :) One of my concerns was adding a branch in here... so that is settled, and if everybody else feels like this is easier to reason about... go ahead :) (I'll keep my itch to myself then.) Just that I can't... 
if performance is a concern, adding the LOCK prefix to the addb outperforms the xadd significantly: With mean over 100 runs... this comes out as follows (on my Phenom II) locked-add 0.648500 s 80% add-rmwtos 0.707700 s 88% locked-xadd 0.807600 s 100% add-barrier 1.27 s 157% With huge read contention added in (as cheaply as possible): locked-add.openmp 0.640700 s 84% add-rmwtos.openmp 0.658400 s 86% locked-xadd.openmp 0.763800 s 100% And the numbers for write contention are crazy, but also feature the locked-add version: locked-add.openmp 0.571400 s 71% add-rmwtos.openmp 0.699900 s 87% locked-xadd.openmp 0.800200 s 100% Stephan -- Stephan Diestelhorst, AMD Operating System Research Center stephan.diestelho...@amd.com, Tel. +49 (0)351 448 356 719 Advanced Micro Devices GmbH Einsteinring 24 85609 Aschheim Germany Geschaeftsfuehrer: Alberto Bozzo; Sitz: Dornach, Gemeinde Aschheim, Landkreis Muenchen Registergericht Muenchen, HRB Nr. 43632, WEEE-Reg-Nr: DE 12919551 #include stdio.h struct { unsigned char flag; unsigned char val; } l; int main(int argc, char **argv) { int i; { { for (i = 0; i 1; i++) { l.val += 2; asm volatile(lock or $0x0,(%%rsp) : : : memory); if (l.flag) break; asm volatile( : : : memory); } l.flag = 1; } } return 0; } #include stdio.h struct { unsigned char flag; unsigned char val; } l; int main(int argc, char **argv) { int i; # pragma omp sections { # pragma omp section { for (i = 0; i 1; i++) { l.val += 2; asm volatile(lock or $0x0,(%%rsp) : : : memory); if (l.flag) break; asm volatile( : : : memory); } l.flag = 1; } # pragma omp section while(!l.flag) asm volatile(:::memory); //asm volatile(lock orb $0x0, %0::m(l.flag):memory); } return 0; } #include stdio.h struct { unsigned char flag; unsigned char val; } l; int main(int argc, char **argv) { int i; { { for (i = 0; i 1; i++) { asm volatile(lock addb %1, %0:+m(l.val):r((char)2):memory); if (l.flag) break; asm volatile( : : : memory); } l.flag = 1; } } return 0; } #include stdio.h union { struct { 
unsigned char val; unsigned char flag; }; unsigned short lock; } l = { 0,0 }; int main(int argc, char **argv) { int i; # pragma omp sections { # pragma omp section { for (i = 0; i 1; i++) { unsigned short inc = 2; if (l.val = (0x100 - 2)) inc += -1 8; asm volatile(lock; xadd %1,%0 : +m (l.lock), +r (inc) : ); if (inc 0x100) break; asm volatile( : : : memory); } l.flag = 1; } # pragma omp section while(!l.flag) asm volatile(:::memory); //asm volatile(lock orb $0x0, %0::m(l.flag):memory); } return 0; } #include stdio.h struct { unsigned char flag; unsigned char val; } l; int main(int argc, char **argv) { int i; # pragma omp sections { # pragma omp section
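The "crazy overflow logic" in the xadd variant is easier to follow with concrete numbers. The sketch below models a 16-bit lock word whose low byte is the counter and whose high byte is the flag, matching the union layout in the listing on a little-endian machine. It assumes the garbled condition reads `l.val >= (0x100 - 2)` and the adjustment reads `inc += -1 << 8`; both operators are missing from the archived text, and the function names are hypothetical.

```python
def xadd16(word, inc):
    """16-bit fetch-and-add: return (old value, new value)."""
    return word, (word + inc) & 0xFFFF


def release(lock_word):
    """Add 2 to the low byte without letting the carry reach the flag byte."""
    val = lock_word & 0xFF
    inc = 2
    if val >= 0x100 - 2:
        # The byte add would wrap; pre-subtract 0x100 so the carry
        # out of the low byte is cancelled in the 16-bit add.
        inc = (inc - (1 << 8)) & 0xFFFF
    return xadd16(lock_word, inc)


def release_naive(lock_word):
    """Same add without compensation, for comparison."""
    return xadd16(lock_word, 2)
```

Starting from 0x01FE (flag 1, counter 0xFE), the compensated add produces 0x0100, so the counter wraps to 0 and the flag stays 1. The naive add produces 0x0200 and corrupts the flag byte, which is the case the extra branch (or the carry-flag sequence GCC generates for it) has to handle.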
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/03/2011 03:55 PM, Marcelo Tosatti wrote: The following changes since commit d11cf8cc80d946dfc9a23597cd9a0bb1c487cfa7: etrax-dma: Remove bogus if statement (2011-10-03 10:20:13 +0200) are available in the git repository at: git://github.com/avikivity/qemu.git uq/master Pulled. Thanks. Are ya'll planning on moving your repo back to kernel.org or sticking with github? Regards, Anthony Liguori Liu, Jinsong (1): kvm: support TSC deadline MSR target-i386/cpu.h |4 +++- target-i386/kvm.c | 14 ++ target-i386/machine.c |1 + 3 files changed, 18 insertions(+), 1 deletions(-) -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 04:41 PM, Anthony Liguori wrote: On 10/03/2011 03:55 PM, Marcelo Tosatti wrote: The following changes since commit d11cf8cc80d946dfc9a23597cd9a0bb1c487cfa7: etrax-dma: Remove bogus if statement (2011-10-03 10:20:13 +0200) are available in the git repository at: git://github.com/avikivity/qemu.git uq/master Pulled. Thanks. Um, this had a comment about it regarding s/version bump/subsection/ Are ya'll planning on moving your repo back to kernel.org or sticking with github? We'll move back to kernel.org as soon as we sort around the keys. -- error compiling committee.c: too many arguments to function -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR
On 10/04/2011 05:20 PM, Marcelo Tosatti wrote:
> On Tue, Oct 04, 2011 at 07:53:42PM +0200, Avi Kivity wrote:
>> On 10/03/2011 10:55 PM, Marcelo Tosatti wrote:
>>> From: Liu, Jinsong jinsong@intel.com
>>>
>>> KVM add emulation of lapic tsc deadline timer for guest.
>>> This patch is co-operation work at qemu side.
>>>
>>> -#define CPU_SAVE_VERSION 12
>>> +#define CPU_SAVE_VERSION 13
>>
>> Unfortunate. Can't we use subsections?
>
> Yes, i'll look into it tomorrow.

Subsections are still broken at the moment although Juan has some patches. Bumping the version is the safe thing to do.

Regards,

Anthony Liguori
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 09:48 AM, Avi Kivity wrote:
> On 10/10/2011 04:41 PM, Anthony Liguori wrote:
>> On 10/03/2011 03:55 PM, Marcelo Tosatti wrote:
>>> The following changes since commit d11cf8cc80d946dfc9a23597cd9a0bb1c487cfa7:
>>>
>>>   etrax-dma: Remove bogus if statement (2011-10-03 10:20:13 +0200)
>>>
>>> are available in the git repository at:
>>>
>>>   git://github.com/avikivity/qemu.git uq/master
>>
>> Pulled. Thanks.
>
> Um, this had a comment about it regarding s/version bump/subsection/

Hrm, sorry about that. In the future, it would be helpful to explicitly withdraw a PULL request. Do you want me to revert?

FWIW, I think bumping the version is the right thing to do.

Regards,

Anthony Liguori

>> Are ya'll planning on moving your repo back to kernel.org or sticking
>> with github?
>
> We'll move back to kernel.org as soon as we sort around the keys.
Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR
On 10/10/2011 04:54 PM, Anthony Liguori wrote:
> On 10/04/2011 05:20 PM, Marcelo Tosatti wrote:
>> On Tue, Oct 04, 2011 at 07:53:42PM +0200, Avi Kivity wrote:
>>> On 10/03/2011 10:55 PM, Marcelo Tosatti wrote:
>>>> From: Liu, Jinsong jinsong@intel.com
>>>>
>>>> KVM add emulation of lapic tsc deadline timer for guest.
>>>> This patch is co-operation work at qemu side.
>>>>
>>>> -#define CPU_SAVE_VERSION 12
>>>> +#define CPU_SAVE_VERSION 13
>>>
>>> Unfortunate. Can't we use subsections?
>>
>> Yes, i'll look into it tomorrow.
>
> Subsections are still broken at the moment although Juan has some
> patches. Bumping the version is the safe thing to do.

It's irreversible, once we release a version with a bumped ID we can't go back.

-- 
error compiling committee.c: too many arguments to function
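The disagreement in this subthread comes down to what an older qemu does with a stream from a newer one. The toy model below tries to make that concrete. It is not QEMU's savevm format: the dict layout, the field names and all three functions are hypothetical, and the old reader is assumed to reject both a higher version number and any subsection it does not know.

```python
BASE_VERSION = 12


def save_with_version_bump(cpu):
    """Always emit the new field and a higher version number."""
    return {"version": BASE_VERSION + 1, "fields": dict(cpu)}


def save_with_subsection(cpu):
    """Keep the old version; emit the new field only when it is in use."""
    fields = {k: v for k, v in cpu.items() if k != "tsc_deadline"}
    stream = {"version": BASE_VERSION, "fields": fields, "subsections": {}}
    if cpu.get("tsc_deadline", 0):
        stream["subsections"]["tsc_deadline"] = cpu["tsc_deadline"]
    return stream


def old_reader_accepts(stream):
    """Model of a reader that predates the TSC deadline field."""
    if stream["version"] > BASE_VERSION:
        return False
    return not stream.get("subsections")
```

In this model a guest that never armed the deadline timer can still be loaded by the old reader when a subsection is used, but never after a version bump, which is one way to read the "irreversible" objection. Whether subsections were reliable enough at the time is the separate point raised in the reply, and the sketch says nothing about that.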
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 04:55 PM, Anthony Liguori wrote: Hrm, sorry about that. In the future, it would be helpful to explicitly withdraw a PULL request. Do you want me to revert? We'll send the revert together with the new patch. FWIW, I think bumping the version is the right thing to do. Why? -- error compiling committee.c: too many arguments to function
[PATCH 0/2] Debian preseed support
This patchset adds support for Debian preseed files (http://wiki.debian.org/DebianInstaller/Preseed). It comes with Ubuntu server 11.04 support. Later, more patches adding Debian and other Ubuntu server variants will be added. This patchset was also sent as a pull request: https://github.com/autotest/autotest/pull/34

Please review and comment.

Lucas Meneghel Rodrigues (2):
  KVM test: Introduce debian preseed unattended file support
  KVM test: guest-os.cfg: Introduce Ubuntu 11.04 server variant

 client/tests/kvm/guest-os.cfg.sample | 34 ++
 client/tests/kvm/tests/unattended_install.py | 31
 client/tests/kvm/unattended/Ubuntu-11-04.preseed | 42 ++
 3 files changed, 100 insertions(+), 7 deletions(-)
 create mode 100644 client/tests/kvm/unattended/Ubuntu-11-04.preseed

--
1.7.6.4
[PATCH 1/2] KVM test: Introduce debian preseed unattended file support
Add support for the Debian preseed (http://wiki.debian.org/DebianInstaller/Preseed) unattended install file format. In order to get a fully automated d-i install, we use the initrd preseed method (adding a preseed.cfg file at the top of the initrd filesystem). Tested with Ubuntu server 11.04; other Debian and Debian-based OS variants will be added in later patches.

Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com
---
 client/tests/kvm/tests/unattended_install.py | 31 ++
 1 files changed, 31 insertions(+), 0 deletions(-)

diff --git a/client/tests/kvm/tests/unattended_install.py b/client/tests/kvm/tests/unattended_install.py
index b1d23f6..f3f5268 100644
--- a/client/tests/kvm/tests/unattended_install.py
+++ b/client/tests/kvm/tests/unattended_install.py
@@ -407,6 +407,34 @@ class UnattendedInstallConfig(object):
         doc.writexml(fp)
 
 
+    def preseed_initrd(self):
+        """
+        Puts a preseed file inside a gz compressed initrd file.
+
+        Debian and Ubuntu use preseed as the OEM install mechanism. The only
+        way to get fully automated setup without resorting to kernel params
+        is to add a preseed.cfg file at the root of the initrd image.
+        """
+        logging.debug("Remastering initrd.gz file with preseed file")
+        dest_fname = 'preseed.cfg'
+        remaster_path = os.path.join(self.image_path, "initrd_remaster")
+        os.makedirs(remaster_path)
+
+        os.chdir(remaster_path)
+        utils.run("gzip -d < ../%s | cpio --extract --make-directories "
+                  "--no-absolute-filenames" % os.path.basename(self.initrd))
+        utils.run("cp %s %s" % (self.unattended_file, dest_fname))
+        utils.run("find . | cpio -H newc --create | gzip -9 > ../%s" %
+                  os.path.basename(self.initrd))
+        os.chdir(self.image_path)
+        utils.run("rm -rf initrd_remaster")
+        contents = open(self.unattended_file).read()
+
+        logging.debug("Unattended install contents:")
+        for line in contents.splitlines():
+            logging.debug(line)
+
+
     def setup_boot_disk(self):
         if self.unattended_file.endswith('.sif'):
             dest_fname = 'winnt.sif'
@@ -492,6 +520,9 @@ class UnattendedInstallConfig(object):
                                     (self.cdrom_cd1_mount, self.boot_path,
                                      os.path.basename(self.initrd), self.initrd))
                 utils.run(initrd_fetch_cmd)
+                if self.unattended_file.endswith('.preseed'):
+                    self.preseed_initrd()
+
             finally:
                 cleanup(self.cdrom_cd1_mount)
--
1.7.6.4
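The remastering step in this patch comes down to three shell pipelines: unpack the initrd, copy the preseed file to its root, repack it. The following is a minimal sketch of that sequence, not autotest code: `build_remaster_commands` is a hypothetical helper that only builds the command strings, and the `<` and `>` redirections are my reading of the flattened patch text, so treat them as assumptions.

```python
import os
import shlex


def build_remaster_commands(initrd_path, preseed_path, dest_fname="preseed.cfg"):
    """Return the shell commands that add a preseed file to a gzipped initrd.

    Hypothetical helper for illustration only. The commands are meant to be
    run from inside an empty scratch directory that sits next to the initrd.
    """
    initrd = shlex.quote(os.path.basename(initrd_path))
    return [
        # Unpack the compressed cpio archive into the current directory.
        "gzip -d < ../%s | cpio --extract --make-directories "
        "--no-absolute-filenames" % initrd,
        # Drop the preseed file at the root of the unpacked tree.
        "cp %s %s" % (shlex.quote(preseed_path), shlex.quote(dest_fname)),
        # Repack in 'newc' format and overwrite the original initrd.
        "find . | cpio -H newc --create | gzip -9 > ../%s" % initrd,
    ]
```

Building the strings separately from running them makes the sequence easy to log or inspect before anything touches a real image.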
[PATCH 2/2] KVM test: guest-os.cfg: Introduce Ubuntu 11.04 server variant
Add a Ubuntu 11.04 server variant, with unattended install set. With this, it's possible to install the latest Ubuntu server (as of the time of this patch). A preseed file comes together. Signed-off-by: Lucas Meneghel Rodrigues l...@redhat.com --- client/tests/kvm/guest-os.cfg.sample | 34 ++ client/tests/kvm/unattended/Ubuntu-11-04.preseed | 42 ++ 2 files changed, 69 insertions(+), 7 deletions(-) create mode 100644 client/tests/kvm/unattended/Ubuntu-11-04.preseed diff --git a/client/tests/kvm/guest-os.cfg.sample b/client/tests/kvm/guest-os.cfg.sample index 17d6114..f7d5a98 100644 --- a/client/tests/kvm/guest-os.cfg.sample +++ b/client/tests/kvm/guest-os.cfg.sample @@ -906,29 +906,35 @@ variants: md5sum_cd1 = d2e10420f3689faa49a004b60fb396b7 md5sum_1m_cd1 = f7f67b5da46923a9f01da8a2b6909654 -- @Ubuntu: +- Ubuntu: shell_prompt = ^root@.*[\#\$]\s*$ +password = 12345678 +image_name = ubuntu +unattended_install: +kernel = linux +initrd = initrd +wait_no_ack = yes variants: -- Ubuntu-6.10-32: +- 6.10-32: only install -image_name = ubuntu-6.10-32 +image_name += -6.10-32 steps = steps/Ubuntu-6.10-32.steps cdrom_cd1 = isos/linux/ubuntu-6.10-desktop-i386.iso md5sum_cd1 = 17fb825641571ce5888a718329efd016 md5sum_1m_cd1 = 7531d0a84e7451d17c5d976f1c3f8509 -- Ubuntu-8.04-32: +- 8.04-32: skip = yes -image_name = ubuntu-8.04-32 +image_name += -8.04-32 install: steps = steps/Ubuntu-8.04-32.steps cdrom_cd1 = isos/linux/ubuntu-8.04.1-desktop-i386.iso setup: steps = steps/Ubuntu-8.04-32-setupssh.steps -- Ubuntu-8.10-server-32: -image_name = ubuntu-8.10-server-32 +- 8.10-server-32: +image_name += -8.10-server-32 install: steps = steps/Ubuntu-8.10-server-32.steps cdrom_cd1 = isos/linux/ubuntu-8.10-server-i386.iso @@ -937,6 +943,20 @@ variants: setup: steps = steps/Ubuntu-8.10-server-32-gcc.steps +- 11.04-server-64: +image_name += -11.04-server-64 +unattended_install: +extra_params += --append 'console=ttyS0,115200 console=tty0' +kernel = images/ubuntu-server-11-04-64/vmlinuz +initrd = 
images/ubuntu-server-11-04-64/initrd.gz +boot_path = install +unattended_install.cdrom: +unattended_file = unattended/Ubuntu-11-04.preseed +cdrom_cd1 = isos/linux/ubuntu-11.04-server-amd64.iso +md5sum_cd1 = 355ca2417522cb4a77e0295bf45c5cd5 +md5sum_1m_cd1 = 65b1514744bf99e88f6228e9b6f152a8 + + - DSL-4.2.5: no setup dbench bonnie linux_s3 image_name = dsl-4.2.5 diff --git a/client/tests/kvm/unattended/Ubuntu-11-04.preseed b/client/tests/kvm/unattended/Ubuntu-11-04.preseed new file mode 100644 index 000..b4bec84 --- /dev/null +++ b/client/tests/kvm/unattended/Ubuntu-11-04.preseed @@ -0,0 +1,42 @@ +debconf debconf/priority string critical +unknown debconf/priority string critical +d-i debconf/priority string critical +d-i debian-installer/locale string en_US +d-i console-tools/archs select at +d-i console-keymaps-at/keymap select us + +d-i netcfg/choose_interface select auto +d-i netcfg/get_hostname string unassigned-hostname +d-i netcfg/get_domain string unassigned-domain +d-i netcfg/wireless_wep string + +d-i clock-setup/utc boolean true +d-i time/zone string US/Eastern + +d-i partman-auto/method string regular +d-i partman-auto/choose_recipe select home +d-i partman/confirm_write_new_label boolean true +d-i partman/choose_partition select finish +d-i partman/confirm boolean true +d-i partman/confirm_nooverwrite boolean true + +d-i passwd/root-login boolean true +d-i passwd/make-user boolean false +d-i passwd/root-password password 12345678 +d-i passwd/root-password-again password 12345678 + +tasksel tasksel/first multiselect standard + +d-i pkgsel/include string openssh-server build-essential + +d-i
Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR
On 10/10/2011 09:58 AM, Avi Kivity wrote: On 10/10/2011 04:54 PM, Anthony Liguori wrote: On 10/04/2011 05:20 PM, Marcelo Tosatti wrote: On Tue, Oct 04, 2011 at 07:53:42PM +0200, Avi Kivity wrote: On 10/03/2011 10:55 PM, Marcelo Tosatti wrote: From: Liu, Jinsongjinsong@intel.com KVM adds emulation of the lapic tsc deadline timer for guests. This patch is the co-operating work on the qemu side. -#define CPU_SAVE_VERSION 12 +#define CPU_SAVE_VERSION 13 Unfortunate. Can't we use subsections? Yes, I'll look into it tomorrow. Subsections are still broken at the moment although Juan has some patches. Bumping the version is the safe thing to do. It's irreversible: once we release a version with a bumped ID we can't go back. But the question is whether we've bumped *any* versions of common devices since 0.15, because if so, it's moot here. Once any device bumps a version id, migration is incompatible. Subsections are nice for stable branches, but they don't solve inter-version compatibility. Most importantly, subsections are broken today, so until we straighten things out there, we can't rely on them. Regards, Anthony Liguori
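The trade-off argued over in this thread can be modelled in a few lines. This is a toy sketch, not QEMU's vmstate code: `can_load`, the dict layout and the subsection name `tsc_deadline` are all hypothetical. It assumes that a destination rejects a section whose version is newer than it supports, and that an optional subsection is sent only when its state is in use. It also assumes subsection detection works, which the thread says was not yet the case.

```python
def can_load(dest_max_version, dest_known_subsections, stream):
    """Return True if a destination can accept the incoming device section.

    stream is a dict: {"version": int, "subsections": [names actually sent]}.
    """
    if stream["version"] > dest_max_version:
        return False  # version bump: older destinations always reject
    # Every subsection present in the stream must be understood.
    return all(s in dest_known_subsections for s in stream["subsections"])


def save_with_version_bump(tsc_deadline_in_use):
    # The new field is an unconditional part of version 13.
    return {"version": 13, "subsections": []}


def save_with_subsection(tsc_deadline_in_use):
    # Version stays at 12; the extra state is sent only if the guest uses it.
    subs = ["tsc_deadline"] if tsc_deadline_in_use else []
    return {"version": 12, "subsections": subs}
```

In this model a version-12 destination can still accept a subsection-based stream as long as the guest never armed the deadline timer, while a version bump rules out that direction entirely.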
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 10:00 AM, Avi Kivity wrote: On 10/10/2011 04:55 PM, Anthony Liguori wrote: Hrm, sorry about that. In the future, it would be helpful to explicitly withdraw a PULL request. Do you want me to revert? We'll send the revert together with the new patch. FWIW, I think bumping the version is the right thing to do. Why? Because we still haven't fixed subsections. Juan's patches help but they can still result in sending bad data to an older QEMU that won't handle it properly. We need to figure out a proper fix for subsections, either Paolo's protocol change or moving subsections out to a detected section. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 05:12 PM, Anthony Liguori wrote: On 10/10/2011 10:00 AM, Avi Kivity wrote: On 10/10/2011 04:55 PM, Anthony Liguori wrote: Hrm, sorry about that. In the future, it would be helpful to explicitly withdraw a PULL request. Do you want me to revert? We'll send the revert together with the new patch. FWIW, I think bumping the version is the right thing to do. Why? Because we still haven't fixed subsections. Juan's patches help but they can still result in sending bad data to an older QEMU that won't handle it properly. We need to figure out a proper fix for subsections, either Paolo's protocol change or moving subsections out to a detected section. Ok. So bumping the version is only right if we don't fix subsections. -- error compiling committee.c: too many arguments to function
Re: [Qemu-devel] [PATCH 1/1] kvm: support TSC deadline MSR
On 10/10/2011 05:10 PM, Anthony Liguori wrote: It's irreversible, once we release a version with a bumped ID we can't go back. But the question is whether we've bumped *any* versions of common devices since 0.15 because if so, it's moot here. What's the answer? And if the answer is we did, why did we? Once any device bumps a version id, migration is incompatible. Subsections are nice for stable branches, but they don't solve inter-version compatibility. Why not? Though I agree it's a long shot to get it to work, since there are so many changes and it's easy to get any one of them wrong. You had an idea once to fingerprint the migration format and compare it across versions? I assume the almighty visitor can make this very simple? Most importantly, subsections are broken today so until we straighten things out there, we can't rely on them. Let's not throw a permanent baby out with the temporary bath water. -- error compiling committee.c: too many arguments to function
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
Hi Ohad,

On Mon, Oct 10, 2011 at 09:59:22AM -0400, Ohad Ben-Cohen wrote: Also, the bus_set_iommu interface is now in the -next branch. Would be good if you rebase the patches to that interface.

Sure. It's a little tricky though: which branch do I base this on? Are you ok with me basing this on your 'next' branch? My current stack depends on at least three branches of yours, so that would be helpful for me (and fewer merge conflicts for you I guess :).

The master branch is best to base your patches on for generic work. For more specific things like omap-only changes you can use the topic branches.

I think we need some care here and check pgsize for 0. A BUG_ON should do.

I can add it if you prefer, but I don't think it can really happen: basically, it means that we chose a too small and unsupported page bit, which can't happen as long as we check for IS_ALIGNED(iova | paddr | size, iommu_min_pagesz) in the beginning of iommu_map.

It can happen when there is a bug somewhere :) So a BUG_ON will yell then and make debugging easier. An alternative is to use a WARN_ON and let the map-call fail in this case.

Ok, let's discuss the semantics of ->unmap(). There isn't clear documentation of that API (we should probably add some kernel docs after we nail it down now), but judging from the existing users (mainly kvm) and drivers, it seems that iommu_map() and iommu_unmap() aren't symmetric: users rely on unmap() to return the actual size that was unmapped. IOMMU drivers, in turn, should check which page is mapped on 'iova', unmap it, and return its size.

Right, currently the map/unmap calls are not symmetric. But I think they should be, to get clean semantics. Without this requirement, and with multiple page sizes in use, the iommu code may have to unmap more address space than requested. The user doesn't know what will be unmapped, so it has to make sure that no DMA is happening while unmap runs.
When we require the calls to be symmetric we can give a guarantee that only the requested region is unmapped, and allow DMA to the untouched part of the address space while unmap() is running. So where the call sites do not follow this restriction we should convert them mid-term.

This way iommu_unmap() becomes very simple: it just iterates through the region, relying on iommu_ops->unmap() to return the sizes that were actually unmapped (very similar to how amd's iommu_unmap_page works today). This also means that iommu_ops->unmap() doesn't really need a size/order argument and we can remove it (after all drivers fully migrate..).

Yes, something like that. Probably the iommu_ops->unmap function should be turned into an unmap_page function call which only takes an iova and no size parameter. The iommu driver unmaps the page pointing to that iova and returns the size of the page unmapped. This still allows the simple implementation for the unmap call. This change is not a requirement for this patch-set, but if we agree on it this patch-set should keep that direction in mind.

The other approach which you suggest means symmetric iommu_map() and iommu_unmap(). It means adding a 'paddr' parameter to iommu_unmap(), which is easy, but maybe more concerning is the limitation that it incurs: users will now have to call iommu_unmap() exactly as they called iommu_map() beforehand. Not sure how well this will fly with the existing users (kvm ?) and whether we really want to enforce this (it doesn't mean drivers need to deal with page-size complexity. they are required to unmap a single page at a time, and iommu_unmap() will do the work for them).

It will work with KVM, that is no problem. We don't really need to enforce the calls to be symmetric. But we can define that we only give the guarantee about what will be unmapped when the calls are symmetric.
For example:

iommu_map( 0, 0x10);
iommu_unmap(0, 0x10); /* Guarantee that it will only unmap the range 0-0x10 */

whereas:

iommu_map( 0, 0x10);
iommu_unmap(0, 0x1000); /* Guarantees that 0-0x1000 is unmapped, but other undefined parts of the address space may be unmapped too, up to the whole address space */

The alternative is that we implement page-splitting in the iommu_unmap function. But that introduces complexity I am not sure we really need. KVM for example just unmaps the whole address space on destruction. For the generic dma_ops this is also not required, because the dma_map* functions already have the requirement to be symmetric.

Another discussion: I think we better change iommu_ops->map() to directly take a 'size' (in bytes) instead of an 'order' (of pages). Most (all?) drivers just immediately do 'size = 0x1000UL << gfp_order', so this whole size -> order -> size back and forth seems redundant.
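The unmap semantics discussed here (a driver-side unmap_page that takes only an iova and returns the size it unmapped, and a core loop built on top of it) can be sketched as a small model. This is a toy under stated assumptions, not kernel code: `ToyDomain` is hypothetical, a dict stands in for the page tables, and `unmap_page` is only ever called with an iova that starts a page.

```python
class ToyDomain:
    """Toy IOMMU domain: records iova -> page size for each mapped page."""

    def __init__(self):
        self.pages = {}

    def map_page(self, iova, size):
        self.pages[iova] = size

    def unmap_page(self, iova):
        """Driver-side op: unmap the page at iova, return its size (0 if none)."""
        return self.pages.pop(iova, 0)


def iommu_unmap(domain, iova, size):
    """Core-side loop: unmap page by page until at least `size` bytes are gone.

    Returns the number of bytes actually unmapped, which can exceed `size`
    when the region was mapped with a page larger than the request.
    """
    unmapped = 0
    while unmapped < size:
        page = domain.unmap_page(iova + unmapped)
        if page == 0:
            break  # hole in the mapping; nothing more to unmap here
        unmapped += page
    return unmapped
```

The asymmetric case from the example above shows up directly in the model: asking to unmap 0x1000 bytes out of a region mapped as one large page removes the whole large page, and the caller learns that only from the return value.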
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 10:24 AM, Avi Kivity wrote: On 10/10/2011 05:12 PM, Anthony Liguori wrote: On 10/10/2011 10:00 AM, Avi Kivity wrote: On 10/10/2011 04:55 PM, Anthony Liguori wrote: Hrm, sorry about that. In the future, it would be helpful to explicitly withdraw a PULL request. Do you want me to revert? We'll send the revert together with the new patch. FWIW, I think bumping the version is the right thing to do. Why? Because we still haven't fixed subsections. Juan's patches help but they can still result in sending bad data to an older QEMU that won't handle it properly. We need to figure out a proper fix for subsections, either Paolo's protocol change or moving subsections out to a detected section. Ok. So bumping the version is only right if we don't fix subsections. If we bump *any* version from 0.15 -> 1.0, then there's no point at all in having a subsection. If we break compatibility by using Paolo's new protocol, or doing subsections as sections, then there's no point in making it a subsection either. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 0/1] [PULL] qemu-kvm.git uq/master queue
On 10/10/2011 05:36 PM, Anthony Liguori wrote: Ok. So bumping the version is only right if we don't fix subsections. If we bump *any* version from 0.15 -> 1.0, then there's no point at all in having a subsection. Did we bump versions of relevant devices? If we break compatibility by using Paolo's new protocol, or doing subsections as sections, then there's no point in making it a subsection either. These can be worked around. For example, if you migrate 0.15 to 1.0 you start the destination with -old-subsection-format. Even if you don't, since subsections are rarely present, migration will succeed. -- error compiling committee.c: too many arguments to function
[PATCH] Fix include dependency for mmu_notifier
The kvm_host struct can include an mmu_notifier struct but mmu_notifier.h is not included directly.

Signed-off-by: Eric B Munson emun...@mgebm.net
---
 include/linux/kvm_host.h |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index eabb21a..d2c79a9 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -14,6 +14,7 @@
 #include <linux/signal.h>
 #include <linux/sched.h>
 #include <linux/mm.h>
+#include <linux/mmu_notifier.h>
 #include <linux/preempt.h>
 #include <linux/msi.h>
 #include <linux/slab.h>
--
1.7.4.1
Re: [PATCH] virtio-9p: fix QEMU build break
On Mon, 10 Oct 2011 18:30:28 +0800, Zhi Yong Wu wu...@linux.vnet.ibm.com wrote: qemu build break due to the redefinition of struct file_handle. My qemu.git/HEAD is 8acbc9b21d757a6be4f8492e547b8159703a0547 Below is the log:

[root@f15 qemu]# make
  CC    qapi-generated/qga-qapi-types.o
  LINK  qemu-ga
  CC    libhw64/9pfs/virtio-9p-handle.o
/home/zwu/work/virt/qemu/hw/9pfs/virtio-9p-handle.c:31:8: error: redefinition of struct file_handle
/usr/include/bits/fcntl.h:254:8: note: originally defined here
make[1]: *** [9pfs/virtio-9p-handle.o] Error 1
make: *** [subdir-libhw64] Error 2
[root@f15 qemu]# rpm -qf /usr/include/bits/fcntl.h
glibc-headers-2.13.90-9.x86_64

Is this a backported glibc? On my ubuntu system glibc 2.13 doesn't provide struct file_handle. I also checked the glibc repo at http://repo.or.cz/w/glibc.git. The commit introducing struct file_handle is

$ git describe --contains 158648c0bdda281e252a27c0200dd0ea6f4e0215
glibc-2.14~200

Signed-off-by: Zhi Yong Wu wu...@linux.vnet.ibm.com --- hw/9pfs/virtio-9p-handle.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c index 5c8b5ed..5b3a867 100644 --- a/hw/9pfs/virtio-9p-handle.c +++ b/hw/9pfs/virtio-9p-handle.c @@ -27,7 +27,7 @@ struct handle_data { int handle_bytes; }; -#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 14 +#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 13 struct file_handle { unsigned int handle_bytes; int handle_type; --

-aneesh
Re: [PATCH] virtio-9p: fix QEMU build break
On Mon, 10 Oct 2011 22:05:21 +0530, Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote: On Mon, 10 Oct 2011 18:30:28 +0800, Zhi Yong Wu wu...@linux.vnet.ibm.com wrote: qemu build break due to the redefinition of struct file_handle. My qemu.git/HEAD is 8acbc9b21d757a6be4f8492e547b8159703a0547 Below is the log:

[root@f15 qemu]# make
  CC    qapi-generated/qga-qapi-types.o
  LINK  qemu-ga
  CC    libhw64/9pfs/virtio-9p-handle.o
/home/zwu/work/virt/qemu/hw/9pfs/virtio-9p-handle.c:31:8: error: redefinition of struct file_handle
/usr/include/bits/fcntl.h:254:8: note: originally defined here
make[1]: *** [9pfs/virtio-9p-handle.o] Error 1
make: *** [subdir-libhw64] Error 2
[root@f15 qemu]# rpm -qf /usr/include/bits/fcntl.h
glibc-headers-2.13.90-9.x86_64

Is this a backported glibc? On my ubuntu system glibc 2.13 doesn't provide struct file_handle. I also checked the glibc repo at http://repo.or.cz/w/glibc.git. The commit introducing struct file_handle is

$ git describe --contains 158648c0bdda281e252a27c0200dd0ea6f4e0215
glibc-2.14~200

How about the patch below? This means that the handle driver will only work with the latest glibc. Even if I have the latest kernel, with an older glibc the handle fs driver backend will be disabled.
diff --git a/configure b/configure index 24b8df4..0216c53 100755 --- a/configure +++ b/configure
@@ -2551,6 +2551,18 @@ EOF fi ## +# check if we have open_by_handle_at + +open_by_handle_at=no +cat > $TMPC << EOF +#include <fcntl.h> +int main(void) { struct file_handle *fh; open_by_handle_at(0, fh, 0); } +EOF +if compile_prog "" "" ; then +open_by_handle_at=yes +fi + +## # End of CC checks # After here, no more $cc or $ld runs
@@ -3029,6 +3041,10 @@ if test "$ucontext_coroutine" = "yes" ; then echo "CONFIG_UCONTEXT_COROUTINE=y" >> $config_host_mak fi +if test "$open_by_handle_at" = "yes" ; then + echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak +fi + # USB host support case "$usb" in linux)
diff --git a/hw/9pfs/virtio-9p-handle.c b/hw/9pfs/virtio-9p-handle.c index 68e1d9b..bd73d31 100644 --- a/hw/9pfs/virtio-9p-handle.c +++ b/hw/9pfs/virtio-9p-handle.c
@@ -30,13 +30,24 @@ struct handle_data { int handle_bytes; }; -#if __GLIBC__ <= 2 && __GLIBC_MINOR__ < 14 +#ifdef CONFIG_OPEN_BY_HANDLE +static inline int name_to_handle(int dirfd, const char *name, + struct file_handle *fh, int *mnt_id, int flags) +{ +return name_to_handle_at(dirfd, name, fh, mnt_id, flags); +} + +static inline int open_by_handle(int mountfd, const char *fh, int flags) +{ +return open_by_handle_at(mountfd, fh, flags); +} +#else + struct file_handle { -unsigned int handle_bytes; -int handle_type; -unsigned char handle[0]; +unsigned int handle_bytes; +int handle_type; +unsigned char handle[0]; }; -#endif #ifndef AT_EMPTY_PATH #define AT_EMPTY_PATH 0x1000 /* Allow empty relative pathname */
@@ -45,28 +56,6 @@ struct file_handle { #define O_PATH01000 #endif -#ifndef __NR_name_to_handle_at -#if defined(__i386__) -#define __NR_name_to_handle_at 341 -#define __NR_open_by_handle_at 342 -#elif defined(__x86_64__) -#define __NR_name_to_handle_at 303 -#define __NR_open_by_handle_at 304 -#endif -#endif - -#ifdef __NR_name_to_handle_at -static inline int name_to_handle(int dirfd, const char *name, - struct file_handle *fh, int *mnt_id, int flags) -{ 
-return syscall(__NR_name_to_handle_at, dirfd, name, fh, mnt_id, flags); -} - -static inline int open_by_handle(int mountfd, const char *fh, int flags) -{ -return syscall(__NR_open_by_handle_at, mountfd, fh, flags); -} -#else static inline int name_to_handle(int dirfd, const char *name, struct file_handle *fh, int *mnt_id, int flags) {
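The configure check in this patch is a compile-time probe: build a tiny program that calls open_by_handle_at() and see whether it compiles. For comparison, here is a rough runtime analogue of the same idea. It is not what configure does, and `libc_has_symbol` is a hypothetical helper name; it asks the C library already loaded into the current process whether it exports a symbol.

```python
import ctypes


def libc_has_symbol(name):
    """Return True if the C library loaded into this process exports `name`."""
    try:
        # CDLL(None) maps to dlopen(NULL): all symbols visible to this process.
        libc = ctypes.CDLL(None)
    except (OSError, TypeError):
        return False  # platform without dlopen(NULL) semantics
    # ctypes raises AttributeError for a missing symbol, which hasattr() absorbs.
    return hasattr(libc, name)
```

A call such as `libc_has_symbol("open_by_handle_at")` then reports whether the wrapper exists in the running glibc. That is the same property the patch keys on, checked at a different time.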
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
Hi Joerg,

On Mon, Oct 10, 2011 at 5:36 PM, Roedel, Joerg joerg.roe...@amd.com wrote: The master branch is best to base your patches on for generic work.

Oh, great. thanks.

It can happen when there is a bug somewhere :)

Hmm, bug ? ;) Ok, I'll add a BUG_ON :)

Yes, something like that. Probably the iommu_ops->unmap function should be turned into an unmap_page function call which only takes an iova and no size parameter. The iommu driver unmaps the page pointing to that iova and returns the size of the page unmapped. This still allows the simple implementation for the unmap call.

Yes, exactly. It will take some time to migrate all drivers (today we have 4 drivers, each of which implements slightly different ->unmap() semantics), but at least let's not accept any new driver that doesn't adhere to this, otherwise it's going to be even harder for the API to evolve.

This change is not a requirement for this patch-set, but if we agree on it this patch-set should keep that direction in mind.

Definitely, thanks.

We don't really need to enforce the calls to be symmetric. But we can define that we only give the guarantee about what will be unmapped when the calls are symmetric.

Sounds good to me. I'll add this to the kernel doc patch (which I'll submit after this patch set materializes), and when/if we move to symmetric only, we will update it.

The alternative is that we implement page-splitting in the iommu_unmap function. But that introduces complexity I am not sure we really need.

Yeah, me neither.

Yes, this get_order thing should be changed to size long-term.

Good. That should be a simple change, I'll do it after this patch set.

Thanks, Ohad.
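The page-size selection and the "pgsize must not be 0" check agreed on in this exchange can be sketched as follows. This is a minimal model under stated assumptions, not the patch itself: `pick_pgsize`, `split_mapping` and the `pgsize_bitmap` argument (one bit set per supported page size) are hypothetical names, and the exception stands in for the BUG_ON.

```python
def pick_pgsize(iova, paddr, size, pgsize_bitmap):
    """Pick the largest supported page size usable for the next chunk."""
    addr_merge = iova | paddr
    # Largest power of two that still fits in the remaining size.
    pgsize_idx = size.bit_length() - 1
    if addr_merge:
        # The alignment of both addresses also limits the page size.
        align_idx = (addr_merge & -addr_merge).bit_length() - 1
        pgsize_idx = min(pgsize_idx, align_idx)
    # Keep only the supported sizes up to and including 2**pgsize_idx.
    candidates = pgsize_bitmap & ((1 << (pgsize_idx + 1)) - 1)
    if candidates == 0:
        # Stand-in for the BUG_ON: no supported page size fits.
        raise ValueError("no supported page size fits")
    return 1 << (candidates.bit_length() - 1)


def split_mapping(iova, paddr, size, pgsize_bitmap):
    """Split a region into (iova, paddr, pgsize) chunks, biggest pages first."""
    min_pagesz = pgsize_bitmap & -pgsize_bitmap
    if (iova | paddr | size) & (min_pagesz - 1):
        raise ValueError("iova, paddr and size must be aligned to the minimum page size")
    chunks = []
    while size:
        pgsize = pick_pgsize(iova, paddr, size, pgsize_bitmap)
        chunks.append((iova, paddr, pgsize))
        iova += pgsize
        paddr += pgsize
        size -= pgsize
    return chunks
```

With the up-front alignment check in place the zero case cannot be reached, which is the point made in the thread. The explicit check exists only to make a bug elsewhere fail loudly.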
Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks
On 10/10/2011 07:01 AM, Stephan Diestelhorst wrote: On Monday 10 October 2011, 07:00:50 Stephan Diestelhorst wrote: On Thursday 06 October 2011, 13:40:01 Jeremy Fitzhardinge wrote: On 10/06/2011 07:04 AM, Stephan Diestelhorst wrote: On Wednesday 28 September 2011, 14:49:56 Linus Torvalds wrote: Which certainly should *work*, but from a conceptual standpoint, isn't it just *much* nicer to say we actually know *exactly* what the upper bits were. Well, we really do NOT want atomicity here. What we really rather want is sequentiality: free the lock, make the update visible, and THEN check if someone has gone sleeping on it. Atomicity only conveniently enforces that the three do not happen in a different order (with the store becoming visible after the checking load). This does not have to be atomic, since spurious wakeups are not a problem, in particular not with the FIFO-ness of ticket locks. For that the fence, additional atomic etc. would be IMHO much cleaner than the crazy overflow logic. All things being equal I'd prefer lock-xadd just because it's easier to analyze the concurrency for, crazy overflow tests or no. But if add+mfence turned out to be a performance win, then that would obviously tip the scales. However, it looks like locked xadd also has better performance: on my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower than locked xadd, so that pretty much settles it unless you think there'd be a dramatic difference on an AMD system. Indeed, the fences are usually slower than locked RMWs, in particular, if you do not need to add an instruction. I originally missed that amazing stunt the GCC pulled off with replacing the branch with carry flag magic. It seems that two twisted minds have found each other here :) One of my concerns was adding a branch in here... so that is settled, and if everybody else feels like this is easier to reason about... go ahead :) (I'll keep my itch to myself then.) Just that I can't... 
if performance is a concern, adding the LOCK prefix to the addb outperforms the xadd significantly: Hm, yes. So using the lock prefix on add instead of the mfence? Hm. J
Re: [Xen-devel] [PATCH 00/10] [PATCH RFC V2] Paravirtualized ticketlocks
On 10/10/2011 12:32 AM, Ingo Molnar wrote: * Jeremy Fitzhardinge jer...@goop.org wrote: On 10/06/2011 10:40 AM, Jeremy Fitzhardinge wrote: However, it looks like locked xadd also has better performance: on my Sandybridge laptop (2 cores, 4 threads), the add+mfence is 20% slower than locked xadd, so that pretty much settles it unless you think there'd be a dramatic difference on an AMD system. Konrad measured add+mfence to be about 65% slower on AMD Phenom as well. xadd also results in smaller/tighter code, right? Not particularly, mostly because of the overflow-into-the-high-part compensation. But it's only a couple of extra instructions, and no conditionals, so I don't think it would have any concrete effect. But, as Stephan points out, perhaps locked add is preferable to locked xadd, since it also has the same barrier as mfence but has (significantly!) better performance than either mfence or locked xadd... J
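The ordering requirement at the heart of this thread (free the lock, make the update visible, and only then check whether the next waiter went to sleep) can be shown with a toy model. This is a single-threaded sketch with hypothetical names. It illustrates only the FIFO hand-off and the release-then-check order, and says nothing about the memory-ordering cost of add+mfence versus locked add or xadd that the thread is measuring.

```python
class ToyTicketLock:
    """Single-threaded model of a ticket lock with sleeping waiters."""

    def __init__(self):
        self.head = 0          # ticket currently being served
        self.tail = 0          # next ticket to hand out
        self.sleepers = set()  # tickets whose owners blocked instead of spinning
        self.kicked = []       # tickets that were woken up, in order

    def take_ticket(self):
        ticket = self.tail
        self.tail += 1
        return ticket

    def is_owner(self, ticket):
        return self.head == ticket

    def block(self, ticket):
        # The waiter gave up spinning and asks to be kicked when its turn comes.
        self.sleepers.add(ticket)

    def unlock(self):
        # Step 1: release the lock by advancing head.
        self.head += 1
        # Step 2, which must be ordered after step 1 on real hardware:
        # check whether the new owner is asleep and needs a kick.
        if self.head in self.sleepers:
            self.sleepers.discard(self.head)
            self.kicked.append(self.head)
```

If step 2 were allowed to run before step 1 became visible, a waiter could go to sleep just after the check and never be kicked. That missed wakeup is what the fence or locked instruction in the real code guards against.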
Fwd: qemudParsePCIDeviceStrs warning
Hi, I have some strange entries in syslog, about which I couldn't really find any information on Google. One forum said to disable apparmor, but I don't have it running on this system:

root@muramasa:~# grep libvirt /var/log/syslog
Oct 10 12:03:55 muramasa libvirtd: 12:03:55.308: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Oct 10 12:03:55 muramasa libvirtd: 12:03:55.905: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Oct 10 12:03:56 muramasa libvirtd: 12:03:56.545: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Oct 10 12:03:57 muramasa libvirtd: 12:03:57.292: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed
Oct 10 12:03:57 muramasa libvirtd: 12:03:57.928: warning : qemudParsePCIDeviceStrs:1422 : Unexpected exit status '1', qemu probably failed

Some information about the hardware and software:
CPU: AMD Phenom(tm) II X6 1090T Processor
kvm version: QEMU PC emulator version 0.12.5 (qemu-kvm-0.12.5)
on ubuntu squeeze: 2.6.32-5-amd64
arch: x86-64
6 guests with debian squeeze 64bit

The guests start after a reboot, so I am not sure why I see these in syslog. In messages I see one extra line, but I am guessing this is from the reboot:

Oct 10 10:47:14 muramasa libvirtd: 10:47:14.135: warning : qemudDispatchSignalEvent:396 : Shutting down on signal 15

Thanks for the info in advance. Krisztian
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
On Mon, Oct 10, 2011 at 5:36 PM, Roedel, Joerg joerg.roe...@amd.com wrote:
> The master branch is best to base your patches on for generic work.

It looks like the master branch is missing something like this:

From acb316aa4bcaf383e8cb1580e30c8635e0a34369 Mon Sep 17 00:00:00 2001
From: Ohad Ben-Cohen o...@wizery.com
Date: Mon, 10 Oct 2011 23:55:51 +0200
Subject: [PATCH] iommu/core: fix build issue

Fix this:

drivers/iommu/iommu.c: In function 'iommu_commit':
drivers/iommu/iommu.c:291: error: 'iommu_ops' undeclared (first use in this function)

Signed-off-by: Ohad Ben-Cohen o...@wizery.com
---
 drivers/iommu/iommu.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 909b0d2..a5131f1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -288,7 +288,7 @@ EXPORT_SYMBOL_GPL(iommu_unmap);
 
 void iommu_commit(struct iommu_domain *domain)
 {
-	if (iommu_ops->commit)
-		iommu_ops->commit(domain);
+	if (domain->ops->commit)
+		domain->ops->commit(domain);
 }
 EXPORT_SYMBOL_GPL(iommu_commit);
-- 
1.7.4.1
Re: [PATCH v3 1/6] iommu/core: split mapping to page sizes as supported by the hardware
Hi Joerg,

On Mon, Oct 10, 2011 at 5:36 PM, Roedel, Joerg joerg.roe...@amd.com wrote:
> The master branch is best to base your patches on for generic work.

Done. I've revised the patches and attached the main one below; please tell me if it looks ok, and then I'll resubmit the entire patch set.

Thanks,
Ohad.

commit bf1d730b5f4f7631becfcd4be52693d85bfea36b
Author: Ohad Ben-Cohen o...@wizery.com
Date: Mon Oct 10 23:50:55 2011 +0200

iommu/core: split mapping to page sizes as supported by the hardware

When mapping a memory region, split it to page sizes as supported by the iommu hardware. Always prefer bigger pages, when possible, in order to reduce the TLB pressure.

The logic to do that is now added to the IOMMU core, so neither the iommu drivers themselves nor users of the IOMMU API have to duplicate it.

This allows a more lenient granularity of mappings; traditionally the IOMMU API took 'order' (of a page) as a mapping size, and directly let the low level iommu drivers handle the mapping, but now that the IOMMU core can split arbitrary memory regions into pages, we can remove this limitation, so users don't have to split those regions by themselves.

Currently the supported page sizes are advertised once and they then remain static. That works well for OMAP and MSM but it would probably not fly well with Intel's hardware, where the page size capabilities seem to have the potential to be different between several DMA remapping devices.

register_iommu() currently sets a default pgsize behavior, so we can convert the IOMMU drivers in subsequent patches, and after all the drivers are converted, register_iommu will be changed (and the temporary default settings will be removed).

Mainline users of the IOMMU API (kvm and omap-iovmm) are adapted to send the mapping size in bytes instead of in page order.

Many thanks to Joerg Roedel joerg.roe...@amd.com for significant review!
Signed-off-by: Ohad Ben-Cohen o...@wizery.com
Cc: David Brown dav...@codeaurora.org
Cc: David Woodhouse dw...@infradead.org
Cc: Joerg Roedel joerg.roe...@amd.com
Cc: Stepan Moskovchenko step...@codeaurora.org
Cc: KyongHo Cho pullip@samsung.com
Cc: Hiroshi DOYU hd...@nvidia.com
Cc: Laurent Pinchart laurent.pinch...@ideasonboard.com
Cc: kvm@vger.kernel.org

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 73778b7..909b0d2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -16,6 +16,8 @@
  * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
  */
 
+#define pr_fmt(fmt)    "%s: " fmt, __func__
+
 #include <linux/device.h>
 #include <linux/kernel.h>
 #include <linux/bug.h>
@@ -47,6 +49,19 @@ int bus_set_iommu(struct bus_type *bus, struct iommu_ops *ops)
 	if (bus->iommu_ops != NULL)
 		return -EBUSY;
 
+	/*
+	 * Set the default pgsize values, which retain the existing
+	 * IOMMU API behavior: drivers will be called to map
+	 * regions that are sized/aligned to order of 4KiB pages.
+	 *
+	 * This will be removed once all drivers are migrated.
+	 */
+	if (!ops->pgsize_bitmap)
+		ops->pgsize_bitmap = ~0xFFFUL;
+
+	/* find out the minimum page size only once */
+	ops->min_pagesz = 1 << __ffs(ops->pgsize_bitmap);
+
 	bus->iommu_ops = ops;
 
 	/* Do IOMMU specific setup for this bus-type */
@@ -157,33 +172,117 @@ int iommu_domain_has_cap(struct iommu_domain *domain,
 EXPORT_SYMBOL_GPL(iommu_domain_has_cap);
 
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
-	      phys_addr_t paddr, int gfp_order, int prot)
+	      phys_addr_t paddr, int size, int prot)
 {
-	size_t size;
+	unsigned long orig_iova = iova;
+	int ret = 0, orig_size = size;
 
 	if (unlikely(domain->ops->map == NULL))
 		return -ENODEV;
 
-	size = PAGE_SIZE << gfp_order;
+	/*
+	 * both the virtual address and the physical one, as well as
+	 * the size of the mapping, must be aligned (at least) to the
+	 * size of the smallest page supported by the hardware
+	 */
+	if (!IS_ALIGNED(iova | paddr | size, domain->ops->min_pagesz)) {
+		pr_err("unaligned: iova 0x%lx pa 0x%lx size 0x%x min_pagesz "
+			"0x%x\n", iova, (unsigned long)paddr,
+			size, domain->ops->min_pagesz);
+		return -EINVAL;
+	}
+
+	pr_debug("map: iova 0x%lx pa 0x%lx size 0x%x\n", iova,
+				(unsigned long)paddr, size);
+
+	while (size) {
+		unsigned long pgsize, addr_merge = iova | paddr;
+		unsigned int pgsize_idx;
+
+		/* Max page size that still fits into 'size' */
+		pgsize_idx = __fls(size);
+
+		/* need to consider alignment requirements ? */
+		if
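The diff above is cut off inside the splitting loop. As a rough userspace sketch of the idea described in the commit message (prefer the biggest supported page that fits both the remaining size and the alignment of the two addresses), the following C code may help. Everything here is an assumption made for illustration: `pick_pgsize`, `split_region`, and the bit helpers are hypothetical names, the GCC/Clang builtins `__builtin_clzll` and `__builtin_ctzll` stand in for the kernel's `__fls` and `__ffs`, and this is not the remainder of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the highest set bit (x must be non-zero). */
static unsigned int highest_bit(uint64_t x)
{
	return 63u - (unsigned int)__builtin_clzll(x);
}

/* Hypothetical helper: index of the lowest set bit (x must be non-zero). */
static unsigned int lowest_bit(uint64_t x)
{
	return (unsigned int)__builtin_ctzll(x);
}

/*
 * Pick the largest page size from pgsize_bitmap that fits into 'size'
 * (non-zero) and matches the alignment of both addresses.
 * Returns 0 if no supported page size qualifies.
 */
static uint64_t pick_pgsize(uint64_t pgsize_bitmap, uint64_t iova,
			    uint64_t paddr, uint64_t size)
{
	uint64_t addr_merge = iova | paddr;
	unsigned int idx = highest_bit(size);
	uint64_t mask;

	/* The addresses may force a smaller page than 'size' allows. */
	if (addr_merge && lowest_bit(addr_merge) < idx)
		idx = lowest_bit(addr_merge);

	/* All page sizes up to and including 1 << idx ... */
	mask = (idx >= 63) ? ~0ULL : ((1ULL << (idx + 1)) - 1);
	/* ... restricted to those the hardware advertises. */
	mask &= pgsize_bitmap;

	return mask ? 1ULL << highest_bit(mask) : 0;
}

/*
 * Split [iova, iova + size) into chunks and record each chunk size.
 * Returns the number of chunks, or -1 on misalignment or overflow
 * of the caller-provided array.
 */
static int split_region(uint64_t pgsize_bitmap, uint64_t iova,
			uint64_t paddr, uint64_t size,
			uint64_t *chunks, int max_chunks)
{
	int n = 0;

	if (!pgsize_bitmap)
		return -1;

	/* Smallest supported page: everything must be aligned to it. */
	uint64_t min_pagesz = 1ULL << lowest_bit(pgsize_bitmap);

	if ((iova | paddr | size) & (min_pagesz - 1))
		return -1;

	while (size) {
		uint64_t pgsize = pick_pgsize(pgsize_bitmap, iova, paddr, size);

		if (!pgsize || n == max_chunks)
			return -1;

		chunks[n++] = pgsize;
		iova += pgsize;
		paddr += pgsize;
		size -= pgsize;
	}
	return n;
}
```

With a hypothetical bitmap advertising 4 KiB, 64 KiB, 1 MiB and 16 MiB pages (0x1111000), a 0x21000-byte region at iova 0x10000 and paddr 0x20000 is split into a 64 KiB, a 64 KiB and a 4 KiB chunk, while a size of 0x800 is rejected as unaligned.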
Re: kernel BUG at include/linux/kvm_host.h:603!
Hi Jörg, On 07.10.2011, at 23:10, Jörg Sommer wrote: Hi, I've got this backtrace: [130902.709711] [ cut here ] [130902.709747] kernel BUG at include/linux/kvm_host.h:603! Ouch. This means that preemption is broken in KVM for PPC. To quickly get things working on your side, please recompile your kernel with CONFIG_PREEMPT_NONE. I'll take a look at fixing it for real ASAP. Alex -- To unsubscribe from this list: send the line unsubscribe kvm-ppc in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html