Re: [PATCH] KVM: PPC: Book3S HV: return htab entries in big endian

2014-10-03 Thread Paul Mackerras
On Thu, Oct 02, 2014 at 07:06:40PM +0200, Alexander Graf wrote: I think we're best off to keep the user space API native endian. So really we should only ever have to convert from big to native endian on read and native to big on write. With that QEMU should do the right thing already, no?
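The conversion rule proposed in this message (big endian to native on read, native to big endian on write) can be sketched roughly as follows. This is an illustration only, not kernel or QEMU code: the helper names `htab_read` and `htab_write` are hypothetical, and the sketch assumes a 64-bit hash table word that is stored big endian in memory.

```python
def htab_read(raw: bytes) -> int:
    """Convert a big-endian 64-bit word from the table into a native integer."""
    if len(raw) != 8:
        raise ValueError("expected an 8-byte word")
    return int.from_bytes(raw, "big")


def htab_write(value: int) -> bytes:
    """Convert a native integer into the big-endian form stored in the table."""
    return value.to_bytes(8, "big")


word = 0x0123456789ABCDEF
stored = htab_write(word)

# The stored layout does not depend on the byte order of the host.
assert stored == bytes.fromhex("0123456789abcdef")

# A read followed by a write round-trips the value.
assert htab_read(stored) == word
```

With both conversions confined to these two helpers, the caller only ever handles native integers, which is the property the message argues for.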

Re: [PATCH v2] KVM: x86: some apic broadcast modes does not work

2014-10-03 Thread Radim Krčmář
2014-10-03 00:30+0300, Nadav Amit: KVM does not deliver x2APIC broadcast messages with physical mode. Intel SDM (10.12.9 ICR Operation in x2APIC Mode) states: A destination ID value of FFFF_FFFFH is used for broadcast of interrupts in both logical destination and physical destination modes.
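The SDM rule cited in this message gives FFFF_FFFFH as the x2APIC broadcast destination in both destination modes. A minimal sketch of such a check, assuming a hypothetical helper `is_broadcast` and assuming the 8-bit value 0xFF as the broadcast ID in legacy xAPIC mode (this is not the code from the patch):

```python
X2APIC_BROADCAST = 0xFFFFFFFF  # 32-bit destination ID, per the SDM text cited above
XAPIC_BROADCAST = 0xFF         # assumption: 8-bit destination field in xAPIC mode


def is_broadcast(dest: int, x2apic: bool) -> bool:
    """Return True if dest addresses every APIC.

    The destination mode (logical or physical) is deliberately not a
    parameter: in x2APIC mode the broadcast ID is the same in both.
    """
    broadcast_id = X2APIC_BROADCAST if x2apic else XAPIC_BROADCAST
    return dest == broadcast_id


assert is_broadcast(0xFFFFFFFF, x2apic=True)
assert not is_broadcast(0xFF, x2apic=True)
assert is_broadcast(0xFF, x2apic=False)
```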

Re: [PATCH 3/6] KVM: x86: NoBigReal was mistakenly considering la instead of ea

2014-10-03 Thread Radim Krčmář
2014-10-02 17:52+0300, Nadav Amit: 2014-09-30 20:49+0300, Nadav Amit: NoBigReal emulation should consider the effective address is between 0 and 0xffff instead of checking the logical address. [...] Please don’t apply this patch (only this one - 3/6). I observe strange behaviour which

Re: [PATCH v5 3/4] kvmtool: Handle exit reason KVM_EXIT_SYSTEM_EVENT

2014-10-03 Thread Will Deacon
On Wed, Oct 01, 2014 at 11:34:54AM +0100, Anup Patel wrote: The KVM_EXIT_SYSTEM_EVENT exit reason was added to define architecture independent system-wide events for a Guest. Currently, it is used by in-kernel PSCI-0.2 emulation of KVM ARM/ARM64 to inform user space about PSCI SYSTEM_OFF or

Re: [PATCH v5 0/4] kvmtool: ARM/ARM64: Misc updates

2014-10-03 Thread Will Deacon
On Wed, Oct 01, 2014 at 11:34:51AM +0100, Anup Patel wrote: This patchset updates KVMTOOL to use some of the features supported by Linux-3.16 KVM ARM/ARM64, such as: 1. Target CPU == Host using KVM_ARM_PREFERRED_TARGET vm ioctl 2. Target CPU type Potenza for using KVMTOOL on X-Gene 3. PSCI

[PATCH 11/17] mm: swp_entry_swapcount

2014-10-03 Thread Andrea Arcangeli
Provide a new swapfile method for remap_anon_pages to verify the swap entry is mapped only in one vma before relocating the swap entry in a different virtual address. Otherwise if the swap entry is mapped in multiple vmas, when the page is swapped back in, it could get mapped in a non linear way

[PATCH 16/17] powerpc: add remap_anon_pages and userfaultfd

2014-10-03 Thread Andrea Arcangeli
Add the syscall numbers. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/powerpc/include/asm/systbl.h | 2 ++ arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 2 ++ 3 files changed, 5 insertions(+), 1 deletion(-) diff --git

[PATCH 15/17] userfaultfd: make userfaultfd_write non blocking

2014-10-03 Thread Andrea Arcangeli
It is generally inefficient to ask the wakeup of userfault ranges where there's not a single userfault address read through userfaultfd_read earlier and in turn waiting a wakeup. However it may come handy to wakeup the same userfault range twice in case of multiple thread faulting on the same

[PATCH 13/17] waitqueue: add nr wake parameter to __wake_up_locked_key

2014-10-03 Thread Andrea Arcangeli
Userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 ---
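The meaning of the nr parameter described here (0 wakes every waiter, 1 wakes only the first) can be modelled with a countdown that never reaches zero when it starts at zero. The sketch below is a simplified model, not the kernel implementation: `wake_up_locked` is a hypothetical name, and every waiter is assumed to count against nr.

```python
def wake_up_locked(waiters, nr):
    """Return the waiters that would be woken, in queue order.

    nr == 0 wakes all waiters; nr == n > 0 wakes at most the first n.
    """
    woken = []
    for waiter in waiters:
        woken.append(waiter)
        nr -= 1
        # Starting from 0 the counter goes negative and never hits 0,
        # so the loop runs to the end of the queue.
        if nr == 0:
            break
    return woken


queue = ["t1", "t2", "t3"]
assert wake_up_locked(queue, 1) == ["t1"]
assert wake_up_locked(queue, 0) == ["t1", "t2", "t3"]
```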

[PATCH 03/17] mm: gup: use get_user_pages_unlocked within get_user_pages_fast

2014-10-03 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/mips/mm/gup.c | 8 +++- arch/powerpc/mm/gup.c| 6 ++ arch/s390/kvm/kvm-s390.c | 4 +--- arch/s390/mm/gup.c | 6 ++ arch/sh/mm/gup.c | 6 ++ arch/sparc/mm/gup.c | 6 ++ arch/x86/mm/gup.c

[PATCH 14/17] userfaultfd: add new syscall to provide memory externalization

2014-10-03 Thread Andrea Arcangeli
Once an userfaultfd is created MADV_USERFAULT regions talks through the userfaultfd protocol with the thread responsible for doing the memory externalization of the process. The protocol starts by userland writing the requested/preferred USERFAULT_PROTOCOL version into the userfault fd (64bit

[PATCH 01/17] mm: gup: add FOLL_TRIED

2014-10-03 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla andre...@google.com Reviewed-by: Radim Krčmář rkrc...@redhat.com Signed-off-by: Andres Lagar-Cavilla andre...@google.com Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 1 + mm/gup.c | 4 2 files changed, 5 insertions(+)

[PATCH 06/17] kvm: Faults which trigger IO release the mmap_sem

2014-10-03 Thread Andrea Arcangeli
From: Andres Lagar-Cavilla andre...@google.com When KVM handles a tdp fault it uses FOLL_NOWAIT. If the guest memory has been swapped out or is behind a filemap, this will trigger async readahead and return immediately. The rationale is that KVM will kick back the guest with an async page fault

[PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-03 Thread Andrea Arcangeli
This teaches gup_fast and __gup_fast to re-enable irqs and cond_resched() if possible every BATCH_PAGES. This must be implemented by other archs as well and it's a requirement before converting more get_user_pages() to get_user_pages_fast() as an optimization (instead of using

[PATCH 00/17] RFC: userfault v2

2014-10-03 Thread Andrea Arcangeli
Hello everyone, There's a large To/Cc list for this RFC because this adds two new syscalls (userfaultfd and remap_anon_pages) and MADV_USERFAULT/MADV_NOUSERFAULT, so suggestions on changes are welcome sooner than later. The major change compared to the previous RFC I sent a few months ago is

[PATCH 09/17] mm: PT lock: export double_pt_lock/unlock

2014-10-03 Thread Andrea Arcangeli
Those two helpers are needed by remap_anon_pages. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 4 mm/fremap.c| 29 + 2 files changed, 33 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index

[PATCH 07/17] mm: madvise MADV_USERFAULT: prepare vm_flags to allow more than 32bits

2014-10-03 Thread Andrea Arcangeli
We run out of 32bits in vm_flags, noop change for 64bit archs. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/proc/task_mmu.c | 4 ++-- include/linux/huge_mm.h | 4 ++-- include/linux/ksm.h | 4 ++-- include/linux/mm_types.h | 2 +- mm/huge_memory.c | 2 +-

Query with respect to VCPU scheduling

2014-10-03 Thread Mohan Kumar
Hello all, I am new to KVM and look at a particular case in KVM. I know that KVM uses CFS. The question I have is as follows: - I have two VMs with 2 VCPU each. And the actual CPUs are also 2 and both the VMs use the 2 CPUs. I want one VM to use only 25 % of the actual CPU and the remaining

Re: [Qemu-devel] QEMU with KVM does not start Win8 on kernel 3.4.67 and core2duo

2014-10-03 Thread Erik Rull
Erik Rull wrote: On September 12, 2014 at 7:29 PM Jan Kiszka jan.kis...@siemens.com wrote: On 2014-09-12 19:15, Jan Kiszka wrote: On 2014-09-12 14:29, Erik Rull wrote: On September 11, 2014 at 3:32 PM Jan Kiszka jan.kis...@siemens.com wrote: On 2014-09-11 15:25, Erik Rull wrote: On August

[PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
remap_anon_pages (unlike remap_file_pages) tries to be non intrusive in the rmap code. As far as the rmap code is concerned, rmap_anon_pages only alters the page-mapping and page-index. It does it while holding the page lock. However there are a few places that in presence of anon pages are

[PATCH 05/17] mm: gup: use get_user_pages_fast and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
Just an optimization. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- drivers/dma/iovlock.c | 10 ++ drivers/iommu/amd_iommu_v2.c | 6 ++ drivers/media/pci/ivtv/ivtv-udma.c | 6 ++ drivers/scsi/st.c | 10 ++

[PATCH 02/17] mm: gup: add get_user_pages_locked and get_user_pages_unlocked

2014-10-03 Thread Andrea Arcangeli
We can leverage the VM_FAULT_RETRY functionality in the page fault paths better by using either get_user_pages_locked or get_user_pages_unlocked. The former allow conversion of get_user_pages invocations that will have to pass a locked parameter to know if the mmap_sem was dropped during the

[PATCH 17/17] userfaultfd: implement USERFAULTFD_RANGE_REGISTER|UNREGISTER

2014-10-03 Thread Andrea Arcangeli
This adds two protocol commands to the userfaultfd protocol. To register memory regions into userfaultfd you can write 16 bytes as: [ start|0x1, end ] to unregister write: [ start|0x2, end ] End is start+len (not start+len-1). Same as vma->vm_end. This also enforces the
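The 16-byte command layout described in this message can be sketched as two native-endian 64-bit words, with the command tag carried in the low bits of the start address. This is an illustration under stated assumptions, not the patch's code: the names `range_command`, `UFFD_REGISTER` and `UFFD_UNREGISTER` are hypothetical, and the 4096-byte page size and the alignment check are assumptions.

```python
import struct

UFFD_REGISTER = 0x1    # tag value from the message: start|0x1
UFFD_UNREGISTER = 0x2  # tag value from the message: start|0x2
PAGE_SIZE = 4096       # assumption: page size of the host


def range_command(start: int, length: int, tag: int) -> bytes:
    """Build the 16-byte [start|tag, end] command for a memory range."""
    if length <= 0 or start % PAGE_SIZE or length % PAGE_SIZE:
        raise ValueError("range must be non-empty and page aligned")
    end = start + length  # exclusive end, not start + length - 1
    # "=QQ": two unsigned 64-bit words in native byte order, no padding.
    return struct.pack("=QQ", start | tag, end)


cmd = range_command(0x10000, 2 * PAGE_SIZE, UFFD_REGISTER)
assert len(cmd) == 16
assert struct.unpack("=QQ", cmd) == (0x10001, 0x12000)
```

Page alignment is what leaves the low bits of the start address free to carry the tag.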

[PATCH 12/17] mm: sys_remap_anon_pages

2014-10-03 Thread Andrea Arcangeli
This new syscall will move anon pages across vmas, atomically and without touching the vmas. It only works on non shared anonymous pages because those can be relocated without generating non linear anon_vmas in the rmap code. It is the ideal mechanism to handle userspace page faults. Normally

Re: [PATCH 01/17] mm: gup: add FOLL_TRIED

2014-10-03 Thread Linus Torvalds
This needs more explanation than that one-liner comment. Make the commit message explain why the new FOLL_TRIED flag exists. Linus On Fri, Oct 3, 2014 at 10:07 AM, Andrea Arcangeli aarca...@redhat.com wrote: From: Andres Lagar-Cavilla andre...@google.com Reviewed-by: Radim Krčmář

[PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-03 Thread Andrea Arcangeli
MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if userland touches a still unmapped virtual address, a sigbus signal is sent instead of allocating a new page. The sigbus signal handler will then resolve the page

Re: [PATCH 04/17] mm: gup: make get_user_pages_fast and __get_user_pages_fast latency conscious

2014-10-03 Thread Linus Torvalds
On Fri, Oct 3, 2014 at 10:07 AM, Andrea Arcangeli aarca...@redhat.com wrote: This teaches gup_fast and __gup_fast to re-enable irqs and cond_resched() if possible every BATCH_PAGES. This is disgusting. Many (most?) __gup_fast() users just want a single page, and the stupid overhead of the

Re: [PATCH 10/17] mm: rmap preparation for remap_anon_pages

2014-10-03 Thread Linus Torvalds
On Fri, Oct 3, 2014 at 10:08 AM, Andrea Arcangeli aarca...@redhat.com wrote: Overall this looks a fairly small change to the rmap code, notably less intrusive than the nonlinear vmas created by remap_file_pages. Considering that remap_file_pages() was an unmitigated disaster, and -mm has a

[PATCH v3 3/6] target-i386: Disable CPUID_ACPI by default on KVM mode

2014-10-03 Thread Eduardo Habkost
KVM never supported the CPUID_ACPI flag, so it doesn't make sense to have it enabled by default when KVM is enabled. The motivation here is exactly the same we had for the MONITOR flag (disabled by commit 136a7e9a85d7047461f8153f7d12c514a3d68f69). And like on the MONITOR flag case, we don't need

[PATCH v3 0/6] target-i386: Make most CPU models work with enforce out of the box

2014-10-03 Thread Eduardo Habkost
Changes v2 -> v3: * None. This is just a rebase against latest qemu.git master (commit b00a0dd) Changes v1 -> v2: * Commit message and comment changes. * Update compat code to change pc-*-2.1, not pc-*-2.0. * Added patch to disable SVM by default in KVM mode. Most of the bits that make enforce

[PATCH v3 1/6] pc: Create pc_compat_2_1() functions

2014-10-03 Thread Eduardo Habkost
We will need new compat code for the 2.1 machine-types. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- hw/i386/pc_piix.c | 13 - hw/i386/pc_q35.c | 13 - 2 files changed, 24 insertions(+), 2 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index

[PATCH v3 2/6] target-i386: Rename KVM auto-feature-enable compat function

2014-10-03 Thread Eduardo Habkost
The x86_cpu_compat_disable_kvm_features() name was a bit confusing, as it won't forcibly disable the feature for all CPU models (i.e. add it to kvm_default_unset_features), but it will instead turn off the KVM auto-enabling of the feature (i.e. remove it from kvm_default_features), meaning the

[PATCH v3 4/6] target-i386: Remove unsupported bits from all CPU models

2014-10-03 Thread Eduardo Habkost
The following CPU features were never supported by neither TCG or KVM, so they are useless on the CPU model definitions, today: * CPUID_DTS (DS) * CPUID_HT * CPUID_TM * CPUID_PBE * CPUID_EXT_DTES64 * CPUID_EXT_DSCPL * CPUID_EXT_EST * CPUID_EXT_TM2 * CPUID_EXT_XTPR * CPUID_EXT_PDCM *

[PATCH v3 5/6] target-i386: Don't enable nested VMX by default

2014-10-03 Thread Eduardo Habkost
TCG doesn't support VMX, and nested VMX is not enabled by default on the KVM kernel module. So, there's no reason to have VMX enabled by default on the core2duo and coreduo CPU models, today. Even the newer Intel CPU model definitions don't have it enabled. In this case, we need machine-type

[PATCH v3 6/6] target-i386: Disable SVM by default in KVM mode

2014-10-03 Thread Eduardo Habkost
Make SVM be disabled by default on all CPU models when in KVM mode. Nested SVM is enabled by default in the KVM kernel module, but it is probably less stable than nested VMX (which is already disabled by default). Add a new compat function, x86_cpu_compat_kvm_no_autodisable(), to keep

Re: [PATCH 01/17] mm: gup: add FOLL_TRIED

2014-10-03 Thread Paolo Bonzini
This needs more explanation than that one-liner comment. Make the commit message explain why the new FOLL_TRIED flag exists. This patch actually is extracted from a 3.18 commit in the KVM tree, https://git.kernel.org/cgit/virt/kvm/kvm.git/commit/?h=next&id=234b239b. Here is how that patch uses

Re: [PATCH] KVM: PPC: Book3S HV: return htab entries in big endian

2014-10-03 Thread Alexander Graf
On 03.10.2014 at 14:05, Paul Mackerras pau...@samba.org wrote: On Thu, Oct 02, 2014 at 07:06:40PM +0200, Alexander Graf wrote: I think we're best off to keep the user space API native endian. So really we should only ever have to convert from big to native endian on read and native to

Re: [PATCH v3 0/6] target-i386: Make most CPU models work with enforce out of the box

2014-10-03 Thread Paolo Bonzini
On 03/10/2014 21:39, Eduardo Habkost wrote: Changes v2 -> v3: * None. This is just a rebase against latest qemu.git master (commit b00a0dd) Changes v1 -> v2: * Commit message and comment changes. * Update compat code to change pc-*-2.1, not pc-*-2.0. * Added patch to disable SVM by

Re: [PATCH 08/17] mm: madvise MADV_USERFAULT

2014-10-03 Thread Mike Hommey
On Fri, Oct 03, 2014 at 07:07:58PM +0200, Andrea Arcangeli wrote: MADV_USERFAULT is a new madvise flag that will set VM_USERFAULT in the vma flags. Whenever VM_USERFAULT is set in an anonymous vma, if userland touches a still unmapped virtual address, a sigbus signal is sent instead of

Possible to backport this vhost-net fix to 3.10?

2014-10-03 Thread Eddie Chapman
Hi, I've been regularly seeing on the 3.10 stable kernels the same problem as reported by Romain Francoise here: https://lkml.org/lkml/2013/1/23/492 An example from my setup is at the bottom of this mail. It's a problem as qemu fails to run when it hits this, only solution is to do all qemu

Re: [PATCH] KVM: PPC: Book3S HV: return htab entries in big endian

2014-10-03 Thread Alexey Kardashevskiy
On 10/04/2014 07:05 AM, Alexander Graf wrote: On 03.10.2014 at 14:05, Paul Mackerras pau...@samba.org wrote: On Thu, Oct 02, 2014 at 07:06:40PM +0200, Alexander Graf wrote: I think we're best off to keep the user space API native endian.
