Re: [PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-15 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:19:07AM -1000, Linus Torvalds wrote: What if the process doing the polling never does anything with the end result? Maybe it meant to, but it got killed before it could? Are you going to leave everybody else blocked, even though there are pending events? Yes, it

Re: linux-next: build warning after merge of the akpm-current tree

2015-06-04 Thread Andrea Arcangeli
ithout > #include > > Introduced by commit 2873f48b446c ("userfaultfd: uAPI"). Here's the fix: === >From 02b31b0a5e9dd5ddbcb4ad86f63fbcb0a2b5d8f2 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Thu, 4 Jun 2015 14:54:40 +0200 Subject: [PATCH] userfaultfd: uAPI: add

Re: linux-next: build warning after merge of the akpm-current tree

2015-06-04 Thread Andrea Arcangeli
/types.h Introduced by commit 2873f48b446c (userfaultfd: uAPI). Here's the fix: === From 02b31b0a5e9dd5ddbcb4ad86f63fbcb0a2b5d8f2 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli aarca...@redhat.com Date: Thu, 4 Jun 2015 14:54:40 +0200 Subject: [PATCH] userfaultfd: uAPI: add missing include/types.h

Re: [BUG] Read-Only THP causes stalls (commit 10359213d)

2015-05-26 Thread Andrea Arcangeli
On Tue, May 26, 2015 at 04:35:47PM +0200, Christoffer Dall wrote: > Any chance you could send me the memhog tool? memhog is just the first that comes to mind because I got it preinstalled everywhere (I only miss it on cyanogenmod as there's no numactl there... yet). Anything else would do as

Re: [BUG] Read-Only THP causes stalls (commit 10359213d)

2015-05-26 Thread Andrea Arcangeli
On Tue, May 26, 2015 at 10:08:48AM +0200, Christoffer Dall wrote: > > echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan > > this returns -EINVAL. > Oops sorry, I haven't re-read the code, pages_to_scan 0 does not make sense, it would only be useful for debugging purposes
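The -EINVAL above comes from input validation on the sysfs write. A minimal user-space sketch of that kind of check follows; the function name parse_pages_to_scan is hypothetical and the real handler lives in the kernel's THP code, which is not shown in this thread, so treat this as an illustration of the rule "0 is rejected" rather than the actual implementation.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical model of a sysfs store handler for pages_to_scan.
 * Accepts a decimal string (optionally newline-terminated) and
 * rejects 0, non-numeric input and values that do not fit in an
 * unsigned int, returning -EINVAL in each of those cases.
 */
static int parse_pages_to_scan(const char *buf, unsigned int *out)
{
    char *end;
    unsigned long pages;

    errno = 0;
    pages = strtoul(buf, &end, 10);
    if (errno || end == buf)
        return -EINVAL;               /* overflow or no digits at all */
    if (*end != '\0' && *end != '\n')
        return -EINVAL;               /* trailing garbage */
    if (pages == 0 || pages > UINT_MAX)
        return -EINVAL;               /* scanning 0 pages makes no sense */

    *out = (unsigned int)pages;
    return 0;
}
```

With this model, writing "0" fails the same way the echo in the quoted message did, while any positive value is accepted.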

Re: [BUG] Read-Only THP causes stalls (commit 10359213d)

2015-05-26 Thread Andrea Arcangeli
On Tue, May 26, 2015 at 04:35:47PM +0200, Christoffer Dall wrote: Any chance you could send me the memhog tool? memhog is just the first that comes to mind because I got it preinstalled everywhere (I only miss it on cyanogenmod as there's no numactl there... yet). Anything else would do as well,

Re: [BUG] Read-Only THP causes stalls (commit 10359213d)

2015-05-26 Thread Andrea Arcangeli
On Tue, May 26, 2015 at 10:08:48AM +0200, Christoffer Dall wrote: echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan this returns -EINVAL. Oops sorry, I haven't re-read the code, pages_to_scan 0 does not make sense, it would only be useful for debugging purposes because

Re: [BUG] Read-Only THP causes stalls (commit 10359213d)

2015-05-25 Thread Andrea Arcangeli
Hello Christoffer, On Sun, May 24, 2015 at 09:34:04PM +0200, Christoffer Dall wrote: > Hi all, > > I noticed a regression on my arm64 APM X-Gene system a couple > of weeks back. I would occasionally see the system lock up and see RCU > stalls during the caching phase of kernbench. I then

Re: linux-next: build failure after merge of the akpm-current tree

2015-05-25 Thread Andrea Arcangeli
Hello, On Mon, May 25, 2015 at 09:18:01PM +1000, Stephen Rothwell wrote: > Hi Andrew, > > After merging the akpm-current tree, today's linux-next build (powerpc > ppc64_defconfig) failed like this: > > __NR_syscalls (364) is not one more than the last syscall (364) > > Caused by commit
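The build failure quoted above states an invariant: the syscall count must be exactly one more than the highest syscall number. A tiny sketch of that check follows; the function name is hypothetical, and the real check is performed by the powerpc build machinery, which is not shown in this message.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the consistency check behind the error
 * "__NR_syscalls (N) is not one more than the last syscall (M)".
 * Syscall numbers start at 0, so a table whose last entry is M
 * must declare M + 1 syscalls.
 */
static bool nr_syscalls_consistent(int nr_syscalls, int last_syscall)
{
    return nr_syscalls == last_syscall + 1;
}
```

With the values from the report (364 and 364) the check fails; bumping the count to 365 would satisfy it.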

Re: linux-next: build failure after merge of the akpm-current tree

2015-05-25 Thread Andrea Arcangeli
Hello, On Mon, May 25, 2015 at 09:18:01PM +1000, Stephen Rothwell wrote: Hi Andrew, After merging the akpm-current tree, today's linux-next build (powerpc ppc64_defconfig) failed like this: __NR_syscalls (364) is not one more than the last syscall (364) Caused by commit d7766613717b

Re: [BUG] Read-Only THP causes stalls (commit 10359213d)

2015-05-25 Thread Andrea Arcangeli
Hello Christoffer, On Sun, May 24, 2015 at 09:34:04PM +0200, Christoffer Dall wrote: Hi all, I noticed a regression on my arm64 APM X-Gene system a couple of weeks back. I would occasionally see the system lock up and see RCU stalls during the caching phase of kernbench. I then wrote a

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrea Arcangeli
ms like this. === >From 2f0a48670dc515932dec8b983871ec35caeba553 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Sat, 23 May 2015 02:26:32 +0200 Subject: [PATCH] userfaultfd: update the uffd_msg structure to be the same on 32/64bit Avoiding to using packed allowed the code to be nicer and it avoided the rese

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrea Arcangeli
On Fri, May 22, 2015 at 01:18:22PM -0700, Andrew Morton wrote: > On Thu, 14 May 2015 19:31:19 +0200 Andrea Arcangeli > wrote: > > > If the rwsem starves writers it wasn't strictly a bug but lockdep > > doesn't like it and this avoids depending on lowlevel implementation

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrea Arcangeli
On Fri, May 22, 2015 at 01:18:22PM -0700, Andrew Morton wrote: On Thu, 14 May 2015 19:31:19 +0200 Andrea Arcangeli aarca...@redhat.com wrote: If the rwsem starves writers it wasn't strictly a bug but lockdep doesn't like it and this avoids depending on lowlevel implementation details

Re: [PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-22 Thread Andrea Arcangeli
2f0a48670dc515932dec8b983871ec35caeba553 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli aarca...@redhat.com Date: Sat, 23 May 2015 02:26:32 +0200 Subject: [PATCH] userfaultfd: update the uffd_msg structure to be the same on 32/64bit Avoiding to using packed allowed the code to be nicer and it avoided the reserved1/2/3

Re: [PATCH 00/23] userfaultfd v4

2015-05-21 Thread Andrea Arcangeli
Hi Kirill, On Thu, May 21, 2015 at 04:11:11PM +0300, Kirill Smelkov wrote: > Sorry for maybe speaking up too late, but here is additional real Not too late, in fact I don't think there's any change required for this at this stage, but it'd be great if you could help me to review. > Since arrays

Re: [PATCH 00/23] userfaultfd v4

2015-05-21 Thread Andrea Arcangeli
Hi Kirill, On Thu, May 21, 2015 at 04:11:11PM +0300, Kirill Smelkov wrote: Sorry for maybe speaking up too late, but here is additional real Not too late, in fact I don't think there's any change required for this at this stage, but it'd be great if you could help me to review. Since arrays

Re: [PATCH 00/23] userfaultfd v4

2015-05-20 Thread Andrea Arcangeli
Hello Richard, On Tue, May 19, 2015 at 11:59:42PM +0200, Richard Weinberger wrote: > On Tue, May 19, 2015 at 11:38 PM, Andrew Morton > wrote: > > On Thu, 14 May 2015 19:30:57 +0200 Andrea Arcangeli > > wrote: > > > >> This is the latest userfaultfd patchset ag

Re: [PATCH 00/23] userfaultfd v4

2015-05-20 Thread Andrea Arcangeli
Hi Andrew, On Tue, May 19, 2015 at 02:38:01PM -0700, Andrew Morton wrote: > On Thu, 14 May 2015 19:30:57 +0200 Andrea Arcangeli > wrote: > > > This is the latest userfaultfd patchset against mm-v4.1-rc3 > > 2015-05-14-10:04. > > It would be useful to have some userf

Re: [PATCH 00/23] userfaultfd v4

2015-05-20 Thread Andrea Arcangeli
Hi Andrew, On Tue, May 19, 2015 at 02:38:01PM -0700, Andrew Morton wrote: On Thu, 14 May 2015 19:30:57 +0200 Andrea Arcangeli aarca...@redhat.com wrote: This is the latest userfaultfd patchset against mm-v4.1-rc3 2015-05-14-10:04. It would be useful to have some userfaultfd testcases

Re: [PATCH 00/23] userfaultfd v4

2015-05-20 Thread Andrea Arcangeli
Hello Richard, On Tue, May 19, 2015 at 11:59:42PM +0200, Richard Weinberger wrote: On Tue, May 19, 2015 at 11:38 PM, Andrew Morton a...@linux-foundation.org wrote: On Thu, 14 May 2015 19:30:57 +0200 Andrea Arcangeli aarca...@redhat.com wrote: This is the latest userfaultfd patchset

Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-05-15 Thread Andrea Arcangeli
On Thu, May 14, 2015 at 10:49:06AM -0700, Linus Torvalds wrote: > On Thu, May 14, 2015 at 10:31 AM, Andrea Arcangeli > wrote: > > +static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx, > > + struct userfaultfd_wake

Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-05-15 Thread Andrea Arcangeli
On Thu, May 14, 2015 at 10:49:06AM -0700, Linus Torvalds wrote: On Thu, May 14, 2015 at 10:31 AM, Andrea Arcangeli aarca...@redhat.com wrote: +static __always_inline void wake_userfault(struct userfaultfd_ctx *ctx, + struct userfaultfd_wake_range

[PATCH 18/23] userfaultfd: buildsystem activation

2015-05-14 Thread Andrea Arcangeli
This allows to select the userfaultfd during configuration to build it. Signed-off-by: Andrea Arcangeli --- fs/Makefile | 1 + init/Kconfig | 11 +++ 2 files changed, 12 insertions(+) diff --git a/fs/Makefile b/fs/Makefile index cb92fd4..53e59b2 100644 --- a/fs/Makefile +++ b/fs

[PATCH 20/23] userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI

2015-05-14 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 42 +++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux

[PATCH 19/23] userfaultfd: activate syscall

2015-05-14 Thread Andrea Arcangeli
This activates the userfaultfd syscall. Signed-off-by: Andrea Arcangeli --- arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/uapi/asm/unistd.h | 1 + arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + include/linux/syscalls.h

[PATCH 15/23] userfaultfd: optimize read() and poll() to be O(1)

2015-05-14 Thread Andrea Arcangeli
This makes read O(1) and poll that was already O(1) becomes lockless. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 172 +++ 1 file changed, 98 insertions(+), 74 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index

[PATCH 17/23] userfaultfd: solve the race between UFFDIO_COPY|ZEROPAGE and read

2015-05-14 Thread Andrea Arcangeli
erfault thread This patch removes the need of both UFFDIO_WAKE and of the associated per-page tristate as well. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 81 +--- 1 file changed, 66 insertions(+), 15 deletions(-) diff --git a/fs/use

[PATCH 04/23] userfaultfd: linux/userfaultfd_k.h

2015-05-14 Thread Andrea Arcangeli
Kernel header defining the methods needed by the VM common code to interact with the userfaultfd. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 79 +++ 1 file changed, 79 insertions(+) create mode 100644 include/linux

[PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-14 Thread Andrea Arcangeli
If the rwsem starves writers it wasn't strictly a bug but lockdep doesn't like it and this avoids depending on lowlevel implementation details of the lock. Signed-off-by: Andrea Arcangeli --- mm/userfaultfd.c | 92 1 file changed, 66

[PATCH 12/23] userfaultfd: Rename uffd_api.bits into .features fixup

2015-05-14 Thread Andrea Arcangeli
Update comment. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 5e1c2f7..03f21cb 100644 --- a/include/uapi/linux/userfaultfd.h +++ b

[PATCH 11/23] userfaultfd: Rename uffd_api.bits into .features

2015-05-14 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 4 ++-- include/uapi/linux/userfaultfd.h | 10 -- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 1c9be61..9085365 100644 --- a/fs/userfaultfd.c +++ b/fs

[PATCH 00/23] userfaultfd v4

2015-05-14 Thread Andrea Arcangeli
ow QEMU/KVM uses userfaultfd to implement postcopy live migration. http://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/commit/?h=userfault&id=016f9523b7b2238851533736e84452cb00b2ddcd Andrea Arcangeli (22): userfaultfd: linux/Documentation/vm/userfaultfd.txt userfaultfd: waitqueue

[PATCH 09/23] userfaultfd: prevent khugepaged to merge if userfaultfd is armed

2015-05-14 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c221be3..9671f51 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2198,7 +2198,8 @@ static int __collapse_huge_page_isol

[PATCH 14/23] userfaultfd: wake pending userfaults

2015-05-14 Thread Andrea Arcangeli
significant in case of repeated faults on the same address from multiple threads. This optimization is justified by the measurement that the number of spurious UFFDIO_WAKE accounts for 5% and 10% of the total userfaults for heavy workloads, so it's worth optimizing those away. Signed-off-

[PATCH 07/23] userfaultfd: call handle_userfault() for userfaultfd_missing() faults

2015-05-14 Thread Andrea Arcangeli
sed as parameter so the "read|write" kind of fault can be passed to userland. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 69 ++-- mm/memory.c | 16 + 2 files changed, 63 insertions(+), 22 deletions(-) di

[PATCH 13/23] userfaultfd: change the read API to return a uffd_msg

2015-05-14 Thread Andrea Arcangeli
of new events that can be extended or of new future bits for already shipped events, is limited to 64 by the features field of the uffdio_api structure. If more will be needed a bump of UFFD_API will be required. Signed-off-by: Andrea Arcangeli --- Documentation/vm/userfaultfd.txt | 12 +++--- fs

[PATCH 01/23] userfaultfd: linux/Documentation/vm/userfaultfd.txt

2015-05-14 Thread Andrea Arcangeli
Add documentation. Signed-off-by: Andrea Arcangeli --- Documentation/vm/userfaultfd.txt | 140 +++ 1 file changed, 140 insertions(+) create mode 100644 Documentation/vm/userfaultfd.txt diff --git a/Documentation/vm/userfaultfd.txt b/Documentation/vm

[PATCH 23/23] userfaultfd: UFFDIO_COPY and UFFDIO_ZEROPAGE

2015-05-14 Thread Andrea Arcangeli
These two ioctl allows to either atomically copy or to map zeropages into the virtual address space. This is used by the thread that opened the userfaultfd to resolve the userfaults. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 96

[PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-05-14 Thread Andrea Arcangeli
to know when there are new pending userfaults to be read (POLLIN). Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 1008 ++ 1 file changed, 1008 insertions(+) create mode 100644 fs/userfaultfd.c diff --git a/fs/userfaultfd.c b/fs

[PATCH 08/23] userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx

2015-05-14 Thread Andrea Arcangeli
vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge must be aware about so that we can merge vmas back like they were originally before arming the userfaultfd on some memory range. Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 2 +- mm/madvise.c | 3 ++-

[PATCH 16/23] userfaultfd: allocate the userfaultfd_ctx cacheline aligned

2015-05-14 Thread Andrea Arcangeli
Use proper slab to guarantee alignment. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 39 +++ 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3d26f41..5542fe7 100644 --- a/fs/userfaultfd.c

[PATCH 06/23] userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP

2015-05-14 Thread Andrea Arcangeli
These two flags gets set in vma->vm_flags to tell the VM common code if the userfaultfd is armed and in which mode (only tracking missing faults, only tracking wrprotect faults or both). If neither flags is set it means the userfaultfd is not armed on the vma. Signed-off-by: Andrea Arcang
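The three states described above (missing only, wrprotect only, both) plus the unarmed case can be sketched with two flag bits and a few predicates. The bit values, the struct and the helper names below are hypothetical stand-ins chosen for illustration; the kernel's actual definitions are in the patch itself, which is truncated here.

```c
#include <stdbool.h>

/* Hypothetical bit values; the real ones are defined by the patch. */
#define VM_UFFD_MISSING 0x1UL   /* track missing-page faults */
#define VM_UFFD_WP      0x2UL   /* track write-protect faults */

/* Minimal stand-in for the vm_flags field of a vma. */
struct vma_model {
    unsigned long vm_flags;
};

static bool uffd_missing(const struct vma_model *vma)
{
    return (vma->vm_flags & VM_UFFD_MISSING) != 0;
}

static bool uffd_wp(const struct vma_model *vma)
{
    return (vma->vm_flags & VM_UFFD_WP) != 0;
}

/* Armed means at least one tracking mode is set; neither flag means unarmed. */
static bool uffd_armed(const struct vma_model *vma)
{
    return (vma->vm_flags & (VM_UFFD_MISSING | VM_UFFD_WP)) != 0;
}
```

Keeping the mode in vm_flags lets common VM code test it with a single mask, without looking at the userfaultfd context.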

[PATCH 21/23] userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation

2015-05-14 Thread Andrea Arcangeli
This implements mcopy_atomic and mfill_zeropage that are the lowlevel VM methods that are invoked respectively by the UFFDIO_COPY and UFFDIO_ZEROPAGE userfaultfd commands. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 6 + mm/Makefile | 1 + mm

[PATCH 02/23] userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key

2015-05-14 Thread Andrea Arcangeli
userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc/sched.c | 2 +- 3
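The meaning of the nr parameter can be sketched with a small user-space model of the wake loop. The struct and function names below are hypothetical; the loop condition mirrors the usual kernel idiom where the count is decremented only for exclusive waiters, which is why passing 0 never hits the stop condition and wakes everyone.

```c
#include <stddef.h>

/* Hypothetical model of a wait queue entry; not the kernel's type. */
struct waiter {
    int exclusive;  /* models an exclusive waiter */
    int woken;      /* set once the waiter has been woken */
};

/*
 * Wake waiters in queue order and return how many were woken.
 * Non-exclusive waiters are always woken. The walk stops after
 * nr_exclusive exclusive waiters have been woken. With
 * nr_exclusive == 0 the decremented counter never equals zero,
 * so every waiter in the queue is woken.
 */
static int wake_up_nr(struct waiter *q, size_t len, int nr_exclusive)
{
    int woken = 0;

    for (size_t i = 0; i < len; i++) {
        q[i].woken = 1;
        woken++;
        if (q[i].exclusive && !--nr_exclusive)
            break;
    }
    return woken;
}
```

With three exclusive waiters queued, nr = 1 wakes only the first, while nr = 0 wakes all three, which is the behavior the patch description asks for.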

[PATCH 05/23] userfaultfd: add vm_userfaultfd_ctx to the vm_area_struct

2015-05-14 Thread Andrea Arcangeli
This adds the vm_userfaultfd_ctx to the vm_area_struct. Signed-off-by: Andrea Arcangeli --- include/linux/mm_types.h | 11 +++ kernel/fork.c| 1 + 2 files changed, 12 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 0038ac7..2836da7

[PATCH 03/23] userfaultfd: uAPI

2015-05-14 Thread Andrea Arcangeli
Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol. Signed-off-by: Andrea Arcangeli --- Documentation/ioctl/ioctl-number.txt | 1 + include/uapi/linux/Kbuild| 1 + include/uapi/linux/userfaultfd.h | 81 3 files

[PATCH 05/23] userfaultfd: add vm_userfaultfd_ctx to the vm_area_struct

2015-05-14 Thread Andrea Arcangeli
This adds the vm_userfaultfd_ctx to the vm_area_struct. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm_types.h | 11 +++ kernel/fork.c| 1 + 2 files changed, 12 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index

[PATCH 03/23] userfaultfd: uAPI

2015-05-14 Thread Andrea Arcangeli
Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- Documentation/ioctl/ioctl-number.txt | 1 + include/uapi/linux/Kbuild| 1 + include/uapi/linux/userfaultfd.h | 81

[PATCH 15/23] userfaultfd: optimize read() and poll() to be O(1)

2015-05-14 Thread Andrea Arcangeli
This makes read O(1) and poll that was already O(1) becomes lockless. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 172 +++ 1 file changed, 98 insertions(+), 74 deletions(-) diff --git a/fs/userfaultfd.c b/fs

[PATCH 17/23] userfaultfd: solve the race between UFFDIO_COPY|ZEROPAGE and read

2015-05-14 Thread Andrea Arcangeli
of both UFFDIO_WAKE and of the associated per-page tristate as well. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 81 +--- 1 file changed, 66 insertions(+), 15 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c

[PATCH 13/23] userfaultfd: change the read API to return a uffd_msg

2015-05-14 Thread Andrea Arcangeli
of new events that can be extended or of new future bits for already shipped events, is limited to 64 by the features field of the uffdio_api structure. If more will be needed a bump of UFFD_API will be required. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- Documentation/vm/userfaultfd.txt

[PATCH 07/23] userfaultfd: call handle_userfault() for userfaultfd_missing() faults

2015-05-14 Thread Andrea Arcangeli
as parameter so the read|write kind of fault can be passed to userland. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- mm/huge_memory.c | 69 ++-- mm/memory.c | 16 + 2 files changed, 63 insertions(+), 22 deletions(-) diff --git

[PATCH 14/23] userfaultfd: wake pending userfaults

2015-05-14 Thread Andrea Arcangeli
of repeated faults on the same address from multiple threads. This optimization is justified by the measurement that the number of spurious UFFDIO_WAKE accounts for 5% and 10% of the total userfaults for heavy workloads, so it's worth optimizing those away. Signed-off-by: Andrea Arcangeli aarca

[PATCH 09/23] userfaultfd: prevent khugepaged to merge if userfaultfd is armed

2015-05-14 Thread Andrea Arcangeli
-by: Andrea Arcangeli aarca...@redhat.com --- mm/huge_memory.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c221be3..9671f51 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2198,7 +2198,8 @@ static int

[PATCH 00/23] userfaultfd v4

2015-05-14 Thread Andrea Arcangeli
QEMU/KVM uses userfaultfd to implement postcopy live migration. http://git.kernel.org/cgit/linux/kernel/git/andrea/aa.git/commit/?h=userfault&id=016f9523b7b2238851533736e84452cb00b2ddcd Andrea Arcangeli (22): userfaultfd: linux/Documentation/vm/userfaultfd.txt userfaultfd: waitqueue

[PATCH 04/23] userfaultfd: linux/userfaultfd_k.h

2015-05-14 Thread Andrea Arcangeli
Kernel header defining the methods needed by the VM common code to interact with the userfaultfd. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/userfaultfd_k.h | 79 +++ 1 file changed, 79 insertions(+) create mode 100644 include

[PATCH 22/23] userfaultfd: avoid mmap_sem read recursion in mcopy_atomic

2015-05-14 Thread Andrea Arcangeli
If the rwsem starves writers it wasn't strictly a bug but lockdep doesn't like it and this avoids depending on lowlevel implementation details of the lock. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- mm/userfaultfd.c | 92 1

[PATCH 12/23] userfaultfd: Rename uffd_api.bits into .features fixup

2015-05-14 Thread Andrea Arcangeli
Update comment. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/uapi/linux/userfaultfd.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 5e1c2f7..03f21cb 100644 --- a/include/uapi/linux

[PATCH 11/23] userfaultfd: Rename uffd_api.bits into .features

2015-05-14 Thread Andrea Arcangeli
-by: Pavel Emelyanov xe...@parallels.com Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 4 ++-- include/uapi/linux/userfaultfd.h | 10 -- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index

[PATCH 20/23] userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI

2015-05-14 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/uapi/linux/userfaultfd.h | 42 +++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b

[PATCH 18/23] userfaultfd: buildsystem activation

2015-05-14 Thread Andrea Arcangeli
This allows to select the userfaultfd during configuration to build it. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/Makefile | 1 + init/Kconfig | 11 +++ 2 files changed, 12 insertions(+) diff --git a/fs/Makefile b/fs/Makefile index cb92fd4..53e59b2 100644 --- a/fs

[PATCH 19/23] userfaultfd: activate syscall

2015-05-14 Thread Andrea Arcangeli
This activates the userfaultfd syscall. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/uapi/asm/unistd.h | 1 + arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl | 1 + include/linux

[PATCH 02/23] userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key

2015-05-14 Thread Andrea Arcangeli
userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc

[PATCH 06/23] userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP

2015-05-14 Thread Andrea Arcangeli
These two flags gets set in vma->vm_flags to tell the VM common code if the userfaultfd is armed and in which mode (only tracking missing faults, only tracking wrprotect faults or both). If neither flags is set it means the userfaultfd is not armed on the vma. Signed-off-by: Andrea Arcangeli aarca

[PATCH 21/23] userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation

2015-05-14 Thread Andrea Arcangeli
This implements mcopy_atomic and mfill_zeropage that are the lowlevel VM methods that are invoked respectively by the UFFDIO_COPY and UFFDIO_ZEROPAGE userfaultfd commands. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/userfaultfd_k.h | 6 + mm/Makefile

[PATCH 16/23] userfaultfd: allocate the userfaultfd_ctx cacheline aligned

2015-05-14 Thread Andrea Arcangeli
Use proper slab to guarantee alignment. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 39 +++ 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 3d26f41..5542fe7 100644 --- a/fs

[PATCH 01/23] userfaultfd: linux/Documentation/vm/userfaultfd.txt

2015-05-14 Thread Andrea Arcangeli
Add documentation. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- Documentation/vm/userfaultfd.txt | 140 +++ 1 file changed, 140 insertions(+) create mode 100644 Documentation/vm/userfaultfd.txt diff --git a/Documentation/vm/userfaultfd.txt b

[PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-05-14 Thread Andrea Arcangeli
to know when there are new pending userfaults to be read (POLLIN). Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 1008 ++ 1 file changed, 1008 insertions(+) create mode 100644 fs/userfaultfd.c diff --git a/fs

[PATCH 08/23] userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx

2015-05-14 Thread Andrea Arcangeli
vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge must be aware about so that we can merge vmas back like they were originally before arming the userfaultfd on some memory range. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- include/linux/mm.h | 2 +- mm/madvise.c

[PATCH 23/23] userfaultfd: UFFDIO_COPY and UFFDIO_ZEROPAGE

2015-05-14 Thread Andrea Arcangeli
These two ioctl allows to either atomically copy or to map zeropages into the virtual address space. This is used by the thread that opened the userfaultfd to resolve the userfaults. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 96

Re: [PATCH 2/3] uffd: Introduce the v2 API

2015-04-27 Thread Andrea Arcangeli
Hello, On Thu, Apr 23, 2015 at 09:29:07AM +0300, Pavel Emelyanov wrote: > So your proposal is to always report 16 bytes per PF from read() and > let userspace decide itself how to handle the result? Reading 16bytes for each userfault (instead of 8) and sharing the same read(2) protocol

Re: [PATCH 2/3] uffd: Introduce the v2 API

2015-04-27 Thread Andrea Arcangeli
Hello, On Thu, Apr 23, 2015 at 09:29:07AM +0300, Pavel Emelyanov wrote: So your proposal is to always report 16 bytes per PF from read() and let userspace decide itself how to handle the result? Reading 16bytes for each userfault (instead of 8) and sharing the same read(2) protocol (UFFD_API)

Re: [PATCH 2/3] uffd: Introduce the v2 API

2015-04-21 Thread Andrea Arcangeli
On Wed, Mar 18, 2015 at 10:35:17PM +0300, Pavel Emelyanov wrote: > + if (!(ctx->features & UFFD_FEATURE_LONGMSG)) { If we are to use different protocols, it'd be nicer to have two different methods to assign to userfaultfd_fops.read that calls an __always_inline function, so that the

Re: [PATCH 0/3] UserfaultFD: Extension for non cooperative uffd usage

2015-04-21 Thread Andrea Arcangeli
Hi Pavel, On Wed, Mar 18, 2015 at 10:34:26PM +0300, Pavel Emelyanov wrote: > Hi, > > On the recent LSF Andrea presented his userfault-fd patches and > I had shown some issues that appear in usage scenarios when the > monitor task and mm task do not cooperate to each other on VM > changes (and

Re: [PATCH 0/3] UserfaultFD: Extension for non cooperative uffd usage

2015-04-21 Thread Andrea Arcangeli
Hi Pavel, On Wed, Mar 18, 2015 at 10:34:26PM +0300, Pavel Emelyanov wrote: Hi, On the recent LSF Andrea presented his userfault-fd patches and I had shown some issues that appear in usage scenarios when the monitor task and mm task do not cooperate to each other on VM changes (and

Re: [PATCH 2/3] uffd: Introduce the v2 API

2015-04-21 Thread Andrea Arcangeli
On Wed, Mar 18, 2015 at 10:35:17PM +0300, Pavel Emelyanov wrote: + if (!(ctx->features & UFFD_FEATURE_LONGMSG)) { If we are to use different protocols, it'd be nicer to have two different methods to assign to userfaultfd_fops.read that calls an __always_inline function, so that the

[PATCH 18/21] userfaultfd: UFFDIO_REMAP uABI

2015-03-05 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_REMAP. Notably one mode bitflag is also forwarded (and in turn known) by the lowlevel remap_pages method. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 27 ++- 1 file changed, 26 insertions(+), 1 deletion

[PATCH 09/21] userfaultfd: prevent khugepaged to merge if userfaultfd is armed

2015-03-05 Thread Andrea Arcangeli
Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 5374132..8f1b6a5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2145,7 +2145,8 @@ static int __collapse_huge_page_isol

Re: [PATCH 19/21] userfaultfd: remap_pages: UFFDIO_REMAP preparation

2015-03-05 Thread Andrea Arcangeli
On Thu, Mar 05, 2015 at 09:39:48AM -0800, Linus Torvalds wrote: > Is this really worth it? On real loads? That people are expected to use? I fully agree that it's not worth merging upstream UFFDIO_REMAP until (and if) a real world usage for it will showup. To further clarify: would this not have

[PATCH 05/21] userfaultfd: add vm_userfaultfd_ctx to the vm_area_struct

2015-03-05 Thread Andrea Arcangeli
This adds the vm_userfaultfd_ctx to the vm_area_struct. Signed-off-by: Andrea Arcangeli --- include/linux/mm_types.h | 11 +++ kernel/fork.c| 1 + 2 files changed, 12 insertions(+) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 199a03a..fbf21f5

[PATCH 03/21] userfaultfd: uAPI

2015-03-05 Thread Andrea Arcangeli
Defines the uAPI of the userfaultfd, notably the ioctl numbers and protocol. Signed-off-by: Andrea Arcangeli --- Documentation/ioctl/ioctl-number.txt | 1 + include/uapi/linux/userfaultfd.h | 81 2 files changed, 82 insertions(+) create mode 100644

[PATCH 11/21] userfaultfd: buildsystem activation

2015-03-05 Thread Andrea Arcangeli
This allows to select the userfaultfd during configuration to build it. Signed-off-by: Andrea Arcangeli --- fs/Makefile | 1 + init/Kconfig | 11 +++ 2 files changed, 12 insertions(+) diff --git a/fs/Makefile b/fs/Makefile index a88ac48..ba8ab62 100644 --- a/fs/Makefile +++ b/fs

[PATCH 17/21] userfaultfd: remap_pages: swp_entry_swapcount() preparation

2015-03-05 Thread Andrea Arcangeli
in some anon_vma. Signed-off-by: Andrea Arcangeli --- include/linux/swap.h | 6 ++ mm/swapfile.c| 13 + 2 files changed, 19 insertions(+) diff --git a/include/linux/swap.h b/include/linux/swap.h index 4759491..9adda11 100644 --- a/include/linux/swap.h +++ b/include/linux

[PATCH 20/21] userfaultfd: UFFDIO_REMAP

2015-03-05 Thread Andrea Arcangeli
specially if copying only a few pages at time, copying without TLB flush is faster. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 51 +++ 1 file changed, 51 insertions(+) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 6230f22..b4c

[PATCH 21/21] userfaultfd: add userfaultfd_wp mm helpers

2015-03-05 Thread Andrea Arcangeli
These helpers will be used to know if to call handle_userfault() during wrprotect faults in order to deliver the wrprotect faults to userland. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 10 ++ 1 file changed, 10 insertions(+) diff --git a/include/linux

[PATCH 16/21] userfaultfd: remap_pages: rmap preparation

2015-03-05 Thread Andrea Arcangeli
e remap_pages runs. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 23 +++ mm/rmap.c| 9 + 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 8f1b6a5..1e25cb3 100644 --- a/mm/huge_memory.c +++

[PATCH 06/21] userfaultfd: add VM_UFFD_MISSING and VM_UFFD_WP

2015-03-05 Thread Andrea Arcangeli
These two flags get set in vma->vm_flags to tell the VM common code whether the userfaultfd is armed and in which mode (only tracking missing faults, only tracking wrprotect faults, or both). If neither flag is set, the userfaultfd is not armed on the vma. Signed-off-by: Andrea Arcang
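The flag test described above can be modelled in userspace roughly as follows. This is only a sketch: the SKETCH_* bit values, struct sketch_vma, and the sketch_uffd_*() helpers are hypothetical names invented for illustration and are not taken from the patch, whose real definitions are not shown in this snippet.

```c
#include <stdbool.h>

/* Hypothetical bit values, chosen for this sketch only. The real
 * VM_UFFD_MISSING / VM_UFFD_WP definitions are in the patch itself. */
#define SKETCH_VM_UFFD_MISSING 0x1UL
#define SKETCH_VM_UFFD_WP      0x2UL

/* Hypothetical stand-in for the vm_flags field of a vma. */
struct sketch_vma {
    unsigned long vm_flags;
};

/* True if missing faults are tracked on this vma. */
static inline bool sketch_uffd_missing(const struct sketch_vma *vma)
{
    return (vma->vm_flags & SKETCH_VM_UFFD_MISSING) != 0;
}

/* True if wrprotect faults are tracked on this vma. */
static inline bool sketch_uffd_wp(const struct sketch_vma *vma)
{
    return (vma->vm_flags & SKETCH_VM_UFFD_WP) != 0;
}

/* True if either mode is set; neither flag set means "not armed". */
static inline bool sketch_uffd_armed(const struct sketch_vma *vma)
{
    return (vma->vm_flags &
            (SKETCH_VM_UFFD_MISSING | SKETCH_VM_UFFD_WP)) != 0;
}
```

Keeping the two modes as independent bits lets a single mask test answer "armed or not", while each mode can still be queried on its own.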

[PATCH 02/21] userfaultfd: linux/Documentation/vm/userfaultfd.txt

2015-03-05 Thread Andrea Arcangeli
Add documentation. Signed-off-by: Andrea Arcangeli --- Documentation/vm/userfaultfd.txt | 97 1 file changed, 97 insertions(+) create mode 100644 Documentation/vm/userfaultfd.txt diff --git a/Documentation/vm/userfaultfd.txt b/Documentation/vm

[PATCH 00/21] RFC: userfaultfd v3

2015-03-05 Thread Andrea Arcangeli
strictly required by the wrprotect tracking mode, so it's no problem to solve this later. Because of its inherent racy nature, nobody could possibly depend on a racy SIGBUS being raised now, when it won't be raised anymore later. Andrea Arcangeli (21): userfaultfd: waitqueue: add nr wake parameter

[PATCH 08/21] userfaultfd: teach vma_merge to merge across vma->vm_userfaultfd_ctx

2015-03-05 Thread Andrea Arcangeli
vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge must be aware of so that we can merge vmas back as they were originally before arming the userfaultfd on some memory range. Signed-off-by: Andrea Arcangeli --- include/linux/mm.h | 2 +- mm/madvise.c | 3 ++-

[PATCH 13/21] userfaultfd: UFFDIO_COPY|UFFDIO_ZEROPAGE uAPI

2015-03-05 Thread Andrea Arcangeli
This implements the uABI of UFFDIO_COPY and UFFDIO_ZEROPAGE. Signed-off-by: Andrea Arcangeli --- include/uapi/linux/userfaultfd.h | 46 +++- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux

[PATCH 10/21] userfaultfd: add new syscall to provide memory externalization

2015-03-05 Thread Andrea Arcangeli
to know when there are new pending userfaults to be read (POLLIN). Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 977 +++ 1 file changed, 977 insertions(+) create mode 100644 fs/userfaultfd.c diff --git a/fs/userfaultfd.c b/fs
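Waiting for POLLIN on a file descriptor, as mentioned above, might look like the sketch below. It uses only the standard POSIX poll() call; sketch_wait_readable() is a hypothetical helper name, and the fd can be any pollable descriptor (a pipe works for experimenting), so nothing here depends on the userfaultfd syscall being available.

```c
#include <poll.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if POLLIN is reported, 0 on timeout, -1 on error
 * (including EINTR, or an event other than POLLIN such as POLLHUP). */
static inline int sketch_wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;      /* poll() itself failed */
    if (ret == 0)
        return 0;       /* nothing pending within the timeout */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

A monitoring thread would typically call such a helper in a loop and read from the descriptor only when it returns 1.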

[PATCH 19/21] userfaultfd: remap_pages: UFFDIO_REMAP preparation

2015-03-05 Thread Andrea Arcangeli
remap_pages is the lowlevel mm helper needed to implement UFFDIO_REMAP. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 17 ++ mm/huge_memory.c | 120 ++ mm/userfaultfd.c | 526 ++ 3 files changed

[PATCH 01/21] userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key

2015-03-05 Thread Andrea Arcangeli
userfaultfd needs to wake all waitqueues (pass 0 as nr parameter), instead of the current hardcoded 1 (that would wake just the first waitqueue in the head list). Signed-off-by: Andrea Arcangeli --- include/linux/wait.h | 5 +++-- kernel/sched/wait.c | 7 --- net/sunrpc/sched.c | 2 +- 3
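The meaning of the nr parameter described above (0 wakes everything, 1 wakes only the first entry) can be illustrated with a small userspace model. This is a simplified sketch, not the kernel's wait-queue code: struct sketch_waiter and sketch_wake() are hypothetical names, and the model counts every waiter towards nr, ignoring the distinction between exclusive and non-exclusive waiters.

```c
#include <stddef.h>

/* Hypothetical waiter record: woken is 0 while sleeping, 1 once woken. */
struct sketch_waiter {
    int woken;
};

/* Wake sleeping waiters in list order.
 * nr == 0 wakes all of them; nr > 0 stops after nr wakeups.
 * Returns the number of waiters woken by this call. */
static inline size_t sketch_wake(struct sketch_waiter *w, size_t count,
                                 size_t nr)
{
    size_t woken = 0;

    for (size_t i = 0; i < count; i++) {
        if (w[i].woken)
            continue;           /* already awake, skip */
        w[i].woken = 1;
        woken++;
        if (nr != 0 && woken == nr)
            break;              /* reached the requested limit */
    }
    return woken;
}
```

With three sleeping waiters, a call with nr set to 1 wakes only the first, and a following call with nr set to 0 wakes the remaining two.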

[PATCH 04/21] userfaultfd: linux/userfaultfd_k.h

2015-03-05 Thread Andrea Arcangeli
Kernel header defining the methods needed by the VM common code to interact with the userfaultfd. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 79 +++ 1 file changed, 79 insertions(+) create mode 100644 include/linux

[PATCH 07/21] userfaultfd: call handle_userfault() for userfaultfd_missing() faults

2015-03-05 Thread Andrea Arcangeli
sed as parameter so the "read|write" kind of fault can be passed to userland. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 68 ++-- mm/memory.c | 16 + 2 files changed, 62 insertions(+), 22 deletions(-) di

[PATCH 15/21] userfaultfd: UFFDIO_COPY and UFFDIO_ZEROPAGE

2015-03-05 Thread Andrea Arcangeli
These two ioctls allow either atomically copying pages or mapping zeropages into the virtual address space. This is used by the thread that opened the userfaultfd to resolve the userfaults. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 100

[PATCH 14/21] userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation

2015-03-05 Thread Andrea Arcangeli
This implements mcopy_atomic and mfill_zeropage, which are the low-level VM methods invoked by the UFFDIO_COPY and UFFDIO_ZEROPAGE userfaultfd commands, respectively. Signed-off-by: Andrea Arcangeli --- include/linux/userfaultfd_k.h | 6 + mm/Makefile | 1 + mm

[PATCH 12/21] userfaultfd: activate syscall

2015-03-05 Thread Andrea Arcangeli
This activates the userfaultfd syscall. Signed-off-by: Andrea Arcangeli --- arch/powerpc/include/asm/systbl.h | 1 + arch/powerpc/include/asm/unistd.h | 2 +- arch/powerpc/include/uapi/asm/unistd.h | 1 + arch/x86/syscalls/syscall_32.tbl | 1 + arch/x86/syscalls/syscall_64.tbl
