Re: fs: uninterruptible hang in handle_userfault

2016-03-02 Thread Andrea Arcangeli
Hello, On Wed, Mar 02, 2016 at 12:48:46AM +, Al Viro wrote: > On Tue, Mar 01, 2016 at 12:06:49PM -0800, Linus Torvalds wrote: > > > So the only access we really care about is the child tid-pointer > > clearing one, and that always happens after PF_EXITING has been set > > afaik. > > > > No

Re: [PATCH 1/1] mm: thp: Redefine default THP defrag behaviour disable it by default

2016-02-26 Thread Andrea Arcangeli
Hello Mel, On Fri, Feb 26, 2016 at 11:13:16AM +, Mel Gorman wrote: > 1. By default, "madvise" and direct reclaim/compaction for applications >that specifically requested that behaviour. This will avoid breaking >MADV_HUGEPAGE which you mentioned in a few places Defragging memory

Re: [PATCH 1/1] mm: thp: Redefine default THP defrag behaviour disable it by default

2016-02-25 Thread Andrea Arcangeli
On Fri, Feb 26, 2016 at 12:02:19AM +0100, Andrea Arcangeli wrote: > Let's first agree if direct compaction is going to hurt also for the > MADV_HUGEPAGE case. I say MADV_HUGEPAGE benefits from direct > compaction and is not hurt by not doing direct compaction. If you

Re: [PATCH 1/1] mm: thp: Redefine default THP defrag behaviour disable it by default

2016-02-25 Thread Andrea Arcangeli
On Thu, Feb 25, 2016 at 07:56:13PM +, Mel Gorman wrote: > Which is a specialised case that does not apply to all users. Remember > that the data showed that a basic streaming write of an anon mapping on > a freshly booted NUMA system was enough to stall the process for long > periods of time.

Re: [PATCH 1/1] mm: thp: Redefine default THP defrag behaviour disable it by default

2016-02-25 Thread Andrea Arcangeli
On Thu, Feb 25, 2016 at 05:12:19PM +, Mel Gorman wrote: > some cases, this will reduce THP usage but the benefit of THP is hard to > measure and not a universal win where as a stall to reclaim/compaction is It depends on the workload: with virtual machines THP is essential from the start

Re: mm: BUG in expand_downwards

2016-01-28 Thread Andrea Arcangeli
hour. Does this help for the mm bug? From 0cc410ae59800444ca929e3dc48e4f1580a95be6 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Thu, 28 Jan 2016 16:34:44 +0100 Subject: [PATCH 1/1] mm: validate_mm browse_rb SMP race condition The mmap_sem for reading in validate_mm called fr

Re: [lkp] [ksm] 40e318e509: ltp.ksm01.fail

2016-01-25 Thread Andrea Arcangeli
Hello, On Mon, Jan 25, 2016 at 11:04:18AM +0800, kernel test robot wrote: > FYI, we noticed the below changes on > > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master > commit 40e318e509d2c9f3cdb00ef32d2c14b9868af16b ("ksm: introduce > ksm_max_page_sharing per page

Re: [PATCH 2/3] thp: change deferred_split_count() to return number of THP in queue

2016-01-22 Thread Andrea Arcangeli
On Thu, Jan 21, 2016 at 03:09:22PM +0300, Kirill A. Shutemov wrote: > @@ -3511,7 +3506,7 @@ static unsigned long deferred_split_scan(struct > shrinker *shrink, > list_splice_tail(, >split_queue); > spin_unlock_irqrestore(>split_queue_lock, flags); > > - return split *

Re: [PATCH 0/3] Couple of fixes for deferred_split_huge_page()

2016-01-21 Thread Andrea Arcangeli
ady in -mm! Reviewed-by: Andrea Arcangeli Great thanks, Andrea > > Kirill A. Shutemov (3): > thp: make split_queue per-node > thp: change deferred_split_count() to return number of THP in queue > thp: limit number of object to scan on deferred_split_scan() > > include/lin

Re: [PATCHv12 34/37] thp: introduce deferred_split_huge_page()

2016-01-20 Thread Andrea Arcangeli
Hello Kirill, On Tue, Oct 06, 2015 at 06:24:01PM +0300, Kirill A. Shutemov wrote: > +static unsigned long deferred_split_scan(struct shrinker *shrink, > + struct shrink_control *sc) > +{ > + unsigned long flags; > + LIST_HEAD(list), *pos, *next; > + struct page *page; > +

Re: [PATCH v3 0/2] Allow gmap fault to retry

2016-01-04 Thread Andrea Arcangeli
Thanks, > Dominik > > v2 -> v3: > - In case of retrying check vma again > - Do the accounting of major/minor faults once Reviewed-by: Andrea Arcangeli > > v1 -> v2: > - Instead of passing the VM_FAULT_RETRY from fixup_user_fault we do retries > within f

[PATCH 1/1] dmi_scan: uuid: fix endianess for smbios >= 0x206

2015-12-22 Thread Andrea Arcangeli
The dmi_ver wasn't updated correctly before the dmi_decode method run to save the uuid. That resulted in "dmidecode -s system-uuid" and /sys/class/dmi/id/product_uuid disagreeing. The latter was buggy and this fixes it. Reported-by: Federico Simoncelli Signed-off-by: Andrea

[PATCH 0/1] dmi_scan: uuid: fix endianess for smbios >= 0x206

2015-12-22 Thread Andrea Arcangeli
x20X instead of 0x20X00 as intended after commit 95be58df74a5b21e5a78e45fddb2fd59112524c5. Andrea Arcangeli (1): dmi_scan: uuid: fix endianess for smbios >= 0x206 drivers/firmware/dmi_scan.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -- To unsubscribe from this list: send the line
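
The encoding mismatch mentioned in this cover letter can be illustrated with a small userspace sketch. It assumes the version is packed as three bytes (major, minor, revision), so that SMBIOS 2.6 reads 0x020600, and that from that version on the first three UUID fields are stored little-endian. The names pack_ver and format_uuid are hypothetical; this is not the code in drivers/firmware/dmi_scan.c.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical 3-byte packing: major, minor, revision. */
static uint32_t pack_ver(unsigned major, unsigned minor, unsigned rev)
{
    return ((uint32_t)major << 16) | ((uint32_t)minor << 8) | (uint32_t)rev;
}

/*
 * Format a raw 16-byte UUID as text. Assumption: for versions >= 2.6
 * the first three fields are little-endian and must be byte-swapped.
 */
static void format_uuid(const uint8_t raw[16], uint32_t ver, char out[37])
{
    uint8_t b[16], t;

    assert(raw && out);
    memcpy(b, raw, sizeof(b));

    if (ver >= 0x020600) {
        t = b[0]; b[0] = b[3]; b[3] = t;   /* 32-bit field */
        t = b[1]; b[1] = b[2]; b[2] = t;
        t = b[4]; b[4] = b[5]; b[5] = t;   /* first 16-bit field */
        t = b[6]; b[6] = b[7]; b[7] = t;   /* second 16-bit field */
    }

    snprintf(out, 37,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x%02x%02x%02x%02x",
             b[0], b[1], b[2], b[3], b[4], b[5], b[6], b[7],
             b[8], b[9], b[10], b[11], b[12], b[13], b[14], b[15]);
}
```

In this sketch a version packed the two-byte way (0x0206) fails the comparison against 0x020600 and the bytes are printed unswapped, which is the kind of disagreement between two UUID readers that the patch description reports.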

Re: [RFC] mm: change find_vma() function

2015-12-15 Thread Andrea Arcangeli
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote: > > > On Dec 15, 2015, at 05:11, Kirill A. Shutemov wrote: > > > > On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote: > >> On 12/14, Kirill A. Shutemov wrote: > >>> > >>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin

Re: [PATCH 1/2] mm: bring in additional flag for fixup_user_fault to signal unlock

2015-12-04 Thread Andrea Arcangeli
On Thu, Nov 26, 2015 at 06:27:01PM +0100, Dominik Dingel wrote: > @@ -599,6 +603,10 @@ int fixup_user_fault(struct task_struct *tsk, struct > mm_struct *mm, > if (!(vm_flags & vma->vm_flags)) > return -EFAULT; > > + if (unlocked) > + fault_flags |=

Re: [PATCH 01/23] userfaultfd: linux/Documentation/vm/userfaultfd.txt

2015-12-04 Thread Andrea Arcangeli
Hello Michael, On Fri, Dec 04, 2015 at 04:50:03PM +0100, Michael Kerrisk (man-pages) wrote: > Hi Andrea, > > On 09/11/2015 10:47 AM, Michael Kerrisk (man-pages) wrote: > > On 05/14/2015 07:30 PM, Andrea Arcangeli wrote: > >> Add documentation. > > > > Hi And

Re: [PATCH 00/11] KVM: x86: track guest page access

2015-12-01 Thread Andrea Arcangeli
handler for the tracked pages. The > > performance result of kernel building is as follows: > > > > before: real 461.63, user 4529.55, sys 1995.39 > > after: real 455.48, user 4557.88, sys 1922.57 > > For KVM-GT, as far a

Re: [PATCH 02/37] mm, frame_vector: do not use get_user_pages_locked()

2015-11-18 Thread Andrea Arcangeli
On Wed, Nov 18, 2015 at 01:29:38PM +0100, Jan Kara wrote: > On Mon 16-11-15 19:35:14, Dave Hansen wrote: > > > > From: Dave Hansen > > > > get_user_pages_locked() appears to be for use when a caller needs > > to know that its lock on mmap_sem was invalidated by the gup > > call. > > > > But,

Re: [PATCH 14/23] userfaultfd: wake pending userfaults

2015-10-22 Thread Andrea Arcangeli
On Thu, Oct 22, 2015 at 05:15:09PM +0200, Peter Zijlstra wrote: > Indefinitely is such a long time, we should try and finish > computation before the computer dies etc. :-) Indefinitely as read_seqcount_retry, eventually it makes progress. Even returning 0 from the page fault can trigger it

Re: [PATCH 14/23] userfaultfd: wake pending userfaults

2015-10-22 Thread Andrea Arcangeli
On Thu, Oct 22, 2015 at 03:38:24PM +0200, Peter Zijlstra wrote: > On Thu, Oct 22, 2015 at 03:20:15PM +0200, Andrea Arcangeli wrote: > > > If schedule spontaneously wakes up a task in TASK_KILLABLE state that > > would be a bug in the scheduler in my view. Luckily

Re: [PATCH 14/23] userfaultfd: wake pending userfaults

2015-10-22 Thread Andrea Arcangeli
On Thu, Oct 22, 2015 at 02:10:56PM +0200, Peter Zijlstra wrote: > On Thu, May 14, 2015 at 07:31:11PM +0200, Andrea Arcangeli wrote: > > @@ -255,21 +259,23 @@ int handle_userfault(struct vm_area_struct *vma, > > unsigned long address, > >

Re: [PATCH 0/7] userfault21 update

2015-10-19 Thread Andrea Arcangeli
Hello Patrick, On Mon, Oct 12, 2015 at 11:04:11AM -0400, Patrick Donnelly wrote: > Hello Andrea, > > On Mon, Jun 15, 2015 at 1:22 PM, Andrea Arcangeli wrote: > > This is an incremental update to the userfaultfd code in -mm. > > Sorry I'm late to this party. I'm curious

Re: [PATCH] thp: use is_zero_pfn after pte_present check

2015-10-12 Thread Andrea Arcangeli
separate patch. Reviewed-by: Andrea Arcangeli Thanks, Andrea

Re: [PATCH] mm/mmap.c: Remove redundant vma looping

2015-10-04 Thread Andrea Arcangeli
Hello Chen, On Sun, Oct 04, 2015 at 12:55:29PM +0800, Chen Gang wrote: > Theoretically, the lock and unlock need to be symmetric, if we have to > lock f_mapping all firstly, then lock all anon_vma, probably, we also > need to unlock anon_vma all, then unlock all f_mapping. They don't need to be

[PATCH] thermal: avoid division by zero in power allocator

2015-09-28 Thread Andrea Arcangeli
During boot I get a div by zero Oops regression starting in v4.3-rc3. Reviewed-by: Javi Merino Signed-off-by: Andrea Arcangeli --- drivers/thermal/power_allocator.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/drivers/thermal/power_allocator.c b/drivers/thermal

[PATCH] thermal: avoid division by zero in power allocator

2015-09-28 Thread Andrea Arcangeli
Hello, This is needed for my workstations or they Oops at boot since v4.3-rc3. Andrea Arcangeli (1): thermal: avoid division by zero in power allocator drivers/thermal/power_allocator.c | 10 ++ 1 file changed, 10 insertions(+) Thanks, Andrea

Re: [RFC] futex: prevent endless loop on s390x with emulated hugepages

2015-09-24 Thread Andrea Arcangeli
On Thu, Sep 24, 2015 at 05:05:48PM +0200, Vlastimil Babka wrote: > The problem is an endless loop in get_futex_key() when > CONFIG_TRANSPARENT_HUGEPAGE is enabled and the s390x machine has emulated > hugepages. The code tries to serialize against __split_huge_page_splitting(), > but

Re: [PATCH 2/2] selftests/userfaultfd: improve syscall number definition

2015-09-22 Thread Andrea Arcangeli
On Tue, Sep 22, 2015 at 07:49:13AM -0600, Shuah Khan wrote: > On 09/22/2015 04:45 AM, Andre Przywara wrote: > > At the moment the userfaultfd test program only supports x86 and an > > architecture called "powewrpc" ;-) > > Fix that typo and add the syscall numbers for other architectures as > >

Re: [PATCH] userfaultfd: add missing mmput() in error path

2015-09-14 Thread Andrea Arcangeli
Hello Eric, On Sun, Sep 13, 2015 at 06:57:27PM -0500, Eric Biggers wrote: > Signed-off-by: Eric Biggers > --- > fs/userfaultfd.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 634e676..f9aeb40 100644 > ---

Re: [PATCH 6/7] selftests: only compile userfaultfd for x86 and powperpc

2015-09-09 Thread Andrea Arcangeli
On Wed, Sep 09, 2015 at 06:43:11PM +1000, Michael Ellerman wrote: > On Tue, 2015-09-08 at 16:34 +0200, Andrea Arcangeli wrote: > > > > I already had a few minor changes queued to be submitted for arm and > > ppc and a few updates to the selftest. > > > > I did

Re: [PATCH 6/7] selftests: only compile userfaultfd for x86 and powperpc

2015-09-08 Thread Andrea Arcangeli
On Tue, Sep 08, 2015 at 08:25:45PM +0800, Bamvor Zhang Jian wrote: > Hi, Michael > > On 09/08/2015 05:54 PM, Michael Ellerman wrote: > > On Tue, 2015-09-08 at 17:15 +0800, Bamvor Zhang Jian wrote: > >> Hi, Michael > >> > >> I thought I reply to you, but ... > >> > >> On 08/31/2015 11:26 AM,

Re: [PATCHv5 0/7] Fix compound_head() race

2015-09-04 Thread Andrea Arcangeli
or and compound_order into one word in struct page > mm: make compound_head() robust > mm: use 'unsigned int' for page order > mm: use 'unsigned int' for compound_dtor/compound_order on 64BIT Reviewed-by: Andrea Arcangeli The only other alternative solution that doesn't require f

Re: [Qemu-devel] [PATCH 19/23] userfaultfd: activate syscall

2015-08-11 Thread Andrea Arcangeli
e > > > -#define __NR_syscalls 364 > +#define __NR_syscalls 365 > > #define __NR__exit __NR_exit > #define NR_syscalls __NR_syscalls Reviewed-by: Andrea Arcangeli

Re: linux-next: manual merge of the akpm-current tree with the tip tree

2015-07-30 Thread Andrea Arcangeli
On Wed, Jul 29, 2015 at 08:46:10PM +0200, Thomas Gleixner wrote: > On Wed, 29 Jul 2015, Andy Lutomirski wrote: > > -tip people: want to assign Andrea a pair of syscall numbers? > > Sure, just send a patch Awesome, I just sent the patch to register the syscall against -tip with the usual

[PATCH] userfaultfd: register syscall numbers for x86 32bit and x86-64 64bit

2015-07-30 Thread Andrea Arcangeli
This registers the official numbers of the userfaultfd syscall for x86 32bit and x86-64 64bit. This registration allows shipping kernels in production using these two syscall numbers for userfaultfd. Acked-by: Pavel Emelyanov Signed-off-by: Andrea Arcangeli --- arch/x86/entry/syscalls

Re: linux-next: manual merge of the akpm-current tree with the tip tree

2015-07-29 Thread Andrea Arcangeli
ssible never mind. Thanks, Andrea === From 873093c32b4b1d0b6c3f18ec1e52b56c24f67457 Mon Sep 17 00:00:00 2001 From: Andrea Arcangeli Date: Wed, 29 Jul 2015 18:53:17 +0200 Subject: [PATCH] userfaultfd: selftest: update userfaultfd x86 32bit syscall number It changed as result of linux-n

Re: [PATCH] mm: Flush the TLB for a single address in a huge page

2015-07-23 Thread Andrea Arcangeli
On Thu, Jul 23, 2015 at 09:55:33AM -0700, Dave Hansen wrote: > On 07/23/2015 09:16 AM, Catalin Marinas wrote: > > Anyway, if you want to keep the option of a full TLB flush for x86 on > > huge pages, I'm happy to repost a v2 with a separate > > flush_tlb_pmd_huge_page that arch code can define as

Re: [PATCH] mm: Flush the TLB for a single address in a huge page

2015-07-23 Thread Andrea Arcangeli
On Thu, Jul 23, 2015 at 07:41:24AM -0700, Dave Hansen wrote: > We had a discussion about this a few weeks ago: > > https://lkml.org/lkml/2015/6/25/666 > > The argument is that the CPU is so good at refilling the TLB that it > rarely waits on it, so the "cost" can be very very low. That

Re: [PATCH] mm: Flush the TLB for a single address in a huge page

2015-07-23 Thread Andrea Arcangeli
On Thu, Jul 23, 2015 at 11:49:38AM +0100, Catalin Marinas wrote: > On Thu, Jul 23, 2015 at 12:05:21AM +0100, Dave Hansen wrote: > > On 07/22/2015 03:48 PM, Catalin Marinas wrote: > > > You are right, on x86 the tlb_single_page_flush_ceiling seems to be > > > 33, so for an HPAGE_SIZE range the code

Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-06-23 Thread Andrea Arcangeli
Hi Dave, On Tue, Jun 23, 2015 at 12:00:19PM -0700, Dave Hansen wrote: > Down in userfaultfd_wake_function(), it looks like you intended for a > len=0 to mean "wake all". But the validate_range() that we do from > userspace has a !len check in it, which keeps us from passing a len=0 in > from
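
The mismatch Dave describes can be sketched in plain C. This is only an illustration under stated assumptions: wake_range, range_is_valid and should_wake are hypothetical names that mimic the two checks quoted above; they are not the functions in fs/userfaultfd.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of the range a wakeup applies to. */
struct wake_range {
    unsigned long start;
    unsigned long len;   /* 0 is treated as "wake all" by should_wake() */
};

/* Userspace-facing validation: rejects empty and wrapping ranges. */
static bool range_is_valid(unsigned long start, unsigned long len)
{
    if (!len)
        return false;            /* len=0 can never come from userspace */
    if (start + len < start)
        return false;            /* address arithmetic wrapped around */
    return true;
}

/* Waker-side filter: len=0 matches every waiter, otherwise match by address. */
static bool should_wake(const struct wake_range *r, unsigned long addr)
{
    assert(r != NULL);
    if (r->len == 0)
        return true;
    return addr >= r->start && addr - r->start < r->len;
}
```

With these two checks the "wake all" path in should_wake is reachable only from internal callers, since any range arriving from userspace has already been rejected when its length is zero.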

Re: [PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-16 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:41:24PM -1000, Linus Torvalds wrote: > On Mon, Jun 15, 2015 at 12:19 PM, Andrea Arcangeli > wrote: > > > > Yes, it would leave the other blocked, how is it different from having > > just 1 reader and it gets killed? > > Either is complet

Re: [PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-15 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:19:07AM -1000, Linus Torvalds wrote: > What if the process doing the polling never does anything with the end > result? Maybe it meant to, but it got killed before it could? Are you going > to leave everybody else blocked, even though there are pending events? Yes, it

Re: [PATCH 1/7] userfaultfd: require UFFDIO_API before other ioctls

2015-06-15 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:11:50AM -1000, Linus Torvalds wrote: > On Jun 15, 2015 7:22 AM, "Andrea Arcangeli" wrote: > > > > + if (cmd != UFFDIO_API) { > > + if (ctx->state == UFFD_STATE_WAIT_API) > > + return

[PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-15 Thread Andrea Arcangeli
# 256.660 M/sec ( +- 0.71% ) [83.69%] 59,203,898 branch-misses #0.51% of all branches ( +- 2.03% ) [83.54%] 2.600912438 seconds time elapsed ( +- 0.02% ) Signed-off-by: Andrea Arcangeli

[PATCH 2/7] userfaultfd: propagate the full address in THP faults

2015-06-15 Thread Andrea Arcangeli
failure because the wrong page was being copied. For various reasons this wasn't easily reproducible in the qemu workload, but the stress test exposed the problem immediately. Signed-off-by: Andrea Arcangeli --- mm/huge_memory.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff

[PATCH 6/7] userfaultfd: Revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key"

2015-06-15 Thread Andrea Arcangeli
all, has wait->flags WQ_FLAG_EXCLUSIVE set. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 8 include/linux/wait.h | 5 ++--- kernel/sched/wait.c | 7 +++ net/sunrpc/sched.c | 2 +- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/fs/userfaultfd.c b

[PATCH 7/7] userfaultfd: selftest

2015-06-15 Thread Andrea Arcangeli
by userfaultfd. The fix for those two bugs was also straightforward and required no design change of any sort. Signed-off-by: Andrea Arcangeli --- tools/testing/selftests/vm/Makefile | 4 +- tools/testing/selftests/vm/userfaultfd.c | 669 +++ 2 files changed

[PATCH 4/7] userfaultfd: avoid missing wakeups during refile in userfaultfd_read

2015-06-15 Thread Andrea Arcangeli
During the refile in userfaultfd_read both waitqueues could look empty to the lockless wake_userfault(). Use a seqcount to prevent this false negative that could leave a userfault blocked. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 26 -- 1 file changed, 24
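
The seqcount idea described in this patch can be illustrated with a small userspace sketch. This is not the code from the patch: the struct, the field names (pending, in_flight) and the helper names are hypothetical, and C11 atomics stand in for the kernel's seqcount_t with read_seqcount_begin()/read_seqcount_retry(). It only shows why a lockless reader that sees both queues empty must retry when a refile was in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two userfaultfd waitqueues. */
struct uffd_queues {
    atomic_uint seq;       /* even: stable, odd: refile in progress */
    atomic_int pending;    /* faults not yet returned by read() */
    atomic_int in_flight;  /* faults already read, awaiting resolution */
};

/* Writer side: bracket the move between queues with two increments. */
static void refile_begin(struct uffd_queues *q)
{
    atomic_fetch_add(&q->seq, 1);
}

static void refile_end(struct uffd_queues *q)
{
    unsigned prev = atomic_fetch_add(&q->seq, 1);
    assert(prev & 1); /* must pair with a preceding refile_begin() */
    (void)prev;
}

/*
 * One lockless attempt. Returns false if a refile overlapped the check,
 * in which case the observation cannot be trusted and *found is untouched.
 */
static bool try_any_waiters(struct uffd_queues *q, bool *found)
{
    unsigned start = atomic_load(&q->seq);
    bool seen = atomic_load(&q->pending) > 0 ||
                atomic_load(&q->in_flight) > 0;

    if ((start & 1) || atomic_load(&q->seq) != start)
        return false;
    *found = seen;
    return true;
}

/* Reader side: retry until a check completes with no refile in between. */
static bool any_waiters(struct uffd_queues *q)
{
    bool found = false;

    while (!try_any_waiters(q, &found))
        ; /* a refile was running, look again */
    return found;
}
```

Without the sequence check, a reader running between the removal from one queue and the insertion into the other would see both empty and skip the wakeup, which is the false negative the patch description mentions.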

[PATCH 1/7] userfaultfd: require UFFDIO_API before other ioctls

2015-06-15 Thread Andrea Arcangeli
(all but UFFDIO_API/struct uffdio_api) with a bump of uffdio_api.api. There's no actual plan or need to change the API or the ioctl; the current API should already cover even the non-cooperative usage, but this is for the longer-term future, just in case. Signed-off-by: Andrea Arcangeli

[PATCH 3/7] userfaultfd: allow signals to interrupt a userfault

2015-06-15 Thread Andrea Arcangeli
nal processed, coredumps always worked perfectly with userfaults, no matter if the userfault is triggered by GUP, a kernel copy_user, or directly from userland. Signed-off-by: Andrea Arcangeli --- fs/userfaultfd.c | 35 --- 1 file changed, 32 insertions(+), 3 deleti

[PATCH 0/7] userfault21 update

2015-06-15 Thread Andrea Arcangeli
ruled out either yet. Andrea Arcangeli (7): userfaultfd: require UFFDIO_API before other ioctls userfaultfd: propagate the full address in THP faults userfaultfd: allow signals to interrupt a userfault userfaultfd: avoid missing wakeups during refile in userfaultfd_read userfaultfd:

[PATCH 0/7] userfault21 update

2015-06-15 Thread Andrea Arcangeli
. Andrea Arcangeli (7): userfaultfd: require UFFDIO_API before other ioctls userfaultfd: propagate the full address in THP faults userfaultfd: allow signals to interrupt a userfault userfaultfd: avoid missing wakeups during refile in userfaultfd_read userfaultfd: switch to exclusive wakeup

[PATCH 3/7] userfaultfd: allow signals to interrupt a userfault

2015-06-15 Thread Andrea Arcangeli
signal processed, coredumps always worked perfectly with userfaults, no matter if the userfault is triggered by GUP, a kernel copy_user, or directly from userland. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 35 --- 1 file changed, 32

[PATCH 1/7] userfaultfd: require UFFDIO_API before other ioctls

2015-06-15 Thread Andrea Arcangeli
(all but UFFDIO_API/struct uffdio_api) with a bump of uffdio_api.api. There's no actual plan or need to change the API or the ioctl; the current API should already cover even the non-cooperative usage, but this is for the longer-term future, just in case. Signed-off-by: Andrea Arcangeli

[PATCH 4/7] userfaultfd: avoid missing wakeups during refile in userfaultfd_read

2015-06-15 Thread Andrea Arcangeli
During the refile in userfaultfd_read both waitqueues could look empty to the lockless wake_userfault(). Use a seqcount to prevent this false negative that could leave a userfault blocked. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 26

[PATCH 7/7] userfaultfd: selftest

2015-06-15 Thread Andrea Arcangeli
by userfaultfd. The fix for those two bugs was also straightforward and required no design change of any sort. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- tools/testing/selftests/vm/Makefile | 4 +- tools/testing/selftests/vm/userfaultfd.c | 669

[PATCH 5/7] userfaultfd: switch to exclusive wakeup for blocking reads

2015-06-15 Thread Andrea Arcangeli
# 256.660 M/sec ( +- 0.71% ) [83.69%] 59,203,898 branch-misses #0.51% of all branches ( +- 2.03% ) [83.54%] 2.600912438 seconds time elapsed ( +- 0.02% ) Signed-off-by: Andrea Arcangeli

[PATCH 6/7] userfaultfd: Revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key"

2015-06-15 Thread Andrea Arcangeli
WQ_FLAG_EXCLUSIVE set. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- fs/userfaultfd.c | 8 include/linux/wait.h | 5 ++--- kernel/sched/wait.c | 7 +++ net/sunrpc/sched.c | 2 +- 4 files changed, 10 insertions(+), 12 deletions(-) diff --git a/fs/userfaultfd.c b/fs

[PATCH 2/7] userfaultfd: propagate the full address in THP faults

2015-06-15 Thread Andrea Arcangeli
failure because the wrong page was being copied. For various reasons this wasn't easily reproducible in the qemu workload, but the stresstest exposed the problem immediately. Signed-off-by: Andrea Arcangeli aarca...@redhat.com --- mm/huge_memory.c | 10 ++ 1 file changed, 6 insertions(+), 4

Re: [PATCH 1/7] userfaultfd: require UFFDIO_API before other ioctls

2015-06-15 Thread Andrea Arcangeli
On Mon, Jun 15, 2015 at 08:11:50AM -1000, Linus Torvalds wrote: On Jun 15, 2015 7:22 AM, Andrea Arcangeli aarca...@redhat.com wrote: + if (cmd != UFFDIO_API) { + if (ctx->state == UFFD_STATE_WAIT_API) + return -EINVAL; + BUG_ON(ctx
