Hello,
On Wed, Mar 02, 2016 at 12:48:46AM +0000, Al Viro wrote:
> On Tue, Mar 01, 2016 at 12:06:49PM -0800, Linus Torvalds wrote:
>
> > So the only access we really care about is the child tid-pointer
> > clearing one, and that always happens after PF_EXITING has been set
> > afaik.
> >
> > No
Hello Mel,
On Fri, Feb 26, 2016 at 11:13:16AM +0000, Mel Gorman wrote:
> 1. By default, "madvise" and direct reclaim/compaction for applications
>that specifically requested that behaviour. This will avoid breaking
>MADV_HUGEPAGE which you mentioned in a few places
Defragging memory
On Fri, Feb 26, 2016 at 12:02:19AM +0100, Andrea Arcangeli wrote:
> Let's first agree if direct compaction is going to hurt also for the
> MADV_HUGEPAGE case. I say MADV_HUGEPAGE benefits from direct
> compaction and is not hurt by not doing direct compaction. If you
On Thu, Feb 25, 2016 at 07:56:13PM +0000, Mel Gorman wrote:
> Which is a specialised case that does not apply to all users. Remember
> that the data showed that a basic streaming write of an anon mapping on
> a freshly booted NUMA system was enough to stall the process for long
> periods of time.
On Thu, Feb 25, 2016 at 05:12:19PM +0000, Mel Gorman wrote:
> some cases, this will reduce THP usage but the benefit of THP is hard to
> measure and not a universal win where as a stall to reclaim/compaction is
It depends on the workload: with virtual machines THP is essential
from the start
hour.
Does this help for the mm bug?
From 0cc410ae59800444ca929e3dc48e4f1580a95be6 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli
Date: Thu, 28 Jan 2016 16:34:44 +0100
Subject: [PATCH 1/1] mm: validate_mm browse_rb SMP race condition
The mmap_sem for reading in validate_mm called fr
hour.
Does this help for the mm bug?
From 0cc410ae59800444ca929e3dc48e4f1580a95be6 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli <aarca...@redhat.com>
Date: Thu, 28 Jan 2016 16:34:44 +0100
Subject: [PATCH 1/1] mm: validate_mm browse_rb SMP race condition
The mmap_sem for reading
Hello,
On Mon, Jan 25, 2016 at 11:04:18AM +0800, kernel test robot wrote:
> FYI, we noticed the below changes on
>
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
> commit 40e318e509d2c9f3cdb00ef32d2c14b9868af16b ("ksm: introduce
> ksm_max_page_sharing per page
On Thu, Jan 21, 2016 at 03:09:22PM +0300, Kirill A. Shutemov wrote:
> @@ -3511,7 +3506,7 @@ static unsigned long deferred_split_scan(struct
> shrinker *shrink,
> list_splice_tail(, >split_queue);
> spin_unlock_irqrestore(>split_queue_lock, flags);
>
> - return split *
ady in -mm!
Reviewed-by: Andrea Arcangeli
Great thanks,
Andrea
>
> Kirill A. Shutemov (3):
> thp: make split_queue per-node
> thp: change deferred_split_count() to return number of THP in queue
> thp: limit number of object to scan on deferred_split_scan()
>
> include/lin
ady in -mm!
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
Great thanks,
Andrea
>
> Kirill A. Shutemov (3):
> thp: make split_queue per-node
> thp: change deferred_split_count() to return number of THP in queue
> thp: limit number of object to scan on deferred_spli
Hello Kirill,
On Tue, Oct 06, 2015 at 06:24:01PM +0300, Kirill A. Shutemov wrote:
> +static unsigned long deferred_split_scan(struct shrinker *shrink,
> + struct shrink_control *sc)
> +{
> + unsigned long flags;
> + LIST_HEAD(list), *pos, *next;
> + struct page *page;
> +
Thanks,
> Dominik
>
> v2 -> v3:
> - In case of retrying check vma again
> - Do the accounting of major/minor faults once
Reviewed-by: Andrea Arcangeli
>
> v1 -> v2:
> - Instead of passing the VM_FAULT_RETRY from fixup_user_fault we do retries
> within f
Thanks,
> Dominik
>
> v2 -> v3:
> - In case of retrying check vma again
> - Do the accounting of major/minor faults once
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
>
> v1 -> v2:
> - Instead of passing the VM_FAULT_RETRY from fixup_user_fault we do retrie
The dmi_ver wasn't updated correctly before the dmi_decode method run
to save the uuid.
That resulted in "dmidecode -s system-uuid" and
/sys/class/dmi/id/product_uuid disagreeing. The latter was buggy and
this fixes it.
Reported-by: Federico Simoncelli
Signed-off-by: Andrea
x20X instead of 0x20X00 as intended
after commit 95be58df74a5b21e5a78e45fddb2fd59112524c5.
Andrea Arcangeli (1):
dmi_scan: uuid: fix endianess for smbios >= 0x206
drivers/firmware/dmi_scan.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--
To unsubscribe from this list: send the line
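The "0x20X instead of 0x20X00" mismatch described above can be illustrated with a small userspace sketch. This is not the dmi_scan.c code: the helper names and the exact packing layout are assumptions made for illustration. It only shows why a two-byte version value never reaches a threshold written in the three-byte form until it is shifted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers; the packing layout is an assumption for illustration. */

/* Two-byte form: major.minor, e.g. 2.6 -> 0x0206. */
static uint32_t ver_two_byte(uint8_t major, uint8_t minor)
{
    return ((uint32_t)major << 8) | minor;
}

/* Three-byte form: major.minor.rev, e.g. 2.6.0 -> 0x020600. */
static uint32_t ver_three_byte(uint8_t major, uint8_t minor, uint8_t rev)
{
    return ((uint32_t)major << 16) | ((uint32_t)minor << 8) | rev;
}

/* Threshold check written against the three-byte form. */
static bool is_at_least_2_6(uint32_t ver)
{
    return ver >= 0x020600;
}
```

With these helpers, `is_at_least_2_6(ver_two_byte(2, 6))` is false, while `is_at_least_2_6(ver_two_byte(2, 6) << 8)` is true, which is the kind of disagreement the commit message attributes to the version not being updated before the uuid was decoded.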
The dmi_ver wasn't updated correctly before the dmi_decode method run
to save the uuid.
That resulted in "dmidecode -s system-uuid" and
/sys/class/dmi/id/product_uuid disagreeing. The latter was buggy and
this fixes it.
Reported-by: Federico Simoncelli <fsimo...@redhat.com>
Signe
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
>
> > On Dec 15, 2015, at 05:11, Kirill A. Shutemov wrote:
> >
> > On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
> >> On 12/14, Kirill A. Shutemov wrote:
> >>>
> >>> On Mon, Dec 14, 2015 at 07:02:25PM +0800, yalin
On Tue, Dec 15, 2015 at 02:41:21PM +0800, yalin wang wrote:
>
> > On Dec 15, 2015, at 05:11, Kirill A. Shutemov wrote:
> >
> > On Mon, Dec 14, 2015 at 06:55:09PM +0100, Oleg Nesterov wrote:
> >> On 12/14, Kirill A. Shutemov wrote:
> >>>
> >>> On Mon, Dec 14, 2015 at
On Thu, Nov 26, 2015 at 06:27:01PM +0100, Dominik Dingel wrote:
> @@ -599,6 +603,10 @@ int fixup_user_fault(struct task_struct *tsk, struct
> mm_struct *mm,
> if (!(vm_flags & vma->vm_flags))
> return -EFAULT;
>
> + if (unlocked)
> + fault_flags |=
Hello Michael,
On Fri, Dec 04, 2015 at 04:50:03PM +0100, Michael Kerrisk (man-pages) wrote:
> Hi Andrea,
>
> On 09/11/2015 10:47 AM, Michael Kerrisk (man-pages) wrote:
> > On 05/14/2015 07:30 PM, Andrea Arcangeli wrote:
> >> Add documentation.
> >
> > Hi And
handler for the tracked pages. The
> > performance result of kernel building is as followings:
> >
> >           before            after
> > real      461.63    real    455.48
> > user     4529.55    user   4557.88
> > sys      1995.39    sys    1922.57
>
> For KVM-GT, as far a
On Wed, Nov 18, 2015 at 01:29:38PM +0100, Jan Kara wrote:
> On Mon 16-11-15 19:35:14, Dave Hansen wrote:
> >
> > From: Dave Hansen
> >
> > get_user_pages_locked() appears to be for use when a caller needs
> > to know that its lock on mmap_sem was invalidated by the gup
> > call.
> >
> > But,
On Wed, Nov 18, 2015 at 01:29:38PM +0100, Jan Kara wrote:
> On Mon 16-11-15 19:35:14, Dave Hansen wrote:
> >
> > From: Dave Hansen
> >
> > get_user_pages_locked() appears to be for use when a caller needs
> > to know that its lock on mmap_sem was invalidated by the
On Thu, Oct 22, 2015 at 05:15:09PM +0200, Peter Zijlstra wrote:
> Indefinitely is such a long time, we should try and finish
> computation before the computer dies etc. :-)
Indefinitely as read_seqcount_retry, eventually it makes progress.
Even returning 0 from the page fault can trigger it
On Thu, Oct 22, 2015 at 03:38:24PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 22, 2015 at 03:20:15PM +0200, Andrea Arcangeli wrote:
>
> > If schedule spontaneously wakes up a task in TASK_KILLABLE state that
> > would be a bug in the scheduler in my view. Luckily
On Thu, Oct 22, 2015 at 02:10:56PM +0200, Peter Zijlstra wrote:
> On Thu, May 14, 2015 at 07:31:11PM +0200, Andrea Arcangeli wrote:
> > @@ -255,21 +259,23 @@ int handle_userfault(struct vm_area_struct *vma,
> > unsigned long address,
> >
Hello Patrick,
On Mon, Oct 12, 2015 at 11:04:11AM -0400, Patrick Donnelly wrote:
> Hello Andrea,
>
> On Mon, Jun 15, 2015 at 1:22 PM, Andrea Arcangeli wrote:
> > This is an incremental update to the userfaultfd code in -mm.
>
> Sorry I'm late to this party. I'm curious
Hello Patrick,
On Mon, Oct 12, 2015 at 11:04:11AM -0400, Patrick Donnelly wrote:
> Hello Andrea,
>
> On Mon, Jun 15, 2015 at 1:22 PM, Andrea Arcangeli <aarca...@redhat.com> wrote:
> > This is an incremental update to the userfaultfd code in -mm.
>
> Sorry I'm late to
separate
patch.
Reviewed-by: Andrea Arcangeli
Thanks,
Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
separate
patch.
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
Thanks,
Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Ple
Hello Chen,
On Sun, Oct 04, 2015 at 12:55:29PM +0800, Chen Gang wrote:
> Theoretically, the lock and unlock need to be symmetric, if we have to
> lock f_mapping all firstly, then lock all anon_vma, probably, we also
> need to unlock anon_vma all, then unlock all f_mapping.
They don't need to be
During boot I get a div by zero Oops regression starting in v4.3-rc3.
Reviewed-by: Javi Merino
Signed-off-by: Andrea Arcangeli
---
drivers/thermal/power_allocator.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/drivers/thermal/power_allocator.c
b/drivers/thermal
Hello,
This is needed for my workstations or they Oops at boot since
v4.3-rc3.
Andrea Arcangeli (1):
thermal: avoid division by zero in power allocator
drivers/thermal/power_allocator.c | 10 ++
1 file changed, 10 insertions(+)
Thanks,
Andrea
--
To unsubscribe from this list: send
During boot I get a div by zero Oops regression starting in v4.3-rc3.
Reviewed-by: Javi Merino <javi.mer...@arm.com>
Signed-off-by: Andrea Arcangeli <aarca...@redhat.com>
---
drivers/thermal/power_allocator.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/dri
On Thu, Sep 24, 2015 at 05:05:48PM +0200, Vlastimil Babka wrote:
> The problem is an endless loop in get_futex_key() when
> CONFIG_TRANSPARENT_HUGEPAGE is enabled and the s390x machine has emulated
> hugepages. The code tries to serialize against __split_huge_page_splitting(),
> but
On Tue, Sep 22, 2015 at 07:49:13AM -0600, Shuah Khan wrote:
> On 09/22/2015 04:45 AM, Andre Przywara wrote:
> > At the moment the userfaultfd test program only supports x86 and an
> > architecture called "powewrpc" ;-)
> > Fix that typo and add the syscall numbers for other architectures as
> >
Hello Eric,
On Sun, Sep 13, 2015 at 06:57:27PM -0500, Eric Biggers wrote:
> Signed-off-by: Eric Biggers
> ---
> fs/userfaultfd.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 634e676..f9aeb40 100644
> ---
Hello Eric,
On Sun, Sep 13, 2015 at 06:57:27PM -0500, Eric Biggers wrote:
> Signed-off-by: Eric Biggers
> ---
> fs/userfaultfd.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 634e676..f9aeb40 100644
On Wed, Sep 09, 2015 at 06:43:11PM +1000, Michael Ellerman wrote:
> On Tue, 2015-09-08 at 16:34 +0200, Andrea Arcangeli wrote:
> >
> > I already had a few minor changes queued to be submitted for arm and
> > ppc and a few updates to the selftest.
> >
> > I did
On Tue, Sep 08, 2015 at 08:25:45PM +0800, Bamvor Zhang Jian wrote:
> Hi, Michael
>
> On 09/08/2015 05:54 PM, Michael Ellerman wrote:
> > On Tue, 2015-09-08 at 17:15 +0800, Bamvor Zhang Jian wrote:
> >> Hi, Michael
> >>
> >> I thought I reply to you, but ...
> >>
> >> On 08/31/2015 11:26 AM,
or and compound_order into one word in struct page
> mm: make compound_head() robust
> mm: use 'unsigned int' for page order
> mm: use 'unsigned int' for compound_dtor/compound_order on 64BIT
Reviewed-by: Andrea Arcangeli
The only other alternative solution that doesn't require f
or and compound_order into one word in struct page
> mm: make compound_head() robust
> mm: use 'unsigned int' for page order
> mm: use 'unsigned int' for compound_dtor/compound_order on 64BIT
Reviewed-by: Andrea Arcangeli <aarca...@redhat.com>
The only other alternative sol
e
>
>
> -#define __NR_syscalls 364
> +#define __NR_syscalls 365
>
> #define __NR__exit __NR_exit
> #define NR_syscalls __NR_syscalls
Reviewed-by: Andrea Arcangeli
--
To unsubscribe from this list: send the line "unsubscribe linux-ker
364
+#define __NR_syscalls 365
#define __NR__exit __NR_exit
#define NR_syscalls __NR_syscalls
Reviewed-by: Andrea Arcangeli aarca...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord
On Wed, Jul 29, 2015 at 08:46:10PM +0200, Thomas Gleixner wrote:
> On Wed, 29 Jul 2015, Andy Lutomirski wrote:
> > -tip people: want to assign Andrea a pair of syscall numbers?
>
> Sure, just send a patch
Awesome, I just sent the patch to register the syscall against -tip
with the usual
This registers the official numbers of the userfaultfd syscall for x86
32bit and x86-64 64bit. This registration allows to ship kernels in
production using these two syscall numbers for userfaultfd.
Acked-by: Pavel Emelyanov
Signed-off-by: Andrea Arcangeli
---
arch/x86/entry/syscalls
On Wed, Jul 29, 2015 at 08:46:10PM +0200, Thomas Gleixner wrote:
On Wed, 29 Jul 2015, Andy Lutomirski wrote:
-tip people: want to assign Andrea a pair of syscall numbers?
Sure, just send a patch
Awesome, I just sent the patch to register the syscall against -tip
with the usual
This registers the official numbers of the userfaultfd syscall for x86
32bit and x86-64 64bit. This registration allows to ship kernels in
production using these two syscall numbers for userfaultfd.
Acked-by: Pavel Emelyanov xe...@parallels.com
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
ssible never mind.
Thanks,
Andrea
===
From 873093c32b4b1d0b6c3f18ec1e52b56c24f67457 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli
Date: Wed, 29 Jul 2015 18:53:17 +0200
Subject: [PATCH] userfaultfd: selftest: update userfaultfd x86 32bit syscall
number
It changed as result of linux-n
never mind.
Thanks,
Andrea
===
From 873093c32b4b1d0b6c3f18ec1e52b56c24f67457 Mon Sep 17 00:00:00 2001
From: Andrea Arcangeli aarca...@redhat.com
Date: Wed, 29 Jul 2015 18:53:17 +0200
Subject: [PATCH] userfaultfd: selftest: update userfaultfd x86 32bit syscall
number
It changed as result of linux
On Thu, Jul 23, 2015 at 09:55:33AM -0700, Dave Hansen wrote:
> On 07/23/2015 09:16 AM, Catalin Marinas wrote:
> > Anyway, if you want to keep the option of a full TLB flush for x86 on
> > huge pages, I'm happy to repost a v2 with a separate
> > flush_tlb_pmd_huge_page that arch code can define as
On Thu, Jul 23, 2015 at 07:41:24AM -0700, Dave Hansen wrote:
> We had a discussion about this a few weeks ago:
>
> https://lkml.org/lkml/2015/6/25/666
>
> The argument is that the CPU is so good at refilling the TLB that it
> rarely waits on it, so the "cost" can be very very low.
That
On Thu, Jul 23, 2015 at 11:49:38AM +0100, Catalin Marinas wrote:
> On Thu, Jul 23, 2015 at 12:05:21AM +0100, Dave Hansen wrote:
> > On 07/22/2015 03:48 PM, Catalin Marinas wrote:
> > > You are right, on x86 the tlb_single_page_flush_ceiling seems to be
> > > 33, so for an HPAGE_SIZE range the code
On Thu, Jul 23, 2015 at 11:49:38AM +0100, Catalin Marinas wrote:
On Thu, Jul 23, 2015 at 12:05:21AM +0100, Dave Hansen wrote:
On 07/22/2015 03:48 PM, Catalin Marinas wrote:
You are right, on x86 the tlb_single_page_flush_ceiling seems to be
33, so for an HPAGE_SIZE range the code does a
On Thu, Jul 23, 2015 at 07:41:24AM -0700, Dave Hansen wrote:
We had a discussion about this a few weeks ago:
https://lkml.org/lkml/2015/6/25/666
The argument is that the CPU is so good at refilling the TLB that it
rarely waits on it, so the cost can be very very low.
That was about
On Thu, Jul 23, 2015 at 09:55:33AM -0700, Dave Hansen wrote:
On 07/23/2015 09:16 AM, Catalin Marinas wrote:
Anyway, if you want to keep the option of a full TLB flush for x86 on
huge pages, I'm happy to repost a v2 with a separate
flush_tlb_pmd_huge_page that arch code can define as it sees
Hi Dave,
On Tue, Jun 23, 2015 at 12:00:19PM -0700, Dave Hansen wrote:
> Down in userfaultfd_wake_function(), it looks like you intended for a
> len=0 to mean "wake all". But the validate_range() that we do from
> userspace has a !len check in it, which keeps us from passing a len=0 in
> from
Hi Dave,
On Tue, Jun 23, 2015 at 12:00:19PM -0700, Dave Hansen wrote:
Down in userfaultfd_wake_function(), it looks like you intended for a
len=0 to mean wake all. But the validate_range() that we do from
userspace has a !len check in it, which keeps us from passing a len=0 in
from
On Mon, Jun 15, 2015 at 08:41:24PM -1000, Linus Torvalds wrote:
> On Mon, Jun 15, 2015 at 12:19 PM, Andrea Arcangeli
> wrote:
> >
> > Yes, it would leave the other blocked, how is it different from having
> > just 1 reader and it gets killed?
>
> Either is complet
On Mon, Jun 15, 2015 at 08:41:24PM -1000, Linus Torvalds wrote:
On Mon, Jun 15, 2015 at 12:19 PM, Andrea Arcangeli aarca...@redhat.com
wrote:
Yes, it would leave the other blocked, how is it different from having
just 1 reader and it gets killed?
Either is completely wrong
On Mon, Jun 15, 2015 at 08:19:07AM -1000, Linus Torvalds wrote:
> What if the process doing the polling never does anything with the end
> result? Maybe it meant to, but it got killed before it could? Are you going
> to leave everybody else blocked, even though there are pending events?
Yes, it
On Mon, Jun 15, 2015 at 08:11:50AM -1000, Linus Torvalds wrote:
> On Jun 15, 2015 7:22 AM, "Andrea Arcangeli" wrote:
> >
> > + if (cmd != UFFDIO_API) {
> > + if (ctx->state == UFFD_STATE_WAIT_API)
> > + return
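The check quoted above rejects every ioctl except UFFDIO_API while the context is still waiting for the API handshake. A minimal userspace model of that gate might look like the following. The command enum, the second state name, and the rule that a repeated handshake is rejected are assumptions for illustration, not the code in fs/userfaultfd.c.

```c
#include <errno.h>

/* Hypothetical model of the handshake gate; not the kernel's types. */
enum uffd_state_model { MODEL_WAIT_API, MODEL_RUNNING };
enum uffd_cmd_model { MODEL_CMD_API, MODEL_CMD_OTHER };

struct uffd_ctx_model {
    enum uffd_state_model state;
};

static int uffd_ioctl_model(struct uffd_ctx_model *ctx, enum uffd_cmd_model cmd)
{
    if (cmd != MODEL_CMD_API) {
        /* Any other command is refused until the handshake is done. */
        if (ctx->state == MODEL_WAIT_API)
            return -EINVAL;
        return 0; /* the real command would be dispatched here */
    }

    /* Assumption: the handshake is only accepted once. */
    if (ctx->state != MODEL_WAIT_API)
        return -EINVAL;
    ctx->state = MODEL_RUNNING;
    return 0;
}
```

Starting from `MODEL_WAIT_API`, a call with `MODEL_CMD_OTHER` returns `-EINVAL`, a call with `MODEL_CMD_API` succeeds and moves the context to `MODEL_RUNNING`, and later calls with `MODEL_CMD_OTHER` succeed.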
# 256.660 M/sec
( +- 0.71% ) [83.69%]
59,203,898 branch-misses # 0.51% of all branches
( +- 2.03% ) [83.54%]
2.600912438 seconds time elapsed
( +- 0.02% )
Signed-off-by: Andrea Arcangeli
failure because the wrong page was
being copied.
For various reasons this wasn't easily reproducible in the qemu
workload, but the strestest exposed the problem immediately.
Signed-off-by: Andrea Arcangeli
---
mm/huge_memory.c | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff
all, has wait->flags
WQ_FLAG_EXCLUSIVE set.
Signed-off-by: Andrea Arcangeli
---
fs/userfaultfd.c | 8
include/linux/wait.h | 5 ++---
kernel/sched/wait.c | 7 +++
net/sunrpc/sched.c | 2 +-
4 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/fs/userfaultfd.c b
by userfaultfd. The fix for those two bugs was also
straightforward and required no design change of any sort.
Signed-off-by: Andrea Arcangeli
---
tools/testing/selftests/vm/Makefile | 4 +-
tools/testing/selftests/vm/userfaultfd.c | 669 +++
2 files changed
During the refile in userfaultfd_read both waitqueues could look empty
to the lockless wake_userfault(). Use a seqcount to prevent this false
negative that could leave an userfault blocked.
Signed-off-by: Andrea Arcangeli
---
fs/userfaultfd.c | 26 --
1 file changed, 24
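The false negative described in the commit message above (an entry that is momentarily on neither queue while it is being refiled) can be sketched in userspace with C11 atomics. This is an illustration of the general sequence-counter pattern, not the kernel patch: the struct, the field names, and the use of plain counters in place of waitqueues are all assumptions.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; names are illustrative, not the kernel's. */
struct refile_model {
    atomic_uint seq;       /* odd while a refile is in progress */
    atomic_int pending;    /* entries on the first queue */
    atomic_int in_flight;  /* entries on the second queue */
};

/*
 * Writer: move one entry from pending to in_flight.
 * Assumption: writers are serialized by some outer lock.
 */
static void refile_one(struct refile_model *m)
{
    atomic_fetch_add(&m->seq, 1);        /* seq becomes odd */
    atomic_fetch_sub(&m->pending, 1);
    /* A lockless reader running here sees the entry on neither queue. */
    atomic_fetch_add(&m->in_flight, 1);
    atomic_fetch_add(&m->seq, 1);        /* seq becomes even again */
}

/* Lockless reader: true if either queue holds an entry. */
static bool any_waiter(struct refile_model *m)
{
    unsigned start;
    bool found;

    do {
        /* Wait until no refile is in progress. */
        while ((start = atomic_load(&m->seq)) & 1)
            ;
        found = atomic_load(&m->pending) > 0 ||
                atomic_load(&m->in_flight) > 0;
        /* Retry if a refile started or finished while we were looking. */
    } while (atomic_load(&m->seq) != start);

    return found;
}
```

The reader never trusts a snapshot taken across a refile: if the counter changed, the "both queues look empty" answer is thrown away and the check is repeated.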
(all but UFFDIO_API/struct
uffdio_api) with a bump of uffdio_api.api.
There's no actual plan or need to change the API or the ioctl, the
current API already should cover fine even the non cooperative usage,
but this is just for the longer term future just in case.
Signed-off-by: Andrea Arcangeli
nal processed, coredumps always
worked perfectly with userfaults, no matter if the userfault is triggered by
GUP a kernel copy_user or directly from userland.
Signed-off-by: Andrea Arcangeli
---
fs/userfaultfd.c | 35 ---
1 file changed, 32 insertions(+), 3 deleti
ruled out either yet.
Andrea Arcangeli (7):
userfaultfd: require UFFDIO_API before other ioctls
userfaultfd: propagate the full address in THP faults
userfaultfd: allow signals to interrupt a userfault
userfaultfd: avoid missing wakeups during refile in userfaultfd_read
userfaultfd:
.
Andrea Arcangeli (7):
userfaultfd: require UFFDIO_API before other ioctls
userfaultfd: propagate the full address in THP faults
userfaultfd: allow signals to interrupt a userfault
userfaultfd: avoid missing wakeups during refile in userfaultfd_read
userfaultfd: switch to exclusive wakeup
signal processed, coredumps always
worked perfectly with userfaults, no matter if the userfault is triggered by
GUP a kernel copy_user or directly from userland.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
fs/userfaultfd.c | 35 ---
1 file changed, 32
During the refile in userfaultfd_read both waitqueues could look empty
to the lockless wake_userfault(). Use a seqcount to prevent this false
negative that could leave an userfault blocked.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
fs/userfaultfd.c | 26
by userfaultfd. The fix for those two bugs was also
straightforward and required no design change of any sort.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
tools/testing/selftests/vm/Makefile | 4 +-
tools/testing/selftests/vm/userfaultfd.c | 669
WQ_FLAG_EXCLUSIVE set.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
fs/userfaultfd.c | 8
include/linux/wait.h | 5 ++---
kernel/sched/wait.c | 7 +++
net/sunrpc/sched.c | 2 +-
4 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/fs/userfaultfd.c b/fs
failure because the wrong page was
being copied.
For various reasons this wasn't easily reproducible in the qemu
workload, but the stress test exposed the problem immediately.
Signed-off-by: Andrea Arcangeli aarca...@redhat.com
---
mm/huge_memory.c | 10 ++
1 file changed, 6 insertions(+), 4
On Mon, Jun 15, 2015 at 08:11:50AM -1000, Linus Torvalds wrote:
On Jun 15, 2015 7:22 AM, Andrea Arcangeli aarca...@redhat.com wrote:
+ if (cmd != UFFDIO_API) {
+ if (ctx->state == UFFD_STATE_WAIT_API)
+ return -EINVAL;
+ BUG_ON(ctx