Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Ira Weiny
On Mon, Mar 25, 2019 at 03:36:28PM -0700, Dan Williams wrote:
> On Mon, Mar 25, 2019 at 3:22 PM Ira Weiny  wrote:
> [..]
> > FWIW this thread is making me think my original patch which simply implemented
> > get_user_pages_fast_longterm() would be more clear.  There is some evidence
> > that the GUP API was trending that way (see get_user_pages_remote).  That seems
> > wrong but I don't know how to ensure users don't specify the wrong flag.
> 
> What about just making the existing get_user_pages_longterm() have a
> fast path option?

That would work but was not the direction we agreed upon before.[1]

At this point I would rather see this patch set applied, focus on fixing the
filesystem issues, and once that is done determine if FOLL_LONGTERM is needed
in any GUP calls.

Ira

[1] https://lkml.org/lkml/2019/2/11/2038



Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Dan Williams
On Mon, Mar 25, 2019 at 3:22 PM Ira Weiny  wrote:
[..]
> FWIW this thread is making me think my original patch which simply implemented
> get_user_pages_fast_longterm() would be more clear.  There is some evidence
> that the GUP API was trending that way (see get_user_pages_remote).  That seems
> wrong but I don't know how to ensure users don't specify the wrong flag.

What about just making the existing get_user_pages_longterm() have a
fast path option?


Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Ira Weiny
On Mon, Mar 25, 2019 at 02:51:50PM -0300, Jason Gunthorpe wrote:
> On Mon, Mar 25, 2019 at 02:23:15AM -0700, Ira Weiny wrote:
> > > > Unfortunately holding the lock is required to support FOLL_LONGTERM (to check
> > > > the VMAs) but we don't want to hold the lock to be optimal (specifically allow
> > > > FAULT_FOLL_ALLOW_RETRY).  So I'm maintaining the optimization for *_fast users
> > > > who do not specify FOLL_LONGTERM.
> > > > 
> > > > Another way to do this would have been to define __gup_longterm_unlocked with
> > > > the above logic, but that seemed overkill at this point.
> > > 
> > > get_user_pages_unlocked() is an exported symbol, shouldn't it work
> > > with the FOLL_LONGTERM flag?
> > > 
> > > I think it should even though we have no user..
> > > 
> > > Otherwise the GUP API just gets more confusing.
> > 
> > I agree WRT to the API.  But I think callers of get_user_pages_unlocked() are
> > not going to get the behavior they want if they specify FOLL_LONGTERM.
> 
> Oh? Isn't the only thing FOLL_LONGTERM does is block the call on DAX?

From an API perspective, yes.

> Why does the locking mode matter to this test?

The DAX check tests whether each VMA is Filesystem DAX.  Therefore, it requires
collecting the VMAs as the GUP code executes.  The unlocked version can drop the
lock, and therefore the VMAs may become invalid.  Therefore, the two code paths
are incompatible.

Users of GUP unlocked are going to want the benefit of FAULT_FOLL_ALLOW_RETRY.
So I don't anticipate anyone using FOLL_LONGTERM with
get_user_pages_unlocked().

FWIW this thread is making me think my original patch which simply implemented
get_user_pages_fast_longterm() would be more clear.  There is some evidence
that the GUP API was trending that way (see get_user_pages_remote).  That seems
wrong but I don't know how to ensure users don't specify the wrong flag.

> 
> > What I could do is BUG_ON (or just WARN_ON) if unlocked is called with
> > FOLL_LONGTERM similar to the code in get_user_pages_locked() which does not
> > allow locked and vmas to be passed together:
> 
> The GUP call should fail if you are doing something like this. But I'd
> rather not see confusing special cases in code without a clear
> comment explaining why it has to be there.

Code comment would be necessary, sure.  Was just throwing ideas out there.
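
As a rough illustration of the "fail the call, with a comment" idea discussed
above, here is a minimal userspace sketch. It is not kernel code: the
model_gup_unlocked() function and the MODEL_* flag values are hypothetical
stand-ins invented for this example, and returning -EINVAL is only one possible
choice of error.

```c
#include <errno.h>

/* Hypothetical flag values for this model; not the kernel's definitions. */
#define MODEL_FOLL_WRITE    0x1u
#define MODEL_FOLL_LONGTERM 0x2u

/*
 * Stand-in for get_user_pages_unlocked().
 *
 * A longterm pin has to inspect the VMAs while the lock is held, but
 * this path is allowed to drop the lock and retry the fault, so any
 * VMAs collected along the way could be stale.  Rather than silently
 * skipping the check, reject the flag combination outright.
 */
static long model_gup_unlocked(unsigned long nr_pages, unsigned int gup_flags)
{
	if (gup_flags & MODEL_FOLL_LONGTERM)
		return -EINVAL;

	/* Pretend every requested page was pinned. */
	return (long)nr_pages;
}
```

A caller that passes MODEL_FOLL_LONGTERM gets an error back instead of pages
whose VMAs were never checked, which is the behavior asked for in the review.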

Ira



Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Jason Gunthorpe
On Mon, Mar 25, 2019 at 02:23:15AM -0700, Ira Weiny wrote:
> > > Unfortunately holding the lock is required to support FOLL_LONGTERM (to check
> > > the VMAs) but we don't want to hold the lock to be optimal (specifically allow
> > > FAULT_FOLL_ALLOW_RETRY).  So I'm maintaining the optimization for *_fast users
> > > who do not specify FOLL_LONGTERM.
> > > 
> > > Another way to do this would have been to define __gup_longterm_unlocked with
> > > the above logic, but that seemed overkill at this point.
> > 
> > get_user_pages_unlocked() is an exported symbol, shouldn't it work
> > with the FOLL_LONGTERM flag?
> > 
> > I think it should even though we have no user..
> > 
> > Otherwise the GUP API just gets more confusing.
> 
> I agree WRT to the API.  But I think callers of get_user_pages_unlocked() are
> not going to get the behavior they want if they specify FOLL_LONGTERM.

Oh? Isn't the only thing FOLL_LONGTERM does is block the call on DAX?
Why does the locking mode matter to this test?

> What I could do is BUG_ON (or just WARN_ON) if unlocked is called with
> FOLL_LONGTERM similar to the code in get_user_pages_locked() which does not
> allow locked and vmas to be passed together:

The GUP call should fail if you are doing something like this. But I'd
rather not see confusing special cases in code without a clear
comment explaining why it has to be there.

Jason


Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Ira Weiny
On Mon, Mar 25, 2019 at 01:47:13PM -0300, Jason Gunthorpe wrote:
> On Mon, Mar 25, 2019 at 01:42:26AM -0700, Ira Weiny wrote:
> > On Fri, Mar 22, 2019 at 03:12:55PM -0700, Dan Williams wrote:
> > > On Sun, Mar 17, 2019 at 7:36 PM  wrote:
> > > >
> > > > From: Ira Weiny 
> > > >
> > > > DAX pages were previously unprotected from longterm pins when users
> > > > called get_user_pages_fast().
> > > >
> > > > Use the new FOLL_LONGTERM flag to check for DEVMAP pages and fall
> > > > back to regular GUP processing if a DEVMAP page is encountered.
> > > >
> > > > Signed-off-by: Ira Weiny 
> > > >  mm/gup.c | 29 +
> > > >  1 file changed, 25 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/mm/gup.c b/mm/gup.c
> > > > index 0684a9536207..173db0c44678 100644
> > > > +++ b/mm/gup.c
> > > > @@ -1600,6 +1600,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
> > > > goto pte_unmap;
> > > >
> > > > if (pte_devmap(pte)) {
> > > > +   if (unlikely(flags & FOLL_LONGTERM))
> > > > +   goto pte_unmap;
> > > > +
> > > > pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
> > > > if (unlikely(!pgmap)) {
> > > > undo_dev_pagemap(nr, nr_start, pages);
> > > > @@ -1739,8 +1742,11 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> > > > if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
> > > > return 0;
> > > >
> > > > -   if (pmd_devmap(orig))
> > > > +   if (pmd_devmap(orig)) {
> > > > +   if (unlikely(flags & FOLL_LONGTERM))
> > > > +   return 0;
> > > > return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
> > > > +   }
> > > >
> > > > refs = 0;
> > > > page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
> > > > @@ -1777,8 +1783,11 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
> > > > if (!pud_access_permitted(orig, flags & FOLL_WRITE))
> > > > return 0;
> > > >
> > > > -   if (pud_devmap(orig))
> > > > +   if (pud_devmap(orig)) {
> > > > +   if (unlikely(flags & FOLL_LONGTERM))
> > > > +   return 0;
> > > > return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
> > > > +   }
> > > >
> > > > refs = 0;
> > > > page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
> > > > @@ -2066,8 +2075,20 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
> > > > start += nr << PAGE_SHIFT;
> > > > pages += nr;
> > > >
> > > > -   ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
> > > > - gup_flags);
> > > > +   if (gup_flags & FOLL_LONGTERM) {
> > > > +   down_read(&current->mm->mmap_sem);
> > > > +   ret = __gup_longterm_locked(current, current->mm,
> > > > +   start, nr_pages - nr,
> > > > +   pages, NULL, gup_flags);
> > > > +   up_read(&current->mm->mmap_sem);
> > > > +   } else {
> > > > +   /*
> > > > +* retain FAULT_FOLL_ALLOW_RETRY optimization if
> > > > +* possible
> > > > +*/
> > > > +   ret = get_user_pages_unlocked(start, nr_pages - nr,
> > > > + pages, gup_flags);
> > > 
> > > I couldn't immediately grok why this path needs to branch on
> > > FOLL_LONGTERM? Won't get_user_pages_unlocked(..., FOLL_LONGTERM) do
> > > the right thing?
> > 
> > Unfortunately holding the lock is required to support FOLL_LONGTERM (to check
> > the VMAs) but we don't want to hold the lock to be optimal (specifically allow
> > FAULT_FOLL_ALLOW_RETRY).  So I'm maintaining the optimization for *_fast users
> > who do not specify FOLL_LONGTERM.
> > 
> > Another way to do this would have been to define __gup_longterm_unlocked with
> > the above logic, but that seemed overkill at this point.
> 
> get_user_pages_unlocked() is an exported symbol, shouldn't it work
> with the FOLL_LONGTERM flag?
> 
> I think it should even though we have no user..
> 
> Otherwise the GUP API just gets more confusing.

I agree WRT to the API.  But I think callers of get_user_pages_unlocked() are
not going to get the behavior they want if they specify FOLL_LONGTERM.

What I could do is BUG_ON (or just WARN_ON) if unlocked is called with
FOLL_LONGTERM similar to the code in get_user_pages_locked() which does not
allow locked and

Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Jason Gunthorpe
On Mon, Mar 25, 2019 at 01:42:26AM -0700, Ira Weiny wrote:
> On Fri, Mar 22, 2019 at 03:12:55PM -0700, Dan Williams wrote:
> > On Sun, Mar 17, 2019 at 7:36 PM  wrote:
> > >
> > > From: Ira Weiny 
> > >
> > > DAX pages were previously unprotected from longterm pins when users
> > > called get_user_pages_fast().
> > >
> > > Use the new FOLL_LONGTERM flag to check for DEVMAP pages and fall
> > > back to regular GUP processing if a DEVMAP page is encountered.
> > >
> > > Signed-off-by: Ira Weiny 
> > >  mm/gup.c | 29 +
> > >  1 file changed, 25 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/mm/gup.c b/mm/gup.c
> > > index 0684a9536207..173db0c44678 100644
> > > +++ b/mm/gup.c
> > > @@ -1600,6 +1600,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
> > > goto pte_unmap;
> > >
> > > if (pte_devmap(pte)) {
> > > +   if (unlikely(flags & FOLL_LONGTERM))
> > > +   goto pte_unmap;
> > > +
> > > pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
> > > if (unlikely(!pgmap)) {
> > > undo_dev_pagemap(nr, nr_start, pages);
> > > @@ -1739,8 +1742,11 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> > > if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
> > > return 0;
> > >
> > > -   if (pmd_devmap(orig))
> > > +   if (pmd_devmap(orig)) {
> > > +   if (unlikely(flags & FOLL_LONGTERM))
> > > +   return 0;
> > > return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
> > > +   }
> > >
> > > refs = 0;
> > > page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
> > > @@ -1777,8 +1783,11 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
> > > if (!pud_access_permitted(orig, flags & FOLL_WRITE))
> > > return 0;
> > >
> > > -   if (pud_devmap(orig))
> > > +   if (pud_devmap(orig)) {
> > > +   if (unlikely(flags & FOLL_LONGTERM))
> > > +   return 0;
> > > return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
> > > +   }
> > >
> > > refs = 0;
> > > page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
> > > @@ -2066,8 +2075,20 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
> > > start += nr << PAGE_SHIFT;
> > > pages += nr;
> > >
> > > -   ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
> > > - gup_flags);
> > > +   if (gup_flags & FOLL_LONGTERM) {
> > > +   down_read(&current->mm->mmap_sem);
> > > +   ret = __gup_longterm_locked(current, current->mm,
> > > +   start, nr_pages - nr,
> > > +   pages, NULL, gup_flags);
> > > +   up_read(&current->mm->mmap_sem);
> > > +   } else {
> > > +   /*
> > > +* retain FAULT_FOLL_ALLOW_RETRY optimization if
> > > +* possible
> > > +*/
> > > +   ret = get_user_pages_unlocked(start, nr_pages - nr,
> > > + pages, gup_flags);
> > 
> > I couldn't immediately grok why this path needs to branch on
> > FOLL_LONGTERM? Won't get_user_pages_unlocked(..., FOLL_LONGTERM) do
> > the right thing?
> 
> Unfortunately holding the lock is required to support FOLL_LONGTERM (to check
> the VMAs) but we don't want to hold the lock to be optimal (specifically allow
> FAULT_FOLL_ALLOW_RETRY).  So I'm maintaining the optimization for *_fast users
> who do not specify FOLL_LONGTERM.
> 
> Another way to do this would have been to define __gup_longterm_unlocked with
> the above logic, but that seemed overkill at this point.

get_user_pages_unlocked() is an exported symbol, shouldn't it work
with the FOLL_LONGTERM flag?

I think it should even though we have no user..

Otherwise the GUP API just gets more confusing.

Jason


Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-25 Thread Ira Weiny
On Fri, Mar 22, 2019 at 03:12:55PM -0700, Dan Williams wrote:
> On Sun, Mar 17, 2019 at 7:36 PM  wrote:
> >
> > From: Ira Weiny 
> >
> > DAX pages were previously unprotected from longterm pins when users
> > called get_user_pages_fast().
> >
> > Use the new FOLL_LONGTERM flag to check for DEVMAP pages and fall
> > back to regular GUP processing if a DEVMAP page is encountered.
> >
> > Signed-off-by: Ira Weiny 
> > ---
> >  mm/gup.c | 29 +
> >  1 file changed, 25 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 0684a9536207..173db0c44678 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -1600,6 +1600,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
> > goto pte_unmap;
> >
> > if (pte_devmap(pte)) {
> > +   if (unlikely(flags & FOLL_LONGTERM))
> > +   goto pte_unmap;
> > +
> > pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
> > if (unlikely(!pgmap)) {
> > undo_dev_pagemap(nr, nr_start, pages);
> > @@ -1739,8 +1742,11 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> > if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
> > return 0;
> >
> > -   if (pmd_devmap(orig))
> > +   if (pmd_devmap(orig)) {
> > +   if (unlikely(flags & FOLL_LONGTERM))
> > +   return 0;
> > return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
> > +   }
> >
> > refs = 0;
> > page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
> > @@ -1777,8 +1783,11 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
> > if (!pud_access_permitted(orig, flags & FOLL_WRITE))
> > return 0;
> >
> > -   if (pud_devmap(orig))
> > +   if (pud_devmap(orig)) {
> > +   if (unlikely(flags & FOLL_LONGTERM))
> > +   return 0;
> > return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
> > +   }
> >
> > refs = 0;
> > page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
> > @@ -2066,8 +2075,20 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
> > start += nr << PAGE_SHIFT;
> > pages += nr;
> >
> > -   ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
> > - gup_flags);
> > +   if (gup_flags & FOLL_LONGTERM) {
> > +   down_read(&current->mm->mmap_sem);
> > +   ret = __gup_longterm_locked(current, current->mm,
> > +   start, nr_pages - nr,
> > +   pages, NULL, gup_flags);
> > +   up_read(&current->mm->mmap_sem);
> > +   } else {
> > +   /*
> > +* retain FAULT_FOLL_ALLOW_RETRY optimization if
> > +* possible
> > +*/
> > +   ret = get_user_pages_unlocked(start, nr_pages - nr,
> > + pages, gup_flags);
> 
> I couldn't immediately grok why this path needs to branch on
> FOLL_LONGTERM? Won't get_user_pages_unlocked(..., FOLL_LONGTERM) do
> the right thing?

Unfortunately holding the lock is required to support FOLL_LONGTERM (to check
the VMAs) but we don't want to hold the lock to be optimal (specifically allow
FAULT_FOLL_ALLOW_RETRY).  So I'm maintaining the optimization for *_fast users
who do not specify FOLL_LONGTERM.

Another way to do this would have been to define __gup_longterm_unlocked with
the above logic, but that seemed overkill at this point.
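
To make the branch easier to follow, here is a small userspace model of the
fallback logic described above. It is only a sketch: struct model_mm, the
model_*() helpers and MODEL_FOLL_LONGTERM are hypothetical names made up for
this example, and the boolean "lock" merely records whether the read lock is
held so the two paths can assert their expectations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value for this model; not the kernel's definition. */
#define MODEL_FOLL_LONGTERM 0x1u

/* Toy stand-in for an mm: tracks the read lock and which path ran. */
struct model_mm {
	bool read_locked;
	int longterm_calls;	/* slow path run with the lock held */
	int unlocked_calls;	/* slow path that manages the lock itself */
};

static void model_down_read(struct model_mm *mm)
{
	assert(!mm->read_locked);
	mm->read_locked = true;
}

static void model_up_read(struct model_mm *mm)
{
	assert(mm->read_locked);
	mm->read_locked = false;
}

/* Stand-in for __gup_longterm_locked(): needs the lock held throughout. */
static int model_longterm_locked(struct model_mm *mm, int nr_pages)
{
	assert(mm->read_locked);
	mm->longterm_calls++;
	return nr_pages;
}

/* Stand-in for get_user_pages_unlocked(): must be entered without the lock. */
static int model_unlocked(struct model_mm *mm, int nr_pages)
{
	assert(!mm->read_locked);
	mm->unlocked_calls++;
	return nr_pages;
}

/* Slow-path fallback for the pages the lockless walk could not pin. */
static int model_fast_fallback(struct model_mm *mm, int nr_pages,
			       unsigned int gup_flags)
{
	int ret;

	if (gup_flags & MODEL_FOLL_LONGTERM) {
		/* The VMA check requires the lock for the whole call. */
		model_down_read(mm);
		ret = model_longterm_locked(mm, nr_pages);
		model_up_read(mm);
	} else {
		/* Keep the retry-capable path when no VMA check is needed. */
		ret = model_unlocked(mm, nr_pages);
	}
	return ret;
}
```

Only callers that ask for a longterm pin pay for holding the lock across the
call; everyone else keeps the existing unlocked behavior.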

Ira



Re: [RESEND 4/7] mm/gup: Add FOLL_LONGTERM capability to GUP fast

2019-03-22 Thread Dan Williams
On Sun, Mar 17, 2019 at 7:36 PM  wrote:
>
> From: Ira Weiny 
>
> DAX pages were previously unprotected from longterm pins when users
> called get_user_pages_fast().
>
> Use the new FOLL_LONGTERM flag to check for DEVMAP pages and fall
> back to regular GUP processing if a DEVMAP page is encountered.
>
> Signed-off-by: Ira Weiny 
> ---
>  mm/gup.c | 29 +
>  1 file changed, 25 insertions(+), 4 deletions(-)
>
> diff --git a/mm/gup.c b/mm/gup.c
> index 0684a9536207..173db0c44678 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -1600,6 +1600,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
> goto pte_unmap;
>
> if (pte_devmap(pte)) {
> +   if (unlikely(flags & FOLL_LONGTERM))
> +   goto pte_unmap;
> +
> pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
> if (unlikely(!pgmap)) {
> undo_dev_pagemap(nr, nr_start, pages);
> @@ -1739,8 +1742,11 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
> if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
> return 0;
>
> -   if (pmd_devmap(orig))
> +   if (pmd_devmap(orig)) {
> +   if (unlikely(flags & FOLL_LONGTERM))
> +   return 0;
> return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
> +   }
>
> refs = 0;
> page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
> @@ -1777,8 +1783,11 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
> if (!pud_access_permitted(orig, flags & FOLL_WRITE))
> return 0;
>
> -   if (pud_devmap(orig))
> +   if (pud_devmap(orig)) {
> +   if (unlikely(flags & FOLL_LONGTERM))
> +   return 0;
> return __gup_device_huge_pud(orig, pudp, addr, end, pages, nr);
> +   }
>
> refs = 0;
> page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
> @@ -2066,8 +2075,20 @@ int get_user_pages_fast(unsigned long start, int nr_pages,
> start += nr << PAGE_SHIFT;
> pages += nr;
>
> -   ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
> - gup_flags);
> +   if (gup_flags & FOLL_LONGTERM) {
> +   down_read(&current->mm->mmap_sem);
> +   ret = __gup_longterm_locked(current, current->mm,
> +   start, nr_pages - nr,
> +   pages, NULL, gup_flags);
> +   up_read(&current->mm->mmap_sem);
> +   } else {
> +   /*
> +* retain FAULT_FOLL_ALLOW_RETRY optimization if
> +* possible
> +*/
> +   ret = get_user_pages_unlocked(start, nr_pages - nr,
> + pages, gup_flags);

I couldn't immediately grok why this path needs to branch on
FOLL_LONGTERM? Won't get_user_pages_unlocked(..., FOLL_LONGTERM) do
the right thing?
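
For context on the other half of the quoted patch, the lockless walk stops at
the first DEVMAP entry when a longterm pin is requested and leaves the rest to
regular GUP. A minimal userspace sketch of that idea follows; struct model_pte,
model_fast_walk() and MODEL_FOLL_LONGTERM are hypothetical names for
illustration only and do not mirror the real page-table types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag value for this model; not the kernel's definition. */
#define MODEL_FOLL_LONGTERM 0x1u

/* Toy page-table entry: only records whether it maps a DEVMAP page. */
struct model_pte {
	bool devmap;
};

/*
 * Stand-in for the lockless walk: take entries in order and stop at
 * the first DEVMAP entry when a longterm pin was requested.  Returns
 * how many entries were taken; the caller hands the remainder to the
 * slow path.
 */
static size_t model_fast_walk(const struct model_pte *ptes, size_t n,
			      unsigned int gup_flags)
{
	size_t nr = 0;

	while (nr < n) {
		if (ptes[nr].devmap && (gup_flags & MODEL_FOLL_LONGTERM))
			break;	/* leave this and later entries to regular GUP */
		nr++;
	}
	return nr;
}
```

Without the longterm flag the walk takes every entry, DEVMAP or not, so the
existing fast-path behavior is unchanged for other callers.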