Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread John Hubbard
On 11/13/19 3:22 PM, John Hubbard wrote:
> On 11/13/19 2:43 AM, Jan Kara wrote:
> ...
>> How does FOLL_PIN result in grabbing (at least normal, for now) page reference?
>> I didn't find that anywhere in this patch but it is a prerequisite to
>> converting any user to pin_user_pages() interface, right?
> 
> 
> ohhh, I messed up on this intermediate patch: it doesn't quite stand alone as
> it should, as you noticed. To correct this, I can do one of the following:
> 
> a) move the new pin*() routines into the later patch 16 ("mm/gup:
> track FOLL_PIN pages"), or
> 
> b) do a temporary thing here, such as setting FOLL_GET and adding a TODO,
> within the pin*() implementations. And then switching it over to FOLL_PIN
> in patch 16.
> 
> I'm thinking (a) is less error-prone, so I'm going with that unless someone
> points out that that is stupid. :)
> 

OK, just to save anyone from wasting time reading the above: (a) is, in fact,
stupid, after all. ha. That is because pin_user_pages() is called in the 
intervening patches.
 
So anyway, I'll work out an ordering to fix it up, it's not complicated.


thanks,
-- 
John Hubbard
NVIDIA
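
For readers following along, option (b) above can be pictured with a small userspace mock. This is only a sketch under stated assumptions: the MOCK_* flag values, mock_internal_gup() and mock_pin_user_pages_interim() are hypothetical names invented here, not kernel symbols, and the real series may have resolved the ordering differently.

```c
#include <errno.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define MOCK_FOLL_GET 0x1u
#define MOCK_FOLL_PIN 0x2u

/* Records the flags that reached the (mock) gup core. */
static unsigned int mock_last_flags;

/* Stand-in for the gup core: pretends every requested page was obtained. */
static int mock_internal_gup(int nr_pages, unsigned int gup_flags)
{
	mock_last_flags = gup_flags;
	return nr_pages;
}

/*
 * Interim wrapper in the spirit of option (b): callers may pass neither
 * flag themselves, and the wrapper takes a normal reference for now.
 */
static int mock_pin_user_pages_interim(int nr_pages, unsigned int gup_flags)
{
	if (gup_flags & (MOCK_FOLL_GET | MOCK_FOLL_PIN))
		return -EINVAL;

	/* TODO: switch to MOCK_FOLL_PIN once pin tracking exists. */
	gup_flags |= MOCK_FOLL_GET;
	return mock_internal_gup(nr_pages, gup_flags);
}
```

The point of the sketch is that the call sites can already use the pin-style name while the reference is still taken the old way, so later patches only need to change the wrapper body.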



Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread John Hubbard

On 11/13/19 2:43 AM, Jan Kara wrote:
> ...
>
> How does FOLL_PIN result in grabbing (at least normal, for now) page reference?
> I didn't find that anywhere in this patch but it is a prerequisite to
> converting any user to pin_user_pages() interface, right?



ohhh, I messed up on this intermediate patch: it doesn't quite stand alone as
it should, as you noticed. To correct this, I can do one of the following:

a) move the new pin*() routines into the later patch 16 ("mm/gup:
track FOLL_PIN pages"), or

b) do a temporary thing here, such as setting FOLL_GET and adding a TODO,
within the pin*() implementations. And then switching it over to FOLL_PIN
in patch 16.

I'm thinking (a) is less error-prone, so I'm going with that unless someone
points out that that is stupid. :)


> ...
>
> I was somewhat wondering about the number of functions you add here. So we
> have:
>
> pin_user_pages()
> pin_user_pages_fast()
> pin_user_pages_remote()
>
> and then longterm variants:
>
> pin_longterm_pages()
> pin_longterm_pages_fast()
> pin_longterm_pages_remote()
>
> and obviously we have gup like:
> get_user_pages()
> get_user_pages_fast()
> get_user_pages_remote()
> ... and some other gup variants ...
>
> I think we really should have pin_* vs get_* variants as they are very
> different in terms of guarantees and after conversion, any use of get_*
> variant in non-mm code should be closely scrutinized. OTOH pin_longterm_*
> don't look *that* useful to me and just using pin_* instead with
> FOLL_LONGTERM flag would look OK to me and somewhat reduce the number of
> functions which is already large enough? What do people think? I don't feel
> too strongly about this but wanted to bring this up.
>
> Honza


Sounds just right to me, and I see that Dan and Ira also like it.
So I'll proceed with that.

thanks,
--
John Hubbard
NVIDIA


Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread John Hubbard

On 11/13/19 10:59 AM, Ira Weiny wrote:

> On Tue, Nov 12, 2019 at 08:26:56PM -0800, John Hubbard wrote:
> > Introduce pin_user_pages*() variations of get_user_pages*() calls,
> > and also pin_longterm_pages*() variations.
> >
> > These variants all set FOLL_PIN, which is also introduced, and
> > thoroughly documented.
> >
> > The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> > to FOLL_PIN:
> >
> > pin_user_pages()
> > pin_user_pages_remote()
> > pin_user_pages_fast()
> >
> > pin_longterm_pages()
> > pin_longterm_pages_remote()
> > pin_longterm_pages_fast()
>
> At some point in this conversation I thought we were going to put in "unpin_*"
> versions of these.
>
> Is that still in the plans?



Why yes it is! :)  Daniel Vetter and Jan Kara both already weighed in [1],
in favor of "unpin_user_page*()", rather than "put_user_page*()".

I'll change those names.

[1] https://lore.kernel.org/r/20191113101210.gd6...@quack2.suse.cz


thanks,
--
John Hubbard
NVIDIA
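
The pairing rule behind the naming discussion (pages obtained through a pin call are released through the matching unpin call) can be sketched in userspace. This is an illustration only: struct mock_page and the mock_* helpers are hypothetical stand-ins and make no claim about how the kernel tracks pins.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page with a simple pin counter. */
struct mock_page {
	int pin_count;
};

/* Pin every page in the array (mock of a pin_user_pages*() style call). */
static void mock_pin_user_pages(struct mock_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		pages[i].pin_count++;
}

/* Release one pinned page (mock of the proposed unpin_user_page() name). */
static void mock_unpin_user_page(struct mock_page *page)
{
	page->pin_count--;
}

/* Returns 1 when no page in the array is still pinned, 0 otherwise. */
static int mock_all_unpinned(const struct mock_page *pages, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		if (pages[i].pin_count != 0)
			return 0;
	return 1;
}
```

With matching pin/unpin names, a call site that pins and then releases with a differently named helper stands out during review, which is the auditing benefit the thread is after.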


Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread Ira Weiny
On Tue, Nov 12, 2019 at 08:26:56PM -0800, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
> 
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
> 
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
> 
> pin_user_pages()
> pin_user_pages_remote()
> pin_user_pages_fast()
> 
> pin_longterm_pages()
> pin_longterm_pages_remote()
> pin_longterm_pages_fast()

At some point in this conversation I thought we were going to put in "unpin_*"
versions of these.

Is that still in the plans?

Ira



Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread Dan Williams
On Wed, Nov 13, 2019 at 2:43 AM Jan Kara wrote:
>
> On Tue 12-11-19 20:26:56, John Hubbard wrote:
> > Introduce pin_user_pages*() variations of get_user_pages*() calls,
> > and also pin_longterm_pages*() variations.
> >
> > These variants all set FOLL_PIN, which is also introduced, and
> > thoroughly documented.
> >
> > The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> > to FOLL_PIN:
> >
> > pin_user_pages()
> > pin_user_pages_remote()
> > pin_user_pages_fast()
> >
> > pin_longterm_pages()
> > pin_longterm_pages_remote()
> > pin_longterm_pages_fast()
> >
> > All pages that are pinned via the above calls, must be unpinned via
> > put_user_page().
> >
> > The underlying rules are:
> >
> > * These are gup-internal flags, so the call sites should not directly
> > set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
> > assertions, for the new FOLL_PIN flag. However, for the pre-existing
> > FOLL_LONGTERM flag, which has some call sites that still directly
> > set FOLL_LONGTERM, there is no assertion yet.
> >
> > * Call sites that want to indicate that they are going to do DirectIO
> >   ("DIO") or something with similar characteristics, should call a
> >   get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
> >   will:
> > * Start with "pin_user_pages" instead of "get_user_pages". That
> >   makes it easy to find and audit the call sites.
> > * Set FOLL_PIN
> >
> > * For pages that are received via FOLL_PIN, those pages must be returned
> >   via put_user_page().
> >
> > Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> > in this documentation. (I've reworded it and expanded upon it.)
> >
> > Reviewed-by: Mike Rapoport   # Documentation
> > Reviewed-by: Jérôme Glisse 
> > Cc: Jonathan Corbet 
> > Cc: Ira Weiny 
> > Signed-off-by: John Hubbard 
>
> Thanks for the documentation. It looks great!
>
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 83702b2e86c8..4409e84dff51 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -201,6 +201,10 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
> >  	spinlock_t *ptl;
> >  	pte_t *ptep, pte;
> >
> > +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> > +	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
> > +			 (FOLL_PIN | FOLL_GET)))
> > +		return ERR_PTR(-EINVAL);
> >  retry:
> >  	if (unlikely(pmd_bad(*pmd)))
> >  		return no_page_table(vma, flags);
>
> How does FOLL_PIN result in grabbing (at least normal, for now) page reference?
> I didn't find that anywhere in this patch but it is a prerequisite to
> converting any user to pin_user_pages() interface, right?
>
> > +/**
> > + * pin_user_pages_fast() - pin user pages in memory without taking locks
> > + *
> > + * Nearly the same as get_user_pages_fast(), except that FOLL_PIN is set. See
> > + * get_user_pages_fast() for documentation on the function arguments, because
> > + * the arguments here are identical.
> > + *
> > + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> > + * see Documentation/vm/pin_user_pages.rst for further details.
> > + *
> > + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> > + * is NOT intended for Case 2 (RDMA: long-term pins).
> > + */
> > +int pin_user_pages_fast(unsigned long start, int nr_pages,
> > +			unsigned int gup_flags, struct page **pages)
> > +{
> > +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> > +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> > +		return -EINVAL;
> > +
> > +	gup_flags |= FOLL_PIN;
> > +	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> > +}
> > +EXPORT_SYMBOL_GPL(pin_user_pages_fast);
>
> I was somewhat wondering about the number of functions you add here. So we
> have:
>
> pin_user_pages()
> pin_user_pages_fast()
> pin_user_pages_remote()
>
> and then longterm variants:
>
> pin_longterm_pages()
> pin_longterm_pages_fast()
> pin_longterm_pages_remote()
>
> and obviously we have gup like:
> get_user_pages()
> get_user_pages_fast()
> get_user_pages_remote()
> ... and some other gup variants ...
>
> I think we really should have pin_* vs get_* variants as they are very
> different in terms of guarantees and after conversion, any use of get_*
> variant in non-mm code should be closely scrutinized. OTOH pin_longterm_*
> don't look *that* useful to me and just using pin_* instead with
> FOLL_LONGTERM flag would look OK to me and somewhat reduce the number of
> functions which is already large enough? What do people think? I don't feel
> too strongly about this but wanted to bring this up.

I'd vote that FOLL_LONGTERM should obviate the need for
{get,pin}_user_pages_longterm(). It's a property that is passed by the
call site, not an internal flag.
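
A minimal sketch of that idea, assuming a userspace mock rather than the real gup code: the call site passes the long-term property as a flag, and a single pin wrapper adds the pin flag itself. The MOCK_* values, mock_gup_core() and mock_pin_user_pages() are hypothetical names used only for illustration.

```c
#include <errno.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define MOCK_FOLL_GET      0x1u
#define MOCK_FOLL_LONGTERM 0x2u
#define MOCK_FOLL_PIN      0x4u

/* Records the flags that reached the (mock) gup core. */
static unsigned int mock_flags_seen;

/* Stand-in for the gup core: pretends every requested page was obtained. */
static int mock_gup_core(int nr_pages, unsigned int gup_flags)
{
	mock_flags_seen = gup_flags;
	return nr_pages;
}

/*
 * One pin wrapper serves both short-term and long-term users: the caller
 * may pass MOCK_FOLL_LONGTERM, while MOCK_FOLL_PIN is added here.
 */
static int mock_pin_user_pages(int nr_pages, unsigned int gup_flags)
{
	/* The get and pin styles are mutually exclusive. */
	if (gup_flags & MOCK_FOLL_GET)
		return -EINVAL;

	return mock_gup_core(nr_pages, gup_flags | MOCK_FOLL_PIN);
}
```

A long-term user would call mock_pin_user_pages(n, MOCK_FOLL_LONGTERM), so no separate pin_longterm_*() style entry points are needed in this sketch.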


Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread Ira Weiny
> > +/**
> > + * pin_user_pages_fast() - pin user pages in memory without taking locks
> > + *
> > + * Nearly the same as get_user_pages_fast(), except that FOLL_PIN is set. See
> > + * get_user_pages_fast() for documentation on the function arguments, because
> > + * the arguments here are identical.
> > + *
> > + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> > + * see Documentation/vm/pin_user_pages.rst for further details.
> > + *
> > + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> > + * is NOT intended for Case 2 (RDMA: long-term pins).
> > + */
> > +int pin_user_pages_fast(unsigned long start, int nr_pages,
> > +   unsigned int gup_flags, struct page **pages)
> > +{
> > +   /* FOLL_GET and FOLL_PIN are mutually exclusive. */
> > +   if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> > +   return -EINVAL;
> > +
> > +   gup_flags |= FOLL_PIN;
> > +   return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> > +}
> > +EXPORT_SYMBOL_GPL(pin_user_pages_fast);
> 
> I was somewhat wondering about the number of functions you add here. So we
> have:
> 
> pin_user_pages()
> pin_user_pages_fast()
> pin_user_pages_remote()
> 
> and then longterm variants:
> 
> pin_longterm_pages()
> pin_longterm_pages_fast()
> pin_longterm_pages_remote()
> 
> and obviously we have gup like:
> get_user_pages()
> get_user_pages_fast()
> get_user_pages_remote()
> ... and some other gup variants ...
> 
> I think we really should have pin_* vs get_* variants as they are very
> different in terms of guarantees and after conversion, any use of get_*
> variant in non-mm code should be closely scrutinized. OTOH pin_longterm_*
> don't look *that* useful to me and just using pin_* instead with
> FOLL_LONGTERM flag would look OK to me and somewhat reduce the number of
> functions which is already large enough? What do people think? I don't feel
> too strongly about this but wanted to bring this up.

I'm a bit concerned with the function explosion myself.  I think what you
suggest is a happy medium.  So I'd be ok with that.

Ira



Re: [PATCH v4 09/23] mm/gup: introduce pin_user_pages*() and FOLL_PIN

2019-11-13 Thread Jan Kara
On Tue 12-11-19 20:26:56, John Hubbard wrote:
> Introduce pin_user_pages*() variations of get_user_pages*() calls,
> and also pin_longterm_pages*() variations.
> 
> These variants all set FOLL_PIN, which is also introduced, and
> thoroughly documented.
> 
> The pin_longterm*() variants also set FOLL_LONGTERM, in addition
> to FOLL_PIN:
> 
> pin_user_pages()
> pin_user_pages_remote()
> pin_user_pages_fast()
> 
> pin_longterm_pages()
> pin_longterm_pages_remote()
> pin_longterm_pages_fast()
> 
> All pages that are pinned via the above calls, must be unpinned via
> put_user_page().
> 
> The underlying rules are:
> 
> * These are gup-internal flags, so the call sites should not directly
> set FOLL_PIN nor FOLL_LONGTERM. That behavior is enforced with
> assertions, for the new FOLL_PIN flag. However, for the pre-existing
> FOLL_LONGTERM flag, which has some call sites that still directly
> set FOLL_LONGTERM, there is no assertion yet.
> 
> * Call sites that want to indicate that they are going to do DirectIO
>   ("DIO") or something with similar characteristics, should call a
>   get_user_pages()-like wrapper call that sets FOLL_PIN. These wrappers
>   will:
> * Start with "pin_user_pages" instead of "get_user_pages". That
>   makes it easy to find and audit the call sites.
> * Set FOLL_PIN
> 
> * For pages that are received via FOLL_PIN, those pages must be returned
>   via put_user_page().
> 
> Thanks to Jan Kara and Vlastimil Babka for explaining the 4 cases
> in this documentation. (I've reworded it and expanded upon it.)
> 
> Reviewed-by: Mike Rapoport   # Documentation
> Reviewed-by: Jérôme Glisse 
> Cc: Jonathan Corbet 
> Cc: Ira Weiny 
> Signed-off-by: John Hubbard 

Thanks for the documentation. It looks great!

> diff --git a/mm/gup.c b/mm/gup.c
> index 83702b2e86c8..4409e84dff51 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -201,6 +201,10 @@ static struct page *follow_page_pte(struct vm_area_struct *vma,
>  	spinlock_t *ptl;
>  	pte_t *ptep, pte;
>
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE((flags & (FOLL_PIN | FOLL_GET)) ==
> +			 (FOLL_PIN | FOLL_GET)))
> +		return ERR_PTR(-EINVAL);
>  retry:
>  	if (unlikely(pmd_bad(*pmd)))
>  		return no_page_table(vma, flags);

How does FOLL_PIN result in grabbing (at least normal, for now) page reference?
I didn't find that anywhere in this patch but it is a prerequisite to
converting any user to pin_user_pages() interface, right?

> +/**
> + * pin_user_pages_fast() - pin user pages in memory without taking locks
> + *
> + * Nearly the same as get_user_pages_fast(), except that FOLL_PIN is set. See
> + * get_user_pages_fast() for documentation on the function arguments, because
> + * the arguments here are identical.
> + *
> + * FOLL_PIN means that the pages must be released via put_user_page(). Please
> + * see Documentation/vm/pin_user_pages.rst for further details.
> + *
> + * This is intended for Case 1 (DIO) in Documentation/vm/pin_user_pages.rst. It
> + * is NOT intended for Case 2 (RDMA: long-term pins).
> + */
> +int pin_user_pages_fast(unsigned long start, int nr_pages,
> +			unsigned int gup_flags, struct page **pages)
> +{
> +	/* FOLL_GET and FOLL_PIN are mutually exclusive. */
> +	if (WARN_ON_ONCE(gup_flags & FOLL_GET))
> +		return -EINVAL;
> +
> +	gup_flags |= FOLL_PIN;
> +	return internal_get_user_pages_fast(start, nr_pages, gup_flags, pages);
> +}
> +EXPORT_SYMBOL_GPL(pin_user_pages_fast);

I was somewhat wondering about the number of functions you add here. So we
have:

pin_user_pages()
pin_user_pages_fast()
pin_user_pages_remote()

and then longterm variants:

pin_longterm_pages()
pin_longterm_pages_fast()
pin_longterm_pages_remote()

and obviously we have gup like:
get_user_pages()
get_user_pages_fast()
get_user_pages_remote()
... and some other gup variants ...

I think we really should have pin_* vs get_* variants as they are very
different in terms of guarantees and after conversion, any use of get_*
variant in non-mm code should be closely scrutinized. OTOH pin_longterm_*
don't look *that* useful to me and just using pin_* instead with
FOLL_LONGTERM flag would look OK to me and somewhat reduce the number of
functions which is already large enough? What do people think? I don't feel
too strongly about this but wanted to bring this up.

Honza



-- 
Jan Kara 
SUSE Labs, CR