Re: [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*()

2019-11-06 Thread Ira Weiny
On Mon, Nov 04, 2019 at 04:57:38PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 04, 2019 at 12:48:13PM -0800, John Hubbard wrote:
> > On 11/4/19 12:33 PM, Jason Gunthorpe wrote:
> > ...
> > >> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> > >> index 24244a2f68cc..c5a78d3e674b 100644
> > >> --- a/drivers/infiniband/core/umem.c
> > >> +++ b/drivers/infiniband/core/umem.c
> > >> @@ -272,11 +272,10 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
> > >>  
> > >>          while (npages) {
> > >>                  down_read(&mm->mmap_sem);
> > >> -                ret = get_user_pages(cur_base,
> > >> +                ret = pin_longterm_pages(cur_base,
> > >>                                       min_t(unsigned long, npages,
> > >>                                             PAGE_SIZE / sizeof (struct page *)),
> > >> -                                     gup_flags | FOLL_LONGTERM,
> > >> -                                     page_list, NULL);
> > >> +                                     gup_flags, page_list, NULL);
> > > 
> > > FWIW, this one should be converted to fast as well, I think we finally
> > > got rid of all the blockers for that?
> > > 
> > 
> > > I'm not aware of any blockers on the gup.c end, anyway. The only broken
> > > thing we have there is "gup remote + FOLL_LONGTERM". But we can do
> > > "gup fast + LONGTERM".
> 
> I mean the use of the mmap_sem here is finally in a way where we can
> just delete the mmap_sem and use _fast

Yay!  I agree if we can do this we should.

Thanks,
Ira

>  
> ie, AFAIK there is no need for the mmap_sem to be held during
> ib_umem_add_sg_table()
> 
> This should probably be a standalone patch however
> 
> Jason


Re: [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*()

2019-11-04 Thread Jason Gunthorpe
On Mon, Nov 04, 2019 at 02:03:43PM -0800, John Hubbard wrote:
> On 11/4/19 12:57 PM, Jason Gunthorpe wrote:
> > On Mon, Nov 04, 2019 at 12:48:13PM -0800, John Hubbard wrote:
> >> On 11/4/19 12:33 PM, Jason Gunthorpe wrote:
> >> ...
> >>>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> >>>> index 24244a2f68cc..c5a78d3e674b 100644
> >>>> --- a/drivers/infiniband/core/umem.c
> >>>> +++ b/drivers/infiniband/core/umem.c
> >>>> @@ -272,11 +272,10 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
> >>>>  
> >>>>          while (npages) {
> >>>>                  down_read(&mm->mmap_sem);
> >>>> -                ret = get_user_pages(cur_base,
> >>>> +                ret = pin_longterm_pages(cur_base,
> >>>>                                       min_t(unsigned long, npages,
> >>>>                                             PAGE_SIZE / sizeof (struct page *)),
> >>>> -                                     gup_flags | FOLL_LONGTERM,
> >>>> -                                     page_list, NULL);
> >>>> +                                     gup_flags, page_list, NULL);
> >>>
> >>> FWIW, this one should be converted to fast as well, I think we finally
> >>> got rid of all the blockers for that?
> >>>
> >>
> >> I'm not aware of any blockers on the gup.c end, anyway. The only broken
> >> thing we have there is "gup remote + FOLL_LONGTERM". But we can do
> >> "gup fast + LONGTERM".
> > 
> > I mean the use of the mmap_sem here is finally in a way where we can
> > just delete the mmap_sem and use _fast
> >  
> > ie, AFAIK there is no need for the mmap_sem to be held during
> > ib_umem_add_sg_table()
> > 
> > This should probably be a standalone patch however
> > 
> 
> Yes. Oh, actually I guess the patch flow should be: change to
> get_user_pages_fast() and remove the mmap_sem calls, as one patch. And then
> change to pin_longterm_pages_fast() as the next patch. Otherwise, the internal
> fallback from _fast to slow gup would attempt to take the mmap_sem (again) in
> the same thread, which is not good. :)
> 
> Or just defer the change until after this series. Either way is fine, let me
> know if you prefer one over the other.
> 
> The patch itself is trivial, but runtime testing to gain confidence that
> it's solid is much harder. Is there a stress test you would recommend for
> that? (I'm not promising I can quickly run it yet--my local IB setup is still
> nascent at best.)

If you make a patch, we can probably get it tested. It is something
we should do that I keep forgetting about.

Jason


Re: [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*()

2019-11-04 Thread John Hubbard
On 11/4/19 12:57 PM, Jason Gunthorpe wrote:
> On Mon, Nov 04, 2019 at 12:48:13PM -0800, John Hubbard wrote:
>> On 11/4/19 12:33 PM, Jason Gunthorpe wrote:
>> ...
>>>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
>>>> index 24244a2f68cc..c5a78d3e674b 100644
>>>> --- a/drivers/infiniband/core/umem.c
>>>> +++ b/drivers/infiniband/core/umem.c
>>>> @@ -272,11 +272,10 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
>>>>  
>>>>          while (npages) {
>>>>                  down_read(&mm->mmap_sem);
>>>> -                ret = get_user_pages(cur_base,
>>>> +                ret = pin_longterm_pages(cur_base,
>>>>                                       min_t(unsigned long, npages,
>>>>                                             PAGE_SIZE / sizeof (struct page *)),
>>>> -                                     gup_flags | FOLL_LONGTERM,
>>>> -                                     page_list, NULL);
>>>> +                                     gup_flags, page_list, NULL);
>>>
>>> FWIW, this one should be converted to fast as well, I think we finally
>>> got rid of all the blockers for that?
>>>
>>
>> I'm not aware of any blockers on the gup.c end, anyway. The only broken
>> thing we have there is "gup remote + FOLL_LONGTERM". But we can do
>> "gup fast + LONGTERM".
> 
> I mean the use of the mmap_sem here is finally in a way where we can
> just delete the mmap_sem and use _fast
>  
> ie, AFAIK there is no need for the mmap_sem to be held during
> ib_umem_add_sg_table()
> 
> This should probably be a standalone patch however
> 

Yes. Oh, actually I guess the patch flow should be: change to
get_user_pages_fast() and remove the mmap_sem calls, as one patch. And then
change to pin_longterm_pages_fast() as the next patch. Otherwise, the internal
fallback from _fast to slow gup would attempt to take the mmap_sem (again) in
the same thread, which is not good. :)

Or just defer the change until after this series. Either way is fine, let me
know if you prefer one over the other.

The patch itself is trivial, but runtime testing to gain confidence that
it's solid is much harder. Is there a stress test you would recommend for that?
(I'm not promising I can quickly run it yet--my local IB setup is still nascent 
at best.)


thanks,
-- 
John Hubbard
NVIDIA
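
The ordering concern described above can be sketched in plain userspace C.
This is a toy model, not kernel code: struct mock_sem, mock_down_read(),
mock_up_read(), mock_gup_fast() and both caller functions are hypothetical
names invented for illustration, and the mock semaphore only counts nesting
depth within a single thread. It shows that a caller which keeps holding the
lock while calling a _fast variant ends up with the lock taken twice in the
same thread whenever the fast path falls back to the slow path.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for mmap_sem: tracks read-lock nesting in one thread only. */
struct mock_sem {
    int depth;      /* current nesting depth of read acquisitions */
    int max_depth;  /* deepest nesting observed so far */
};

void mock_down_read(struct mock_sem *sem)
{
    sem->depth++;
    if (sem->depth > sem->max_depth)
        sem->max_depth = sem->depth;
}

void mock_up_read(struct mock_sem *sem)
{
    assert(sem->depth > 0);
    sem->depth--;
}

/*
 * Hypothetical "_fast" variant: if the lockless fast path cannot finish,
 * it falls back to a slow path that takes the semaphore itself.
 */
long mock_gup_fast(struct mock_sem *sem, bool fast_path_ok, long nr_pages)
{
    if (!fast_path_ok) {
        mock_down_read(sem);
        /* the slow path would do its work here */
        mock_up_read(sem);
    }
    return nr_pages; /* pretend every page was pinned */
}

/* Problematic ordering: switch to _fast but keep the caller's locking. */
long caller_keeps_lock(struct mock_sem *sem, bool fast_path_ok, long nr_pages)
{
    long ret;

    mock_down_read(sem);
    ret = mock_gup_fast(sem, fast_path_ok, nr_pages);
    mock_up_read(sem);
    return ret;
}

/* Suggested ordering: drop the caller's locking in the same change. */
long caller_drops_lock(struct mock_sem *sem, bool fast_path_ok, long nr_pages)
{
    return mock_gup_fast(sem, fast_path_ok, nr_pages);
}
```

With fast_path_ok set to false, caller_keeps_lock() reaches a nesting depth
of 2 while caller_drops_lock() stays at 1, which is the reason for removing
the mmap_sem calls in the same patch that switches to the _fast call.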



Re: [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*()

2019-11-04 Thread Jason Gunthorpe
On Mon, Nov 04, 2019 at 12:48:13PM -0800, John Hubbard wrote:
> On 11/4/19 12:33 PM, Jason Gunthorpe wrote:
> ...
> >> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> >> index 24244a2f68cc..c5a78d3e674b 100644
> >> --- a/drivers/infiniband/core/umem.c
> >> +++ b/drivers/infiniband/core/umem.c
> >> @@ -272,11 +272,10 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
> >>  
> >>          while (npages) {
> >>                  down_read(&mm->mmap_sem);
> >> -                ret = get_user_pages(cur_base,
> >> +                ret = pin_longterm_pages(cur_base,
> >>                                       min_t(unsigned long, npages,
> >>                                             PAGE_SIZE / sizeof (struct page *)),
> >> -                                     gup_flags | FOLL_LONGTERM,
> >> -                                     page_list, NULL);
> >> +                                     gup_flags, page_list, NULL);
> > 
> > FWIW, this one should be converted to fast as well, I think we finally
> > got rid of all the blockers for that?
> > 
> 
> I'm not aware of any blockers on the gup.c end, anyway. The only broken
> thing we have there is "gup remote + FOLL_LONGTERM". But we can do
> "gup fast + LONGTERM".

I mean the use of the mmap_sem here is finally in a way where we can
just delete the mmap_sem and use _fast
 
ie, AFAIK there is no need for the mmap_sem to be held during
ib_umem_add_sg_table()

This should probably be a standalone patch however

Jason


Re: [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*()

2019-11-04 Thread John Hubbard
On 11/4/19 12:33 PM, Jason Gunthorpe wrote:
...
>> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
>> index 24244a2f68cc..c5a78d3e674b 100644
>> --- a/drivers/infiniband/core/umem.c
>> +++ b/drivers/infiniband/core/umem.c
>> @@ -272,11 +272,10 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
>>  
>>          while (npages) {
>>                  down_read(&mm->mmap_sem);
>> -                ret = get_user_pages(cur_base,
>> +                ret = pin_longterm_pages(cur_base,
>>                                       min_t(unsigned long, npages,
>>                                             PAGE_SIZE / sizeof (struct page *)),
>> -                                     gup_flags | FOLL_LONGTERM,
>> -                                     page_list, NULL);
>> +                                     gup_flags, page_list, NULL);
> 
> FWIW, this one should be converted to fast as well, I think we finally
> got rid of all the blockers for that?
> 

I'm not aware of any blockers on the gup.c end, anyway. The only broken
thing we have there is "gup remote + FOLL_LONGTERM". But we can do
"gup fast + LONGTERM".

Unless I'm really missing something, in which case several other call sites
would need changes.

I'll change it to pin_longterm_pages_fast().

thanks,

John Hubbard
NVIDIA
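
The loop in the quoted hunk requests pages in batches: each call asks for at
most PAGE_SIZE / sizeof (struct page *) pages. A minimal userspace sketch of
that arithmetic follows. SKETCH_PAGE_SIZE, batch_limit() and
count_pin_calls() are hypothetical names, the 4096-byte page size is an
assumption, and the sketch assumes every request is satisfied in full, which
the real loop does not rely on.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed page size, for illustration only */

struct page; /* opaque here; only the pointer size matters */

/* Largest number of page pointers that fit in one page-sized array. */
unsigned long batch_limit(void)
{
    return SKETCH_PAGE_SIZE / sizeof(struct page *);
}

/*
 * Count how many pinning calls a loop shaped like the quoted one would
 * make for npages pages, assuming each call pins everything requested.
 */
unsigned long count_pin_calls(unsigned long npages)
{
    unsigned long calls = 0;

    while (npages) {
        unsigned long limit = batch_limit();
        unsigned long batch = npages < limit ? npages : limit;

        /* a real caller would invoke the pinning routine here */
        npages -= batch;
        calls++;
    }
    return calls;
}
```

Under these assumptions, a request for batch_limit() pages takes one call and
a request for one page more takes two.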


Re: [PATCH v2 07/18] infiniband: set FOLL_PIN, FOLL_LONGTERM via pin_longterm_pages*()

2019-11-04 Thread Jason Gunthorpe
On Sun, Nov 03, 2019 at 01:18:02PM -0800, John Hubbard wrote:
> Convert infiniband to use the new wrapper calls, and stop
> explicitly setting FOLL_LONGTERM at the call sites.
> 
> The new pin_longterm_*() calls replace get_user_pages*()
> calls, and set both FOLL_LONGTERM and a new FOLL_PIN
> flag. The FOLL_PIN flag requires that the caller must
> return the pages via put_user_page*() calls, but
> infiniband was already doing that as part of an earlier
> commit.
> 
> Reviewed-by: Ira Weiny 
> Signed-off-by: John Hubbard 
> ---
>  drivers/infiniband/core/umem.c              |  5 ++---
>  drivers/infiniband/core/umem_odp.c          | 10 +-
>  drivers/infiniband/hw/hfi1/user_pages.c     |  4 ++--
>  drivers/infiniband/hw/mthca/mthca_memfree.c |  3 +--
>  drivers/infiniband/hw/qib/qib_user_pages.c  |  8 
>  drivers/infiniband/hw/qib/qib_user_sdma.c   |  2 +-
>  drivers/infiniband/hw/usnic/usnic_uiom.c    |  9 -
>  drivers/infiniband/sw/siw/siw_mem.c         |  5 ++---
>  8 files changed, 21 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 24244a2f68cc..c5a78d3e674b 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -272,11 +272,10 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
>  
>          while (npages) {
>                  down_read(&mm->mmap_sem);
> -                ret = get_user_pages(cur_base,
> +                ret = pin_longterm_pages(cur_base,
>                                       min_t(unsigned long, npages,
>                                             PAGE_SIZE / sizeof (struct page *)),
> -                                     gup_flags | FOLL_LONGTERM,
> -                                     page_list, NULL);
> +                                     gup_flags, page_list, NULL);

FWIW, this one should be converted to fast as well, I think we finally
got rid of all the blockers for that?

Jason
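
The commit message describes a wrapper that sets both FOLL_LONGTERM and
FOLL_PIN so that call sites no longer OR in FOLL_LONGTERM themselves. A
minimal userspace sketch of that pattern follows. It is not the kernel
implementation: the flag values are made up for illustration, and
mock_gup_core(), old_call_site() and mock_pin_longterm_pages() are
hypothetical stand-ins whose signatures are simplified.

```c
#include <assert.h>

/* Illustrative flag values only; the kernel's real definitions differ. */
#define FOLL_WRITE    0x01u
#define FOLL_LONGTERM 0x02u
#define FOLL_PIN      0x04u

/* Flags that the mock core routine was last called with. */
static unsigned int last_flags_seen;

/* Stand-in for the core pinning routine: it only records its flags. */
long mock_gup_core(unsigned long start, unsigned long nr_pages,
                   unsigned int gup_flags)
{
    (void)start;
    last_flags_seen = gup_flags;
    return (long)nr_pages; /* pretend every page was pinned */
}

/* Old style: the call site ORs in FOLL_LONGTERM by hand. */
long old_call_site(unsigned long start, unsigned long nr_pages,
                   unsigned int gup_flags)
{
    return mock_gup_core(start, nr_pages, gup_flags | FOLL_LONGTERM);
}

/* New style: the wrapper sets both flags, so callers pass plain flags. */
long mock_pin_longterm_pages(unsigned long start, unsigned long nr_pages,
                             unsigned int gup_flags)
{
    return mock_gup_core(start, nr_pages,
                         gup_flags | FOLL_PIN | FOLL_LONGTERM);
}
```

In this sketch a caller passing only FOLL_WRITE to the wrapper ends up with
all three flags set at the core routine, while the old call site never sets
FOLL_PIN.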