Re: [PATCH v1 3/5] mm/memory_hotplug: make offline_and_remove_memory() timeout instead of failing on fatal signals

2023-06-27 Thread John Hubbard via Virtualization

On 6/27/23 08:14, Michal Hocko wrote:

On Tue 27-06-23 16:57:53, David Hildenbrand wrote:

...

IIUC (John can correct me if I am wrong):

1) The process holds the device node open
2) The process gets killed or quits
3) As the process gets torn down, it closes the device node
4) Closing the device node results in the driver removing the device and
 calling offline_and_remove_memory()

So it's not the "tear down process" that triggers that offlining/removal
explicitly; it's just a side effect of the process letting go of the device
node as it gets torn down.


Isn't that just fragile? The operation might fail for other reasons. Why
can't there be a hold on the resource to control the tear-down
explicitly?


I'll let John comment on that. But from what I understood, in most setups
where ZONE_MOVABLE gets used for hotplugged memory,
offline_and_remove_memory() succeeds and allows for reusing the device later
without a reboot.

For the cases where it doesn't work, a reboot is required.
 
That is exactly correct. That's what we ran into.


And there are workarounds (for example: kthreads don't have any signals
pending...), but I did want to follow through here and make -mm aware of the
problem. And see if there is a better way.

...

It seems that offline_and_remove_memory() is using the wrong operation then,
if it wants opportunistic offlining with some sort of policy. A timeout
might be just one policy to use, but a failure mode or a retry count might
be a better fit for some users. So rather than (ab)using offline_pages(),
would it make more sense to extract the basic offlining steps and allow
drivers like virtio-mem to reuse them and define their own policy?


...like this, perhaps. Sounds promising!


thanks,
--
John Hubbard
NVIDIA

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"

2020-06-12 Thread John Hubbard

On 2020-06-12 12:24, Matthew Wilcox wrote:

On Fri, May 29, 2020 at 04:43:08PM -0700, John Hubbard wrote:

+CASE 5: Pinning in order to write to the data within the page
+-------------------------------------------------------------
+Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
+access page's data, unpin" can cause a problem. Case 5 may be considered a
+superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
+other words, if the code is neither Case 1 nor Case 2, it may still require
+FOLL_PIN, for patterns like this:
+
+Correct (uses FOLL_PIN calls):
+pin_user_pages()
+access the data within the pages
+set_page_dirty_lock()
+unpin_user_pages()
+
+INCORRECT (uses FOLL_GET calls):
+get_user_pages()
+access the data within the pages
+set_page_dirty_lock()
+put_page()


Why does this case need to pin?  Why can't it just do ...

get_user_pages()
lock_page(page);
... modify the data ...
set_page_dirty(page);
unlock_page(page);



Yes, it could do that. And that would also make a good additional "correct"
example. Especially for the case of just dealing with a single page,
lock_page() has the benefit of completely fixing the problem *today*,
without waiting for the pin_user_pages*() handling improvements to get
implemented.

And it's also another (probably better) way to fix the vhost.c problem than
commit 690623e1b496 ("vhost: convert get_user_pages() --> pin_user_pages()").

I'm inclined to leave vhost.c alone for now, unless someone really prefers
it to be changed, but to update the Case 5 documentation with your point
above. Sound about right?


thanks,
--
John Hubbard
NVIDIA


[PATCH v2 2/2] vhost: convert get_user_pages() --> pin_user_pages()

2020-05-31 Thread John Hubbard
This code was using get_user_pages*(), in approximately a "Case 5"
scenario (accessing the data within a page), using the categorization
from [1]. That means that it's time to convert the get_user_pages*() +
put_page() calls to pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Cc: k...@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Signed-off-by: John Hubbard 
---
 drivers/vhost/vhost.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 21a59b598ed8..596132a96cd5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1762,15 +1762,14 @@ static int set_bit_to_user(int nr, void __user *addr)
int bit = nr + (log % PAGE_SIZE) * 8;
int r;
 
-   r = get_user_pages_fast(log, 1, FOLL_WRITE, &page);
+   r = pin_user_pages_fast(log, 1, FOLL_WRITE, &page);
if (r < 0)
return r;
BUG_ON(r != 1);
base = kmap_atomic(page);
set_bit(bit, base);
kunmap_atomic(base);
-   set_page_dirty_lock(page);
-   put_page(page);
+   unpin_user_pages_dirty_lock(&page, 1, true);
return 0;
 }
 
-- 
2.26.2



[PATCH v2 0/2] vhost, docs: convert to pin_user_pages(), new "case 5"

2020-05-31 Thread John Hubbard
This is based on Linux 5.7, plus one prerequisite patch:
   "mm/gup: update pin_user_pages.rst for "case 3" (mmu notifiers)" [1]

Changes since v1: removed references to set_page_dirty*(), in response to
Souptick Joarder's review (thanks!).

Cover letter for v1, edited/updated slightly:

It recently became clear to me that there are some get_user_pages*()
callers that don't fit neatly into any of the four cases that are so
far listed in pin_user_pages.rst. vhost.c is one of those.

Add a Case 5 to the documentation, and refer to that when converting
vhost.c.

Thanks to Jan Kara for helping me (again) in understanding the
interaction between get_user_pages() and page writeback [2].

Note that I have only compile-tested the vhost.c patch, although that
does also include cross-compiling for a few other arches. Any run-time
testing would be greatly appreciated.

[1] https://lore.kernel.org/r/20200527194953.11130-1-jhubb...@nvidia.com
[2] https://lore.kernel.org/r/20200529070343.gl14...@quack2.suse.cz

John Hubbard (2):
  docs: mm/gup: pin_user_pages.rst: add a "case 5"
  vhost: convert get_user_pages() --> pin_user_pages()

 Documentation/core-api/pin_user_pages.rst | 18 ++
 drivers/vhost/vhost.c |  5 ++---
 2 files changed, 20 insertions(+), 3 deletions(-)


base-commit: 3d77e6a8804abcc0504c904bd6e5cdf3a5cf8162
-- 
2.26.2



[PATCH v2 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"

2020-05-31 Thread John Hubbard
There are four cases listed in pin_user_pages.rst. These are
intended to help developers figure out whether to use
get_user_pages*(), or pin_user_pages*(). However, the four cases
do not cover all the situations. For example, drivers/vhost/vhost.c
has a "pin, write to page, set page dirty, unpin" case.

Add a fifth case, to help explain that there is a general pattern
that requires pin_user_pages*() API calls.

Cc: Vlastimil Babka 
Cc: Jan Kara 
Cc: Jérôme Glisse 
Cc: Dave Chinner 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Signed-off-by: John Hubbard 
---
 Documentation/core-api/pin_user_pages.rst | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 4675b04e8829..6068266dd303 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -171,6 +171,24 @@ If only struct page data (as opposed to the actual memory contents that a page
 is tracking) is affected, then normal GUP calls are sufficient, and neither flag
 needs to be set.
 
+CASE 5: Pinning in order to write to the data within the page
+-------------------------------------------------------------
+Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
+write to a page's data, unpin" can cause a problem. Case 5 may be considered a
+superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
+other words, if the code is neither Case 1 nor Case 2, it may still require
+FOLL_PIN, for patterns like this:
+
+Correct (uses FOLL_PIN calls):
+pin_user_pages()
+write to the data within the pages
+unpin_user_pages()
+
+INCORRECT (uses FOLL_GET calls):
+get_user_pages()
+write to the data within the pages
+put_page()
+
 page_maybe_dma_pinned(): the whole point of pinning
 ===
 
-- 
2.26.2


Re: [PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"

2020-05-31 Thread John Hubbard

On 2020-05-31 00:11, Souptick Joarder wrote:
...

diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 4675b04e8829..b9f2688a2c67 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -171,6 +171,26 @@ If only struct page data (as opposed to the actual memory contents that a page
 is tracking) is affected, then normal GUP calls are sufficient, and neither flag
 needs to be set.

+CASE 5: Pinning in order to write to the data within the page
+-------------------------------------------------------------
+Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
+access page's data, unpin" can cause a problem.


Will it be *"pin, access page's data, set page dirty, unpin"*?


Well...the problem can show up with just accessing (writing) the data.
But it is true that this statement is a little different from the
patterns below, which is confusing. I'll delete set_page_dirty() from each
of them, in order to avoid confusing things. (Although each is correct.)
And I'll also change the above to "pin, write to a page's data, unpin".

set_page_dirty() interactions are really just extra credit here. :) And
fully read-only situations won't cause a problem.



Case 5 may be considered a

+superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
+other words, if the code is neither Case 1 nor Case 2, it may still require
+FOLL_PIN, for patterns like this:
+
+Correct (uses FOLL_PIN calls):
+pin_user_pages()
+access the data within the pages
+set_page_dirty_lock()
+unpin_user_pages()
+
+INCORRECT (uses FOLL_GET calls):
+get_user_pages()
+access the data within the pages
+set_page_dirty_lock()
+put_page()
+


I'll send a v2 shortly.

thanks,
--
John Hubbard
NVIDIA


[PATCH 2/2] vhost: convert get_user_pages() --> pin_user_pages()

2020-05-29 Thread John Hubbard
This code was using get_user_pages*(), in approximately a "Case 5"
scenario (accessing the data within a page), using the categorization
from [1]. That means that it's time to convert the get_user_pages*() +
put_page() calls to pin_user_pages*() + unpin_user_pages() calls.

There is some helpful background in [2]: basically, this is a small
part of fixing a long-standing disconnect between pinning pages, and
file systems' use of those pages.

[1] Documentation/core-api/pin_user_pages.rst

[2] "Explicit pinning of user-space pages":
https://lwn.net/Articles/807108/

Cc: Michael S. Tsirkin 
Cc: Jason Wang 
Cc: k...@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: net...@vger.kernel.org
Signed-off-by: John Hubbard 
---
 drivers/vhost/vhost.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 21a59b598ed8..596132a96cd5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1762,15 +1762,14 @@ static int set_bit_to_user(int nr, void __user *addr)
int bit = nr + (log % PAGE_SIZE) * 8;
int r;
 
-   r = get_user_pages_fast(log, 1, FOLL_WRITE, &page);
+   r = pin_user_pages_fast(log, 1, FOLL_WRITE, &page);
if (r < 0)
return r;
BUG_ON(r != 1);
base = kmap_atomic(page);
set_bit(bit, base);
kunmap_atomic(base);
-   set_page_dirty_lock(page);
-   put_page(page);
+   unpin_user_pages_dirty_lock(&page, 1, true);
return 0;
 }
 
-- 
2.26.2



[PATCH 1/2] docs: mm/gup: pin_user_pages.rst: add a "case 5"

2020-05-29 Thread John Hubbard
There are four cases listed in pin_user_pages.rst. These are
intended to help developers figure out whether to use
get_user_pages*(), or pin_user_pages*(). However, the four cases
do not cover all the situations. For example, drivers/vhost/vhost.c
has a "pin, write to page, set page dirty, unpin" case.

Add a fifth case, to help explain that there is a general pattern
that requires pin_user_pages*() API calls.

Cc: Vlastimil Babka 
Cc: Jan Kara 
Cc: Jérôme Glisse 
Cc: Dave Chinner 
Cc: Jonathan Corbet 
Cc: linux-...@vger.kernel.org
Cc: linux-fsde...@vger.kernel.org
Signed-off-by: John Hubbard 
---
 Documentation/core-api/pin_user_pages.rst | 20 
 1 file changed, 20 insertions(+)

diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 4675b04e8829..b9f2688a2c67 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -171,6 +171,26 @@ If only struct page data (as opposed to the actual memory contents that a page
 is tracking) is affected, then normal GUP calls are sufficient, and neither flag
 needs to be set.
 
+CASE 5: Pinning in order to write to the data within the page
+-------------------------------------------------------------
+Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
+access page's data, unpin" can cause a problem. Case 5 may be considered a
+superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
+other words, if the code is neither Case 1 nor Case 2, it may still require
+FOLL_PIN, for patterns like this:
+
+Correct (uses FOLL_PIN calls):
+pin_user_pages()
+access the data within the pages
+set_page_dirty_lock()
+unpin_user_pages()
+
+INCORRECT (uses FOLL_GET calls):
+get_user_pages()
+access the data within the pages
+set_page_dirty_lock()
+put_page()
+
 page_maybe_dma_pinned(): the whole point of pinning
 ===
 
-- 
2.26.2


[PATCH 0/2] vhost, docs: convert to pin_user_pages(), new "case 5"

2020-05-29 Thread John Hubbard
Hi,

It recently became clear to me that there are some get_user_pages*()
callers that don't fit neatly into any of the four cases that are so
far listed in pin_user_pages.rst. vhost.c is one of those.

Add a Case 5 to the documentation, and refer to that when converting
vhost.c.

Thanks to Jan Kara for helping me (again) in understanding the
interaction between get_user_pages() and page writeback [1].

This is based on today's mmotm, which has a nearby patch to
pin_user_pages.rst that rewords cases 3 and 4.

Note that I have only compile-tested the vhost.c patch, although that
does also include cross-compiling for a few other arches. Any run-time
testing would be greatly appreciated.

[1] https://lore.kernel.org/r/20200529070343.gl14...@quack2.suse.cz

John Hubbard (2):
  docs: mm/gup: pin_user_pages.rst: add a "case 5"
  vhost: convert get_user_pages() --> pin_user_pages()

 Documentation/core-api/pin_user_pages.rst | 20 
 drivers/vhost/vhost.c |  5 ++---
 2 files changed, 22 insertions(+), 3 deletions(-)

-- 
2.26.2



Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

2019-08-07 Thread John Hubbard

On 8/6/19 11:34 PM, Christoph Hellwig wrote:

On Mon, Aug 05, 2019 at 03:54:35PM -0700, John Hubbard wrote:

On 7/23/19 11:17 PM, Christoph Hellwig wrote:

...

I think we can do this in a simple and better way.  We have 5 ITER_*
types.  Of those ITER_DISCARD as the name suggests never uses pages, so
we can skip handling it.  ITER_PIPE is rejected in the direct I/O path,
which leaves us with three.



Hi Christoph,

Are you working on anything like this?


I was hoping I could steer you towards it.  But if you don't want to do
it yourself I'll add it to my ever growing todo list.



Sure, I'm up for this. The bvec-related items are the next logical part
of the gup/dma conversions to work on, and I just wanted to avoid solving the
same problem if you were already in the code.



Or on the put_user_bvec() idea?


I have a prototype from two months ago:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gup-bvec

but that only survived the most basic testing, so it'll need more work,
which I'm not sure when I'll find time for.



I'll take a peek, and probably pester you with a few questions if I get
confused. :)

thanks,
--
John Hubbard
NVIDIA

Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

2019-08-05 Thread John Hubbard
On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubb...@gmail.com wrote:
>> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>>   it is time to release the pages. That allows choosing between put_page()
>>   and put_user_page*().
>>
>> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>>   parameter. Similar use as above.
>>
>> * Change the block layer, and several file systems, to use
>>   put_user_page*().
> 
> I think we can do this in a simple and better way.  We have 5 ITER_*
> types.  Of those ITER_DISCARD as the name suggests never uses pages, so
> we can skip handling it.  ITER_PIPE is rejected in the direct I/O path,
> which leaves us with three.
> 

Hi Christoph,

Are you working on anything like this? Or on the put_user_bvec() idea?
Please let me know, otherwise I'll go in and implement something here.


thanks,
-- 
John Hubbard
NVIDIA

> Out of those ITER_BVEC needs a user page reference, so we want to call
> put_user_page* on it.  ITER_BVEC always already has page references,
> which means in the block direct I/O path we already don't take
> a page reference.  We should extend that handling to all other calls
> of iov_iter_get_pages / iov_iter_get_pages_alloc.  I think we should
> just reject ITER_KVEC for direct I/O as well, as we have no users and
> it is rather pointless.  Alternatively, if we see a use for it, the
> callers should always have a live page reference anyway (or might
> be on kmalloc memory), so we really should not take a reference either.
> 
> In other words:  the only time we should ever have to put a page in
> this patch is when they are user pages.  We'll need to clean up
> various bits of code for that, but that can be done gradually before
> even getting to the actual put_user_pages conversion.
> 

Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

2019-07-25 Thread John Hubbard
On 7/24/19 5:41 PM, Bob Liu wrote:
> On 7/24/19 12:25 PM, john.hubb...@gmail.com wrote:
>> From: John Hubbard 
>>
>> Hi,
>>
>> This is mostly Jerome's work, converting the block/bio and related areas
>> to call put_user_page*() instead of put_page(). Because I've changed
>> Jerome's patches, in some cases significantly, I'd like to get his
>> feedback before we actually leave him listed as the author (he might
>> want to disown some or all of these).
>>
> 
> Could you add some background to the commit log for people who don't have the
> context?
> Why this conversion? What are the main differences?
> 

Hi Bob,

1. Many of the patches have a blurb like this:

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

...and if you look at that commit, you'll find several pages of
information in its commit description, which should address your point.

2. This whole series has to be re-worked, as per the other feedback thread.
So I'll keep your comment in mind when I post a new series.

thanks,
-- 
John Hubbard
NVIDIA


Re: [PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

2019-07-24 Thread John Hubbard
On 7/23/19 11:17 PM, Christoph Hellwig wrote:
> On Tue, Jul 23, 2019 at 09:25:06PM -0700, john.hubb...@gmail.com wrote:
>> * Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
>>   Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
>>   it is time to release the pages. That allows choosing between put_page()
>>   and put_user_page*().
>>
>> * Pass in one more piece of information to bio_release_pages: a "from_gup"
>>   parameter. Similar use as above.
>>
>> * Change the block layer, and several file systems, to use
>>   put_user_page*().
> 
> I think we can do this in a simple and better way.  We have 5 ITER_*
> types.  Of those ITER_DISCARD as the name suggests never uses pages, so
> we can skip handling it.  ITER_PIPE is rejected in the direct I/O path,
> which leaves us with three.
> 
> Out of those ITER_BVEC needs a user page reference, so we want to call

   ^ ITER_IOVEC, I hope. Otherwise I'm hopelessly lost. :)

> put_user_page* on it.  ITER_BVEC always already has page references,
> which means in the block direct I/O path we already don't take
> a page reference.  We should extend that handling to all other calls
> of iov_iter_get_pages / iov_iter_get_pages_alloc.  I think we should
> just reject ITER_KVEC for direct I/O as well, as we have no users and
> it is rather pointless.  Alternatively, if we see a use for it, the
> callers should always have a live page reference anyway (or might
> be on kmalloc memory), so we really should not take a reference either.
> 
> In other words:  the only time we should ever have to put a page in
> this patch is when they are user pages.  We'll need to clean up
> various bits of code for that, but that can be done gradually before
> even getting to the actual put_user_pages conversion.
> 

Sounds great. I'm part way into it and it doesn't look too bad. The main
question is where to scatter various checks and assertions, to keep
the kvecs out of direct I/O. Or at least keep the gups away from
direct I/O.


thanks,
-- 
John Hubbard
NVIDIA

Re: [PATCH 07/12] vhost-scsi: convert put_page() to put_user_page*()

2019-07-24 Thread John Hubbard
On 7/23/19 9:25 PM, john.hubb...@gmail.com wrote:
> From: Jérôme Glisse 
> 
> For pages that were retained via get_user_pages*(), release those pages
> via the new put_user_page*() routines, instead of via put_page().
> 
> This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
> ("mm: introduce put_user_page*(), placeholder versions").
> 
> Changes from Jérôme's original patch:
> 
> * Changed a WARN_ON to a BUG_ON.
> 

Clearly, the above commit log has it backwards (this is quite my night
for typos).  Please read that as "changed a BUG_ON to a WARN_ON".

I'll correct the commit description in next iteration of this patchset.

...

> + /*
> +  * Here in all cases we should have an IOVEC which use GUP. If that is
> +  * not the case then we will wrongly call put_user_page() and the page
> +  * refcount will go wrong (this is in vhost_scsi_release_cmd())
> +  */
> + WARN_ON(!iov_iter_get_pages_use_gup(iter));
> +
...

thanks,
-- 
John Hubbard
NVIDIA

[PATCH 12/12] fs/ceph: fix a build warning: returning a value from void function

2019-07-24 Thread john . hubbard
From: John Hubbard 

Trivial build warning fix: don't return a value from a function
whose type is "void".

Signed-off-by: John Hubbard 
---
 fs/ceph/debugfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
index 2eb88ed22993..fa14c8e8761d 100644
--- a/fs/ceph/debugfs.c
+++ b/fs/ceph/debugfs.c
@@ -294,7 +294,7 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
 
 void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
 {
-   return 0;
+   return;
 }
 
 void ceph_fs_debugfs_cleanup(struct ceph_fs_client *fsc)
-- 
2.22.0



[PATCH 11/12] 9p/net: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: v9fs-develo...@lists.sourceforge.net
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
Cc: Boaz Harrosh 
Cc: Eric Van Hensbergen 
Cc: Latchesar Ionkov 
Cc: Dominique Martinet 
---
 net/9p/trans_common.c | 14 ++
 net/9p/trans_common.h |  3 ++-
 net/9p/trans_virtio.c | 18 +-
 3 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/net/9p/trans_common.c b/net/9p/trans_common.c
index 3dff68f05fb9..e5c359c369a6 100644
--- a/net/9p/trans_common.c
+++ b/net/9p/trans_common.c
@@ -19,12 +19,18 @@
 /**
  *  p9_release_pages - Release pages after the transaction.
  */
-void p9_release_pages(struct page **pages, int nr_pages)
+void p9_release_pages(struct page **pages, int nr_pages, bool from_gup)
 {
int i;
 
-   for (i = 0; i < nr_pages; i++)
-   if (pages[i])
-   put_page(pages[i]);
+   if (from_gup) {
+   for (i = 0; i < nr_pages; i++)
+   if (pages[i])
+   put_user_page(pages[i]);
+   } else {
+   for (i = 0; i < nr_pages; i++)
+   if (pages[i])
+   put_page(pages[i]);
+   }
 }
 EXPORT_SYMBOL(p9_release_pages);
diff --git a/net/9p/trans_common.h b/net/9p/trans_common.h
index c43babb3f635..dcf025867314 100644
--- a/net/9p/trans_common.h
+++ b/net/9p/trans_common.h
@@ -12,4 +12,5 @@
  *
  */
 
-void p9_release_pages(struct page **, int);
+void p9_release_pages(struct page **pages, int nr_pages, bool from_gup);
+
diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index a3cd90a74012..3714ca5ecdc2 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -306,11 +306,14 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
   struct iov_iter *data,
   int count,
   size_t *offs,
-  int *need_drop)
+  int *need_drop,
+  bool *from_gup)
 {
int nr_pages;
int err;
 
+   *from_gup = false;
+
if (!iov_iter_count(data))
return 0;
 
@@ -332,6 +335,7 @@ static int p9_get_mapped_pages(struct virtio_chan *chan,
*need_drop = 1;
nr_pages = DIV_ROUND_UP(n + *offs, PAGE_SIZE);
atomic_add(nr_pages, &vp_pinned);
+   *from_gup = iov_iter_get_pages_use_gup(data);
return n;
} else {
/* kernel buffer, no need to pin pages */
@@ -397,13 +401,15 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
size_t offs;
int need_drop = 0;
int kicked = 0;
+   bool in_from_gup, out_from_gup;
 
p9_debug(P9_DEBUG_TRANS, "virtio request\n");
 
if (uodata) {
__le32 sz;
int n = p9_get_mapped_pages(chan, &out_pages, uodata,
-   outlen, &offs, &need_drop);
+   outlen, &offs, &need_drop,
+   &out_from_gup);
if (n < 0) {
err = n;
goto err_out;
@@ -422,7 +428,8 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
memcpy(&req->tc.sdata[0], &sz, sizeof(sz));
} else if (uidata) {
int n = p9_get_mapped_pages(chan, &in_pages, uidata,
-   inlen, &offs, &need_drop);
+   inlen, &offs, &need_drop,
+   &in_from_gup);
if (n < 0) {
err = n;
goto err_out;
@@ -504,11 +511,12 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 err_out:
if (need_drop) {
if (in_pages) {
-   p9_release_pages(in_pages, in_nr_pages);
+   p9_release_pages(in_pages, in_nr_pages, in_from_gup);
atomic_sub(in_nr_pages, &vp_pinned);
}
if (out_pages) {
-   p9_release_pages(out_pages, out_nr_pages);
+   p9_release_pages(out_pages, out_nr_pages,
+out_from_gup);
  

[PATCH 10/12] fs/ceph: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Use the enhanced put_user_pages_dirty_lock().

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: ceph-de...@vger.kernel.org
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
Cc: Boaz Harrosh 
Cc: "Yan, Zheng" 
Cc: Sage Weil 
Cc: Ilya Dryomov 
---
 fs/ceph/file.c | 62 ++
 1 file changed, 48 insertions(+), 14 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 685a03cc4b77..c628a1f96978 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -158,18 +158,26 @@ static ssize_t iter_get_bvecs_alloc(struct iov_iter 
*iter, size_t maxsize,
return bytes;
 }
 
-static void put_bvecs(struct bio_vec *bvecs, int num_bvecs, bool should_dirty)
+static void put_bvecs(struct bio_vec *bv, int num_bvecs, bool should_dirty,
+ bool from_gup)
 {
int i;
 
+
for (i = 0; i < num_bvecs; i++) {
-   if (bvecs[i].bv_page) {
+   if (!bv[i].bv_page)
+   continue;
+
+   if (from_gup) {
+   put_user_pages_dirty_lock(&bv[i].bv_page, 1,
+ should_dirty);
+   } else {
if (should_dirty)
-   set_page_dirty_lock(bvecs[i].bv_page);
-   put_page(bvecs[i].bv_page);
+   set_page_dirty_lock(bv[i].bv_page);
+   put_page(bv[i].bv_page);
}
}
-   kvfree(bvecs);
+   kvfree(bv);
 }
 
 /*
@@ -730,6 +738,7 @@ struct ceph_aio_work {
 };
 
 static void ceph_aio_retry_work(struct work_struct *work);
+static void ceph_aio_from_gup_retry_work(struct work_struct *work);
 
 static void ceph_aio_complete(struct inode *inode,
  struct ceph_aio_request *aio_req)
@@ -774,7 +783,7 @@ static void ceph_aio_complete(struct inode *inode,
kfree(aio_req);
 }
 
-static void ceph_aio_complete_req(struct ceph_osd_request *req)
+static void _ceph_aio_complete_req(struct ceph_osd_request *req, bool from_gup)
 {
int rc = req->r_result;
struct inode *inode = req->r_inode;
@@ -793,7 +802,9 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
 
aio_work = kmalloc(sizeof(*aio_work), GFP_NOFS);
if (aio_work) {
-   INIT_WORK(&aio_work->work, ceph_aio_retry_work);
+   INIT_WORK(&aio_work->work, from_gup ?
+ ceph_aio_from_gup_retry_work :
+ ceph_aio_retry_work);
aio_work->req = req;
queue_work(ceph_inode_to_client(inode)->inode_wq,
    &aio_work->work);
@@ -830,7 +841,7 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
}
 
put_bvecs(osd_data->bvec_pos.bvecs, osd_data->num_bvecs,
- aio_req->should_dirty);
+ aio_req->should_dirty, from_gup);
ceph_osdc_put_request(req);
 
if (rc < 0)
@@ -840,7 +851,17 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req)
return;
 }
 
-static void ceph_aio_retry_work(struct work_struct *work)
+static void ceph_aio_complete_req(struct ceph_osd_request *req)
+{
+   _ceph_aio_complete_req(req, false);
+}
+
+static void ceph_aio_from_gup_complete_req(struct ceph_osd_request *req)
+{
+   _ceph_aio_complete_req(req, true);
+}
+
+static void _ceph_aio_retry_work(struct work_struct *work, bool from_gup)
 {
struct ceph_aio_work *aio_work =
container_of(work, struct ceph_aio_work, work);
@@ -891,7 +912,8 @@ static void ceph_aio_retry_work(struct work_struct *work)
 
ceph_osdc_put_request(orig_req);
 
-   req->r_callback = ceph_aio_complete_req;
+   req->r_callback = from_gup ? ceph_aio_from_gup_complete_req :
+ ceph_aio_complete_req;
req->r_inode = inode;
req->r_priv = aio_req;
 
@@ -899,13 +921,23 @@ static void ceph_aio_retry_work(struct work_struct *work)
 out:
if (ret < 0) {
req->r_result = ret;
-   ceph_aio_complete_req(req);
+   _ceph_aio_complete_req(req, from_gup);
}
 
ceph_put_snap_context(snapc);
kf

[PATCH 08/12] fs/cifs: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-c...@vger.kernel.org
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
Cc: Boaz Harrosh 
Cc: Steve French 
---
 fs/cifs/cifsglob.h |  3 +++
 fs/cifs/file.c | 22 +-
 fs/cifs/misc.c | 19 +++
 3 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index fe610e7e3670..e95cb82bfa50 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -1283,6 +1283,7 @@ struct cifs_aio_ctx {
 * If yes, iter is a copy of the user passed iov_iter
 */
booldirect_io;
+   boolfrom_gup;
 };
 
 struct cifs_readdata;
@@ -1317,6 +1318,7 @@ struct cifs_readdata {
struct cifs_credits credits;
unsigned intnr_pages;
struct page **pages;
+   boolfrom_gup;
 };
 
 struct cifs_writedata;
@@ -1343,6 +1345,7 @@ struct cifs_writedata {
struct cifs_credits credits;
unsigned intnr_pages;
struct page **pages;
+   boolfrom_gup;
 };
 
 /*
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 97090693d182..84fa7e0a578f 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -2571,8 +2571,13 @@ cifs_uncached_writedata_release(struct kref *refcount)
struct cifs_writedata, refcount);
 
kref_put(&wdata->ctx->refcount, cifs_aio_ctx_release);
-   for (i = 0; i < wdata->nr_pages; i++)
-   put_page(wdata->pages[i]);
+   if (wdata->from_gup) {
+   for (i = 0; i < wdata->nr_pages; i++)
+   put_user_page(wdata->pages[i]);
+   } else {
+   for (i = 0; i < wdata->nr_pages; i++)
+   put_page(wdata->pages[i]);
+   }
cifs_writedata_release(refcount);
 }
 
@@ -2781,7 +2786,7 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
break;
}
 
-
+   wdata->from_gup = iov_iter_get_pages_use_gup(from);
wdata->page_offset = start;
wdata->tailsz =
nr_pages > 1 ?
@@ -2797,6 +2802,7 @@ cifs_write_from_iter(loff_t offset, size_t len, struct iov_iter *from,
add_credits_and_wake_if(server, credits, 0);
break;
}
+   wdata->from_gup = false;
 
rc = cifs_write_allocate_pages(wdata->pages, nr_pages);
if (rc) {
@@ -3238,8 +3244,12 @@ cifs_uncached_readdata_release(struct kref *refcount)
unsigned int i;
 
kref_put(&rdata->ctx->refcount, cifs_aio_ctx_release);
-   for (i = 0; i < rdata->nr_pages; i++) {
-   put_page(rdata->pages[i]);
+   if (rdata->from_gup) {
+   for (i = 0; i < rdata->nr_pages; i++)
+   put_user_page(rdata->pages[i]);
+   } else {
+   for (i = 0; i < rdata->nr_pages; i++)
+   put_page(rdata->pages[i]);
}
cifs_readdata_release(refcount);
 }
@@ -3502,6 +3512,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
break;
}
 
+   rdata->from_gup = iov_iter_get_pages_use_gup(&direct_iov);
npages = (cur_len + start + PAGE_SIZE-1) / PAGE_SIZE;
rdata->page_offset = start;
rdata->tailsz = npages > 1 ?
@@ -3519,6 +3530,7 @@ cifs_send_async_read(loff_t offset, size_t len, struct cifsFileInfo *open_file,
rc = -ENOMEM;
break;
}
+   rdata->from_gup = false;
 
rc = cifs_read_allocate_pages(rdata, npages);
if (rc) {
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index f383877a6511..5a04c34fea05 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -822,10 +822,18 @@ cifs_aio_ctx_release(struct kref *refcount)
if (ctx->b

[PATCH 07/12] vhost-scsi: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* Changed a BUG_ON to a WARN_ON.

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: virtualization@lists.linux-foundation.org
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
Cc: Boaz Harrosh 
Cc: Miklos Szeredi 
Cc: "Michael S. Tsirkin" 
Cc: Jason Wang 
Cc: Paolo Bonzini 
Cc: Stefan Hajnoczi 
---
 drivers/vhost/scsi.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index a9caf1bc3c3e..282565ab5e3f 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -329,11 +329,11 @@ static void vhost_scsi_release_cmd(struct se_cmd *se_cmd)
 
if (tv_cmd->tvc_sgl_count) {
for (i = 0; i < tv_cmd->tvc_sgl_count; i++)
-   put_page(sg_page(&tv_cmd->tvc_sgl[i]));
+   put_user_page(sg_page(&tv_cmd->tvc_sgl[i]));
}
if (tv_cmd->tvc_prot_sgl_count) {
for (i = 0; i < tv_cmd->tvc_prot_sgl_count; i++)
-   put_page(sg_page(&tv_cmd->tvc_prot_sgl[i]));
+   put_user_page(sg_page(&tv_cmd->tvc_prot_sgl[i]));
}
 
vhost_scsi_put_inflight(tv_cmd->inflight);
@@ -630,6 +630,13 @@ vhost_scsi_map_to_sgl(struct vhost_scsi_cmd *cmd,
size_t offset;
unsigned int npages = 0;
 
+   /*
+* Here in all cases we should have an IOVEC that uses GUP. If that is
+* not the case, then we will wrongly call put_user_page() and the page
+* refcount will go wrong (this matters in vhost_scsi_release_cmd()).
+*/
+   WARN_ON(!iov_iter_get_pages_use_gup(iter));
+
bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
VHOST_SCSI_PREALLOC_UPAGES, &offset);
/* No pages were pinned */
@@ -681,7 +688,7 @@ vhost_scsi_iov_to_sgl(struct vhost_scsi_cmd *cmd, bool write,
while (p < sg) {
struct page *page = sg_page(p++);
if (page)
-   put_page(page);
+   put_user_page(page);
}
return ret;
}
-- 
2.22.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH 06/12] fs/nfs: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: linux-...@vger.kernel.org
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
Cc: Boaz Harrosh 
Cc: Trond Myklebust 
Cc: Anna Schumaker 
---
 fs/nfs/direct.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 0cb442406168..35f30fe2900f 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -512,7 +512,10 @@ static ssize_t nfs_direct_read_schedule_iovec(struct nfs_direct_req *dreq,
pos += req_len;
dreq->bytes_left -= req_len;
}
-   nfs_direct_release_pages(pagevec, npages);
+   if (iov_iter_get_pages_use_gup(iter))
+   put_user_pages(pagevec, npages);
+   else
+   nfs_direct_release_pages(pagevec, npages);
kvfree(pagevec);
if (result < 0)
break;
@@ -935,7 +938,10 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
pos += req_len;
dreq->bytes_left -= req_len;
}
-   nfs_direct_release_pages(pagevec, npages);
+   if (iov_iter_get_pages_use_gup(iter))
+   put_user_pages(pagevec, npages);
+   else
+   nfs_direct_release_pages(pagevec, npages);
kvfree(pagevec);
if (result < 0)
break;
-- 
2.22.0


[PATCH 04/12] block: bio_release_pages: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:
* reworked to be compatible with recent bio_release_pages() changes,
* refactored slightly to remove some code duplication,
* use an approach that changes fewer bio_check_pages_dirty()
  callers.

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: Christoph Hellwig 
Cc: Minwoo Im 
Cc: Jens Axboe 
---
 block/bio.c | 60 -
 include/linux/bio.h |  1 +
 2 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 7675e2de509d..74f9eba2583b 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -844,7 +851,11 @@ void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags)
bio_for_each_segment_all(bvec, bio, iter_all) {
if ((flags & BIO_RP_MARK_DIRTY) && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
-   put_page(bvec->bv_page);
+
+   if (flags & BIO_RP_FROM_GUP)
+   put_user_page(bvec->bv_page);
+   else
+   put_page(bvec->bv_page);
}
 }
 
@@ -1667,28 +1671,50 @@ static void bio_dirty_fn(struct work_struct *work);
 static DECLARE_WORK(bio_dirty_work, bio_dirty_fn);
 static DEFINE_SPINLOCK(bio_dirty_lock);
 static struct bio *bio_dirty_list;
+static struct bio *bio_gup_dirty_list;
 
-/*
- * This runs in process context
- */
-static void bio_dirty_fn(struct work_struct *work)
+static void __bio_dirty_fn(struct work_struct *work,
+  struct bio **dirty_list,
+  enum bio_rp_flags_t flags)
 {
struct bio *bio, *next;
 
spin_lock_irq(&bio_dirty_lock);
-   next = bio_dirty_list;
-   bio_dirty_list = NULL;
+   next = *dirty_list;
+   *dirty_list = NULL;
spin_unlock_irq(&bio_dirty_lock);
 
while ((bio = next) != NULL) {
next = bio->bi_private;
 
-   bio_release_pages(bio, BIO_RP_MARK_DIRTY);
+   bio_release_pages(bio, BIO_RP_MARK_DIRTY | flags);
bio_put(bio);
}
 }
 
-void bio_check_pages_dirty(struct bio *bio)
+/*
+ * This runs in process context
+ */
+static void bio_dirty_fn(struct work_struct *work)
+{
+   __bio_dirty_fn(work, &bio_dirty_list, BIO_RP_NORMAL);
+   __bio_dirty_fn(work, &bio_gup_dirty_list, BIO_RP_FROM_GUP);
+}
+
+/**
+ * __bio_check_pages_dirty() - queue up pages on a workqueue to dirty them
+ * @bio: the bio struct containing the pages we should dirty
+ * @from_gup: whether the pages in the bio came from GUP (get_user_pages*())
+ *
+ * This will go over all pages in the bio, and for each non-dirty page, the
+ * bio is added to a list of bio's that need to get their pages dirtied.
+ *
+ * We also need to know if the pages in the bio are coming from GUP or not,
+ * as GUPed pages need to be released via put_user_page(), instead of
+ * put_page(). Please see Documentation/vm/get_user_pages.rst for details
+ * on that.
+ */
+void __bio_check_pages_dirty(struct bio *bio, bool from_gup)
 {
struct bio_vec *bvec;
unsigned long flags;
@@ -1699,17 +1725,27 @@ void bio_check_pages_dirty(struct bio *bio)
goto defer;
}
 
-   bio_release_pages(bio, BIO_RP_NORMAL);
+   bio_release_pages(bio, from_gup ? BIO_RP_FROM_GUP : BIO_RP_NORMAL);
bio_put(bio);
return;
 defer:
spin_lock_irqsave(&bio_dirty_lock, flags);
-   bio->bi_private = bio_dirty_list;
-   bio_dirty_list = bio;
+   if (from_gup) {
+   bio->bi_private = bio_gup_dirty_list;
+   bio_gup_dirty_list = bio;
+   } else {
+   bio->bi_private = bio_dirty_list;
+   bio_dirty_list = bio;
+   }
spin_unlock_irqrestore(&bio_dirty_lock, flags);
schedule_work(&bio_dirty_work);
 }
 
+void bio_check_pages_dirty(struct bio *bio)
+{
+   __bio_check_pages_dirty(bio, false);
+}
+
 void update_io_ticks(struct hd_struct *part, unsigned long now)
 {
unsigned long stamp;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 2715e55679c1..d68a40c2c9d4 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -444,6 +444,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
 enum bio_rp_flags_t {
BIO_RP_NORMAL   = 0,
BIO_RP_MARK_DIRTY   = 1,
+   BIO_RP_FROM_GUP = 2,
 };
 
 static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
-- 
2.22.0


[PATCH 05/12] block_dev: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

For pages that were retained via get_user_pages*(), release those pages
via the new put_user_page*() routines, instead of via put_page() or
release_pages().

This is part of a tree-wide conversion, as described in commit fc1d8e7cca2d
("mm: introduce put_user_page*(), placeholder versions").

Changes from Jérôme's original patch:

* reworked to be compatible with recent bio_release_pages() changes.

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
Cc: Boaz Harrosh 
---
 block/bio.c | 13 +
 fs/block_dev.c  | 22 +-
 include/linux/bio.h |  8 
 3 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 74f9eba2583b..3b9f66e64bc1 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1746,6 +1746,19 @@ void bio_check_pages_dirty(struct bio *bio)
__bio_check_pages_dirty(bio, false);
 }
 
+enum bio_rp_flags_t bio_rp_flags(struct iov_iter *iter, bool mark_dirty)
+{
+   enum bio_rp_flags_t flags = BIO_RP_NORMAL;
+
+   if (mark_dirty)
+   flags |= BIO_RP_MARK_DIRTY;
+
+   if (iov_iter_get_pages_use_gup(iter))
+   flags |= BIO_RP_FROM_GUP;
+
+   return flags;
+}
+
 void update_io_ticks(struct hd_struct *part, unsigned long now)
 {
unsigned long stamp;
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 9fe6616f8788..d53abaf31e54 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -259,7 +259,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
}
__set_current_state(TASK_RUNNING);
 
-   bio_release_pages(&bio, bio_rp_dirty_flag(should_dirty));
+   bio_release_pages(&bio, bio_rp_flags(iter, should_dirty));
if (unlikely(bio.bi_status))
ret = blk_status_to_errno(bio.bi_status);
 
@@ -295,7 +295,7 @@ static int blkdev_iopoll(struct kiocb *kiocb, bool wait)
return blk_poll(q, READ_ONCE(kiocb->ki_cookie), wait);
 }
 
-static void blkdev_bio_end_io(struct bio *bio)
+static void _blkdev_bio_end_io(struct bio *bio, bool from_gup)
 {
struct blkdev_dio *dio = bio->bi_private;
bool should_dirty = dio->should_dirty;
@@ -327,13 +327,23 @@ static void blkdev_bio_end_io(struct bio *bio)
}
 
if (should_dirty) {
-   bio_check_pages_dirty(bio);
+   __bio_check_pages_dirty(bio, from_gup);
} else {
-   bio_release_pages(bio, BIO_RP_NORMAL);
+   bio_release_pages(bio, bio_rp_gup_flag(from_gup));
bio_put(bio);
}
 }
 
+static void blkdev_bio_end_io(struct bio *bio)
+{
+   _blkdev_bio_end_io(bio, false);
+}
+
+static void blkdev_bio_from_gup_end_io(struct bio *bio)
+{
+   _blkdev_bio_end_io(bio, true);
+}
+
 static ssize_t
 __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
 {
@@ -380,7 +390,9 @@ __blkdev_direct_IO(struct kiocb *iocb, struct iov_iter *iter, int nr_pages)
bio->bi_iter.bi_sector = pos >> 9;
bio->bi_write_hint = iocb->ki_hint;
bio->bi_private = dio;
-   bio->bi_end_io = blkdev_bio_end_io;
+   bio->bi_end_io = iov_iter_get_pages_use_gup(iter) ?
+blkdev_bio_from_gup_end_io :
+blkdev_bio_end_io;
bio->bi_ioprio = iocb->ki_ioprio;
 
ret = bio_iov_iter_get_pages(bio, iter);
diff --git a/include/linux/bio.h b/include/linux/bio.h
index d68a40c2c9d4..b9460d1a4679 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -452,6 +452,13 @@ static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
return mark_dirty ? BIO_RP_MARK_DIRTY : BIO_RP_NORMAL;
 }
 
+static inline enum bio_rp_flags_t bio_rp_gup_flag(bool from_gup)
+{
+   return from_gup ? BIO_RP_FROM_GUP : BIO_RP_NORMAL;
+}
+
+enum bio_rp_flags_t bio_rp_flags(struct iov_iter *iter, bool mark_dirty);
+
 void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags);
 struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_queue *,
@@ -463,6 +470,7 @@ extern struct bio *bio_copy_kern(struct request_queue *, void *, unsigned int,
 gfp_t, int);
 extern void bio_set_pages_dirty(struct bio *bio);
 extern void bio_check_pages_dirty(struct bio *bio);
+void __bio_check_pages_dirty(struct bio *bio, bool from_gup);
 
 void generic_start_io_acct(struct request_queue *q, int op,
unsigned long sectors, struct hd_struct *part);
-- 
2.22.0


[PATCH 03/12] block: bio_release_pages: use flags arg instead of bool

2019-07-24 Thread john . hubbard
From: John Hubbard 

In commit d241a95f3514 ("block: optionally mark pages dirty in
bio_release_pages"), a new "bool mark_dirty" argument was added to
bio_release_pages().

In upcoming work, another bool argument (to indicate that the pages came
from get_user_pages) is going to be added. That's one bool too many,
because it's not desirable to have calls of the form:

foo(true, false, true, etc);

In order to prepare for that, change the argument from a bool, to a
typesafe (enum-based) flags argument.

Cc: Christoph Hellwig 
Cc: Jérôme Glisse 
Cc: Minwoo Im 
Cc: Jens Axboe 
Signed-off-by: John Hubbard 
---
 block/bio.c | 12 ++--
 fs/block_dev.c  |  4 ++--
 fs/direct-io.c  |  2 +-
 include/linux/bio.h | 13 -
 4 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 299a0e7651ec..7675e2de509d 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -833,7 +833,7 @@ int bio_add_page(struct bio *bio, struct page *page,
 }
 EXPORT_SYMBOL(bio_add_page);
 
-void bio_release_pages(struct bio *bio, bool mark_dirty)
+void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags)
 {
struct bvec_iter_all iter_all;
struct bio_vec *bvec;
@@ -842,7 +842,7 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
return;
 
bio_for_each_segment_all(bvec, bio, iter_all) {
-   if (mark_dirty && !PageCompound(bvec->bv_page))
+   if ((flags & BIO_RP_MARK_DIRTY) && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
put_page(bvec->bv_page);
}
@@ -1421,7 +1421,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
return bio;
 
  out_unmap:
-   bio_release_pages(bio, false);
+   bio_release_pages(bio, BIO_RP_NORMAL);
bio_put(bio);
return ERR_PTR(ret);
 }
@@ -1437,7 +1437,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
  */
 void bio_unmap_user(struct bio *bio)
 {
-   bio_release_pages(bio, bio_data_dir(bio) == READ);
+   bio_release_pages(bio, bio_rp_dirty_flag(bio_data_dir(bio) == READ));
bio_put(bio);
 }
@@ -1683,7 +1683,7 @@ static void bio_dirty_fn(struct work_struct *work)
while ((bio = next) != NULL) {
next = bio->bi_private;
 
-   bio_release_pages(bio, true);
+   bio_release_pages(bio, BIO_RP_MARK_DIRTY);
bio_put(bio);
}
 }
@@ -1699,7 +1699,7 @@ void bio_check_pages_dirty(struct bio *bio)
goto defer;
}
 
-   bio_release_pages(bio, false);
+   bio_release_pages(bio, BIO_RP_NORMAL);
bio_put(bio);
return;
 defer:
diff --git a/fs/block_dev.c b/fs/block_dev.c
index 4707dfff991b..9fe6616f8788 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -259,7 +259,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
}
__set_current_state(TASK_RUNNING);
 
-   bio_release_pages(&bio, should_dirty);
+   bio_release_pages(&bio, bio_rp_dirty_flag(should_dirty));
if (unlikely(bio.bi_status))
ret = blk_status_to_errno(bio.bi_status);
 
@@ -329,7 +329,7 @@ static void blkdev_bio_end_io(struct bio *bio)
if (should_dirty) {
bio_check_pages_dirty(bio);
} else {
-   bio_release_pages(bio, false);
+   bio_release_pages(bio, BIO_RP_NORMAL);
bio_put(bio);
}
 }
diff --git a/fs/direct-io.c b/fs/direct-io.c
index ae196784f487..423ef431ddda 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -551,7 +551,7 @@ static blk_status_t dio_bio_complete(struct dio *dio, struct bio *bio)
if (dio->is_async && should_dirty) {
bio_check_pages_dirty(bio); /* transfers ownership */
} else {
-   bio_release_pages(bio, should_dirty);
+   bio_release_pages(bio, bio_rp_dirty_flag(should_dirty));
bio_put(bio);
}
return err;
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 3cdb84cdc488..2715e55679c1 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -440,7 +440,18 @@ bool __bio_try_merge_page(struct bio *bio, struct page *page,
 void __bio_add_page(struct bio *bio, struct page *page,
unsigned int len, unsigned int off);
 int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter);
-void bio_release_pages(struct bio *bio, bool mark_dirty);
+
+enum bio_rp_flags_t {
+   BIO_RP_NORMAL   = 0,
+   BIO_RP_MARK_DIRTY   = 1,
+};
+
+static inline enum bio_rp_flags_t bio_rp_dirty_flag(bool mark_dirty)
+{
+   return mark_dirty ? BIO_RP_MARK_DIRTY : BIO_RP_NORMAL;
+}
+
+void bio_release_pages(struct bio *bio, enum bio_rp_flags_t flags);
 struct rq_map_data;
 extern struct bio *bio_map_user_iov(struct request_que

[PATCH 02/12] iov_iter: add helper to test if an iter would use GUP v2

2019-07-24 Thread john . hubbard
From: Jérôme Glisse 

Add a helper to test whether a call to iov_iter_get_pages*() with a given
iter would result in calls to GUP (get_user_pages*()). We want to use
different tracking of page references if they come from GUP
(get_user_pages*()), and thus we need to know when GUP is used for a
given iter.

Changes since Jérôme's original patch:

* iov_iter_get_pages_use_gup(): do not return true for the ITER_PIPE
case, because iov_iter_get_pages() calls pipe_get_pages(), which in
turn uses get_page(), not get_user_pages().

* Remove some obsolete code, as part of rebasing onto Linux 5.3.

* Fix up the kerneldoc comment to "Return:" rather than "Returns:",
and a few other grammatical tweaks.

Signed-off-by: Jérôme Glisse 
Signed-off-by: John Hubbard 
Cc: linux-fsde...@vger.kernel.org
Cc: linux-bl...@vger.kernel.org
Cc: linux...@kvack.org
Cc: John Hubbard 
Cc: Jan Kara 
Cc: Dan Williams 
Cc: Alexander Viro 
Cc: Johannes Thumshirn 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Cc: Ming Lei 
Cc: Dave Chinner 
Cc: Jason Gunthorpe 
Cc: Matthew Wilcox 
---
 include/linux/uio.h | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index ab5f523bc0df..2a179af8e5a7 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -86,6 +86,17 @@ static inline unsigned char iov_iter_rw(const struct iov_iter *i)
return i->type & (READ | WRITE);
 }
 
+/**
+ * iov_iter_get_pages_use_gup - report if iov_iter_get_pages(i) uses GUP
+ * @i: iterator
+ * Return: true if a call to iov_iter_get_pages*() with the iter provided in
+ *  the argument would result in the use of get_user_pages*()
+ */
+static inline bool iov_iter_get_pages_use_gup(const struct iov_iter *i)
+{
+   return iov_iter_type(i) == ITER_IOVEC;
+}
+
 /*
  * Total number of bytes covered by an iovec.
  *
-- 
2.22.0


[PATCH 01/12] mm/gup: add make_dirty arg to put_user_pages_dirty_lock()

2019-07-24 Thread john . hubbard
From: John Hubbard 

Provide more capable variation of put_user_pages_dirty_lock(),
and delete put_user_pages_dirty(). This is based on the
following:

1. Lots of call sites become simpler if a bool is passed
into put_user_page*(), instead of making the call site
choose which put_user_page*() variant to call.

2. Christoph Hellwig's observation that set_page_dirty_lock()
is usually correct, and set_page_dirty() is usually a
bug, or at least questionable, within a put_user_page*()
calling chain.

This leads to the following API choices:

* put_user_pages_dirty_lock(page, npages, make_dirty)

* There is no put_user_pages_dirty(). You have to
  hand code that, in the rare case that it's
  required.

Cc: Matthew Wilcox 
Cc: Jan Kara 
Cc: Christoph Hellwig 
Cc: Ira Weiny 
Cc: Jason Gunthorpe 
Signed-off-by: John Hubbard 
---
 drivers/infiniband/core/umem.c |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c|   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c|   8 +-
 include/linux/mm.h |   5 +-
 mm/gup.c   | 115 +
 7 files changed, 58 insertions(+), 90 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 08da840ed7ee..965cf9dea71a 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -54,10 +54,7 @@ static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 
for_each_sg_page(umem->sg_head.sgl, _iter, umem->sg_nents, 0) {
page = sg_page_iter_page(_iter);
-   if (umem->writable && dirty)
-   put_user_pages_dirty_lock(&page, 1);
-   else
-   put_user_page(page);
+   put_user_pages_dirty_lock(&page, 1, umem->writable && dirty);
}
 
sg_free_table(>sg_head);
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index b89a9b9aef7a..469acb961fbd 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -118,10 +118,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t npages,
 void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 size_t npages, bool dirty)
 {
-   if (dirty)
-   put_user_pages_dirty_lock(p, npages);
-   else
-   put_user_pages(p, npages);
+   put_user_pages_dirty_lock(p, npages, dirty);
 
if (mm) { /* during close after signal, mm can be NULL */
atomic64_sub(npages, >pinned_vm);
diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c b/drivers/infiniband/hw/qib/qib_user_pages.c
index bfbfbb7e0ff4..6bf764e41891 100644
--- a/drivers/infiniband/hw/qib/qib_user_pages.c
+++ b/drivers/infiniband/hw/qib/qib_user_pages.c
@@ -40,10 +40,7 @@
 static void __qib_release_user_pages(struct page **p, size_t num_pages,
 int dirty)
 {
-   if (dirty)
-   put_user_pages_dirty_lock(p, num_pages);
-   else
-   put_user_pages(p, num_pages);
+   put_user_pages_dirty_lock(p, num_pages, dirty);
 }
 
 /**
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 0b0237d41613..62e6ffa9ad78 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -75,10 +75,7 @@ static void usnic_uiom_put_pages(struct list_head *chunk_list, int dirty)
for_each_sg(chunk->page_list, sg, chunk->nents, i) {
page = sg_page(sg);
pa = sg_phys(sg);
-   if (dirty)
-   put_user_pages_dirty_lock(&page, 1);
-   else
-   put_user_page(page);
+   put_user_pages_dirty_lock(&page, 1, dirty);
usnic_dbg("pa: %pa\n", &pa);
}
kfree(chunk);
diff --git a/drivers/infiniband/sw/siw/siw_mem.c b/drivers/infiniband/sw/siw/siw_mem.c
index 67171c82b0c4..358d440efa11 100644
--- a/drivers/infiniband/sw/siw/siw_mem.c
+++ b/drivers/infiniband/sw/siw/siw_mem.c
@@ -65,13 +65,7 @@ static void siw_free_plist(struct siw_page_chunk *chunk, int num_pages,
 {
struct page **p = chunk->plist;
 
-   while (num_pages--) {
-   if (!PageDirty(*p) && dirty)
-   put_user_pages_dirty_lock(p, 1);
-   else
-   put_user_page(*p);
-   p++;
-   }
+   put_user_pages_dirty_lock(chunk->plist, num_pages, dirty);
 }
 
 void siw_umem_release(struct siw_umem *umem, bool dirty)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0334ca97

[PATCH 00/12] block/bio, fs: convert put_page() to put_user_page*()

2019-07-24 Thread john . hubbard
From: John Hubbard 

Hi,

This is mostly Jérôme's work, converting the block/bio and related areas
to call put_user_page*() instead of put_page(). Because I've changed
Jerome's patches, in some cases significantly, I'd like to get his
feedback before we actually leave him listed as the author (he might
want to disown some or all of these).

I added a new patch, in order to make this work with Christoph Hellwig's
recent overhaul to bio_release_pages(): "block: bio_release_pages: use
flags arg instead of bool".

I've started the series with a patch that I've posted in another
series ("mm/gup: add make_dirty arg to put_user_pages_dirty_lock()"[1]),
because I'm not sure which of these will go in first, and this allows each
to stand alone.

Testing: not much beyond build and boot testing has been done yet. And
I'm not set up to even exercise all of it (especially the IB parts) at
run time.

Anyway, changes here are:

* Store, in the iov_iter, a "came from gup (get_user_pages)" parameter.
  Then, use the new iov_iter_get_pages_use_gup() to retrieve it when
  it is time to release the pages. That allows choosing between put_page()
  and put_user_page*().

* Pass in one more piece of information to bio_release_pages: a "from_gup"
  parameter. Similar use as above.

* Change the block layer, and several file systems, to use
  put_user_page*().

[1] https://lore.kernel.org/r/20190724012606.25844-2-jhubb...@nvidia.com
And please note the correction email that I posted as a follow-up,
if you're looking closely at that patch. :) The fixed version is
included here.

John Hubbard (3):
  mm/gup: add make_dirty arg to put_user_pages_dirty_lock()
  block: bio_release_pages: use flags arg instead of bool
  fs/ceph: fix a build warning: returning a value from void function

Jérôme Glisse (9):
  iov_iter: add helper to test if an iter would use GUP v2
  block: bio_release_pages: convert put_page() to put_user_page*()
  block_dev: convert put_page() to put_user_page*()
  fs/nfs: convert put_page() to put_user_page*()
  vhost-scsi: convert put_page() to put_user_page*()
  fs/cifs: convert put_page() to put_user_page*()
  fs/fuse: convert put_page() to put_user_page*()
  fs/ceph: convert put_page() to put_user_page*()
  9p/net: convert put_page() to put_user_page*()

 block/bio.c|  81 ---
 drivers/infiniband/core/umem.c |   5 +-
 drivers/infiniband/hw/hfi1/user_pages.c|   5 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |   5 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   5 +-
 drivers/infiniband/sw/siw/siw_mem.c|   8 +-
 drivers/vhost/scsi.c   |  13 ++-
 fs/block_dev.c |  22 +++-
 fs/ceph/debugfs.c  |   2 +-
 fs/ceph/file.c |  62 ---
 fs/cifs/cifsglob.h |   3 +
 fs/cifs/file.c |  22 +++-
 fs/cifs/misc.c |  19 +++-
 fs/direct-io.c |   2 +-
 fs/fuse/dev.c  |  22 +++-
 fs/fuse/file.c |  53 +++---
 fs/nfs/direct.c|  10 +-
 include/linux/bio.h|  22 +++-
 include/linux/mm.h |   5 +-
 include/linux/uio.h|  11 ++
 mm/gup.c   | 115 +
 net/9p/trans_common.c  |  14 ++-
 net/9p/trans_common.h  |   3 +-
 net/9p/trans_virtio.c  |  18 +++-
 24 files changed, 357 insertions(+), 170 deletions(-)

-- 
2.22.0
