Re: [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function

2017-04-28 Thread Logan Gunthorpe


On 28/04/17 11:51 AM, Herbert Xu wrote:
> On Fri, Apr 28, 2017 at 10:53:45AM -0600, Logan Gunthorpe wrote:
>>
>>
>> On 28/04/17 12:30 AM, Herbert Xu wrote:
>>> You are right.  Indeed the existing code looks buggy as they
>>> don't take sg->offset into account when doing the kmap.  Could
>>> you send me some patches that fix these problems first so that
>>> they can be easily backported?
>>
>> Ok, I think the only buggy one in crypto is hifn_795x. Shash and caam
>> both do have the sg->offset accounted for. I'll send a patch for the
>> buggy one shortly.
> 
> I think they're all buggy when sg->offset is greater than PAGE_SIZE.

Yes, technically. But that's a _very_ common mistake. Pretty nearly
every case I looked at did not take that into account. I don't think
SGs that point to more than one contiguous page are all that common.

Fixing all those cases without making a common function is a waste of
time IMO.

Logan


Re: [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function

2017-04-28 Thread Logan Gunthorpe


On 28/04/17 12:30 AM, Herbert Xu wrote:
> You are right.  Indeed the existing code looks buggy as they
> don't take sg->offset into account when doing the kmap.  Could
> you send me some patches that fix these problems first so that
> they can be easily backported?

Ok, I think the only buggy one in crypto is hifn_795x. Shash and caam
both do have the sg->offset accounted for. I'll send a patch for the
buggy one shortly.

Logan


Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Logan Gunthorpe


On 27/04/17 05:20 PM, Jason Gunthorpe wrote:
> It seems the most robust: test for iomem, and jump to a slow path
> copy, otherwise inline the kmap and memcpy
> 
> Every place doing memcpy from sgl will need that pattern to be
> correct.

Ok, sounds like a good place to start to me. I'll see what I can do for
a v3 of this set. Though, I probably won't send anything until after the
merge window.

>>> sg_miter will still fail when the sg contains __iomem, however I would
>>> expect that the sg_copy will work with iomem, by using the __iomem
>>> memcpy variant.
>>
>> Yes, that's true. Any sg_miters that ever see iomem will need to be
>> converted to support it. This isn't much different than the other
>> kmap(sg_page()) users I was converting that will also fail if they see
>> iomem. Though, I suspect an sg_miter user would be easier to convert to
>> iomem than a random kmap user.
> 
> How? sg_miter seems like the next nightmare down this path, what is
> sg_miter_next supposed to do when something hits an iomem sgl?

My proposal is roughly included in the draft I sent upthread. We add an
sg_miter flag indicating the iteratee supports iomem and if miter finds
iomem (with the support flag set) it sets ioaddr which is __iomem. The
iteratee then just needs to null check addr and ioaddr and perform the
appropriate action.
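
Sketched out, with all the names being hypothetical placeholders rather
than an existing interface (and 'buf' standing in for wherever the data is
headed), it would be something like:

    #define SG_MITER_IOMEM  (1 << 3)  /* hypothetical: caller handles iomem */

    struct sg_mapping_iter miter;

    sg_miter_start(&miter, sgl, nents, SG_MITER_FROM_SG | SG_MITER_IOMEM);
    while (sg_miter_next(&miter)) {
        if (miter.addr)
            memcpy(buf, miter.addr, miter.length);
        else
            /* hypothetical __iomem field, set instead of addr */
            memcpy_fromio(buf, miter.ioaddr, miter.length);
        buf += miter.length;
    }
    sg_miter_stop(&miter);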

Logan



Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Logan Gunthorpe


On 27/04/17 04:11 PM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 03:53:37PM -0600, Logan Gunthorpe wrote:
> Well, that is in the current form, with more users it would make sense
> to optimize for the single page case, eg by providing the existing
> call, providing a faster single-page-only variant of the copy, perhaps
> even one that is inlined.

Ok, does it make sense then to have an sg_copy_page_to_buffer (or some
such... I'm having trouble thinking of a sane name that isn't too long)
that just does k(un)map_atomic and memcpy? I could try that if it makes
sense to people.
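
Roughly this, with the name and exact form just being a strawman:

    static size_t sg_copy_page_to_buffer(struct scatterlist *sg,
                                         void *buf, size_t buflen)
    {
        /* assumes sg->offset + length stay within the first page */
        size_t len = min_t(size_t, buflen, sg->length);
        void *addr = kmap_atomic(sg_page(sg));

        memcpy(buf, addr + sg->offset, len);
        kunmap_atomic(addr);

        return len;
    }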

>> Switching the for_each_sg to sg_miter is probably the nicer solution as
>> it takes care of the mapping and the offset/length accounting for you
>> and will have similar performance.
> 
> sg_miter will still fail when the sg contains __iomem, however I would
> expect that the sg_copy will work with iomem, by using the __iomem
> memcpy variant.

Yes, that's true. Any sg_miters that ever see iomem will need to be
converted to support it. This isn't much different than the other
kmap(sg_page()) users I was converting that will also fail if they see
iomem. Though, I suspect an sg_miter user would be easier to convert to
iomem than a random kmap user.

Logan


Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Logan Gunthorpe


On 27/04/17 02:53 PM, Jason Gunthorpe wrote:
> blkfront is one of the drivers I looked at, and it appears to only be
> memcpying with the bvec_data pointer, so I wonder why it does not use
> sg_copy_X_buffer instead..

Yes, sort of...

But you'd potentially end up calling sg_copy_to_buffer multiple times
per page within the sg (given that gnttab_foreach_grant_in_range might
call blkif_copy_from_grant/blkif_setup_rw_req_grant multiple times).
Even calling sg_copy_to_buffer once per page seems rather inefficient as
it uses sg_miter internally.

Switching the for_each_sg to sg_miter is probably the nicer solution as
it takes care of the mapping and the offset/length accounting for you
and will have similar performance.
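
For illustration, a typical sg_miter copy to a linear buffer looks roughly
like this (the local names are just for the example):

    struct sg_mapping_iter miter;
    size_t copied = 0;

    sg_miter_start(&miter, sgl, nents, SG_MITER_ATOMIC | SG_MITER_FROM_SG);
    while (sg_miter_next(&miter) && copied < buflen) {
        size_t len = min(miter.length, buflen - copied);

        memcpy(buf + copied, miter.addr, len);
        copied += len;
    }
    sg_miter_stop(&miter);

which is essentially what sg_copy_buffer() does internally.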

But, yes, if performance is not an issue, switching it to
sg_copy_to_buffer would be a less invasive change than sg_miter. The
same might be said about a lot of these cases.

Unfortunately, changing from kmap_atomic (which is a null operation in a
lot of cases) to sg_copy_X_buffer is a pretty big performance hit.

Logan


Re: [PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-27 Thread Logan Gunthorpe


On 26/04/17 01:37 AM, Roger Pau Monné wrote:
> On Tue, Apr 25, 2017 at 12:21:02PM -0600, Logan Gunthorpe wrote:
>> Straightforward conversion to the new helper, except due to the lack
>> of an error path, we have to use SG_MAP_MUST_NOT_FAIL which may BUG_ON in
>> certain cases in the future.
>>
>> Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
>> Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
>> Cc: Juergen Gross <jgr...@suse.com>
>> Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
>> Cc: "Roger Pau Monné" <roger@citrix.com>
>> ---
>>  drivers/block/xen-blkfront.c | 20 +++-
>>  1 file changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>> index 3945963..ed62175 100644
>> --- a/drivers/block/xen-blkfront.c
>> +++ b/drivers/block/xen-blkfront.c
>> @@ -816,8 +816,9 @@ static int blkif_queue_rw_req(struct request *req, 
>> struct blkfront_ring_info *ri
>>  BUG_ON(sg->offset + sg->length > PAGE_SIZE);
>>  
>>  if (setup.need_copy) {
>> -setup.bvec_off = sg->offset;
>> -setup.bvec_data = kmap_atomic(sg_page(sg));
>> +setup.bvec_off = 0;
>> +setup.bvec_data = sg_map(sg, 0, SG_KMAP_ATOMIC |
>> + SG_MAP_MUST_NOT_FAIL);
> 
> I assume that sg_map already adds sg->offset to the address?

Correct.

> Also wondering whether we can get rid of bvec_off and just increment 
> bvec_data,
> adding Julien who IIRC added this code.

bvec_off is used to keep track of the offset within the current mapping,
so dropping it and just incrementing bvec_data isn't a great idea given
that you'd want to kunmap_atomic the original address and not one with an
offset added to it. It would be nice if this could be converted to use
the sg_miter interface, but that's a much more invasive change that would
require someone who knows this code and can properly test it. I'd be very
grateful if someone actually took that on.

Logan



Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-27 Thread Logan Gunthorpe

On 26/04/17 01:44 AM, Christoph Hellwig wrote:
> I think we'll at least need a draft of those to make sense of these
> patches.  Otherwise they just look very clumsy.

Ok, what follows is a draft patch attempting to show where I'm thinking
of going with this. Obviously it will not compile because it assumes
the users throughout the kernel are a bit different than they are today.
Notably, there is no sg_page anymore.

There's also likely a ton of issues and arguments to have over a bunch
of the specifics below and I'd expect the concept to evolve more
as cleanup occurs. This itself is an evolution of the draft I posted
replying to you in my last RFC thread.

Also, before any of this is truly useful to us, pfn_t would have to
infect a few other places in the kernel.

Thanks,

Logan


diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index fad170b..85ef928 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -6,13 +6,14 @@
 #include 
 #include 
 #include 
+#include 
 #include 

 struct scatterlist {
 #ifdef CONFIG_DEBUG_SG
unsigned long   sg_magic;
 #endif
-   unsigned long   page_link;
+   pfn_t   pfn;
unsigned intoffset;
unsigned intlength;
dma_addr_t  dma_address;
@@ -60,15 +61,68 @@ struct sg_table {

 #define SG_MAGIC   0x87654321

-/*
- * We overload the LSB of the page pointer to indicate whether it's
- * a valid sg entry, or whether it points to the start of a new scatterlist.
- * Those low bits are there for everyone! (thanks mason :-)
- */
-#define sg_is_chain(sg)((sg)->page_link & 0x01)
-#define sg_is_last(sg) ((sg)->page_link & 0x02)
-#define sg_chain_ptr(sg)   \
-   ((struct scatterlist *) ((sg)->page_link & ~0x03))
+static inline bool sg_is_chain(struct scatterlist *sg)
+{
+   return sg->pfn.val & PFN_SG_CHAIN;
+}
+
+static inline bool sg_is_last(struct scatterlist *sg)
+{
+   return sg->pfn.val & PFN_SG_LAST;
+}
+
+static inline struct scatterlist *sg_chain_ptr(struct scatterlist *sg)
+{
+   unsigned long sgl = pfn_t_to_pfn(sg->pfn);
+   return (struct scatterlist *)(sgl << PAGE_SHIFT);
+}
+
+static inline bool sg_is_iomem(struct scatterlist *sg)
+{
+   return pfn_t_is_iomem(sg->pfn);
+}
+
+/**
+ * sg_assign_pfn - Assign a given pfn_t to an SG entry
+ * @sg:SG entry
+ * @pfn:   The pfn
+ *
+ * Description:
+ *   Assign a pfn to sg entry. Also see sg_set_pfn(), the most commonly used
+ *   variant.
+ *
+ **/
+static inline void sg_assign_pfn(struct scatterlist *sg, pfn_t pfn)
+{
+#ifdef CONFIG_DEBUG_SG
+   BUG_ON(sg->sg_magic != SG_MAGIC);
+   BUG_ON(sg_is_chain(sg));
+   BUG_ON(pfn.val & (PFN_SG_CHAIN | PFN_SG_LAST));
+#endif
+
+   sg->pfn = pfn;
+}
+
+/**
+ * sg_set_pfn - Set sg entry to point at given pfn
+ * @sg: SG entry
+ * @pfn:The page
+ * @len:Length of data
+ * @offset: Offset into page
+ *
+ * Description:
+ *   Use this function to set an sg entry pointing at a pfn, never assign
+ *   the page directly. We encode sg table information in the lower bits
+ *   of the page pointer. See sg_pfn_t for looking up the pfn_t belonging
+ *   to an sg entry.
+ **/
+static inline void sg_set_pfn(struct scatterlist *sg, pfn_t pfn,
+ unsigned int len, unsigned int offset)
+{
+   sg_assign_pfn(sg, pfn);
+   sg->offset = offset;
+   sg->length = len;
+}

 /**
  * sg_assign_page - Assign a given page to an SG entry
@@ -82,18 +136,13 @@ struct sg_table {
  **/
 static inline void sg_assign_page(struct scatterlist *sg, struct page
*page)
 {
-   unsigned long page_link = sg->page_link & 0x3;
+   if (!page) {
+   pfn_t null_pfn = {0};
+   sg_assign_pfn(sg, null_pfn);
+   return;
+   }

-   /*
-* In order for the low bit stealing approach to work, pages
-* must be aligned at a 32-bit boundary as a minimum.
-*/
-   BUG_ON((unsigned long) page & 0x03);
-#ifdef CONFIG_DEBUG_SG
-   BUG_ON(sg->sg_magic != SG_MAGIC);
-   BUG_ON(sg_is_chain(sg));
-#endif
-   sg->page_link = page_link | (unsigned long) page;
+   sg_assign_pfn(sg, page_to_pfn_t(page));
 }

 /**
@@ -106,8 +155,7 @@ static inline void sg_assign_page(struct scatterlist
*sg, struct page *page)
  * Description:
  *   Use this function to set an sg entry pointing at a page, never assign
  *   the page directly. We encode sg table information in the lower bits
- *   of the page pointer. See sg_page() for looking up the page belonging
- *   to an sg entry.
+ *   of the page pointer.
  *
  **/
 static inline void sg_set_page(struct scatterlist *sg, struct page *page,
@@ -118,13 +166,53 @@ static inline void sg_set_page(struct scatterlist
*sg, struct page *page,
sg->length = len;
 }

-static inline struct page *sg_page(struct scatterlist *sg)
+/**
+ * sg_pfn_t - Return the 

Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-27 Thread Logan Gunthorpe


On 27/04/17 09:27 AM, Jason Gunthorpe wrote:
> On Thu, Apr 27, 2017 at 08:53:38AM +0200, Christoph Hellwig wrote:
> How about first switching as many call sites as possible to use
> sg_copy_X_buffer instead of kmap?

Yeah, I could look at doing that first.

One problem is we might get more Naks like Herbert Xu's, from people
concerned about the performance implications.

These are definitely a bit more invasive changes than thin wrappers
around kmap calls.

> A random audit of Logan's series suggests this is actually a fairly
> common thing.

It's not _that_ common, but a significant fraction of the sites do it. One of
my patches actually did this to two places that seemed to be reimplementing
the sg_copy_X_buffer logic.

Thanks,

Logan


Re: [PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function

2017-04-27 Thread Logan Gunthorpe


On 26/04/17 09:56 PM, Herbert Xu wrote:
> On Tue, Apr 25, 2017 at 12:20:54PM -0600, Logan Gunthorpe wrote:
>> Very straightforward conversion to the new function in the caam driver
>> and shash library.
>>
>> Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
>> Cc: Herbert Xu <herb...@gondor.apana.org.au>
>> Cc: "David S. Miller" <da...@davemloft.net>
>> ---
>>  crypto/shash.c| 9 ++---
>>  drivers/crypto/caam/caamalg.c | 8 +++-
>>  2 files changed, 9 insertions(+), 8 deletions(-)
>>
>> diff --git a/crypto/shash.c b/crypto/shash.c
>> index 5e31c8d..5914881 100644
>> --- a/crypto/shash.c
>> +++ b/crypto/shash.c
>> @@ -283,10 +283,13 @@ int shash_ahash_digest(struct ahash_request *req, 
>> struct shash_desc *desc)
>>  if (nbytes < min(sg->length, ((unsigned int)(PAGE_SIZE)) - offset)) {
>>  void *data;
>>  
>> -data = kmap_atomic(sg_page(sg));
>> -err = crypto_shash_digest(desc, data + offset, nbytes,
>> +data = sg_map(sg, 0, SG_KMAP_ATOMIC);
>> +if (IS_ERR(data))
>> +return PTR_ERR(data);
>> +
>> +err = crypto_shash_digest(desc, data, nbytes,
>>req->result);
>> -kunmap_atomic(data);
>> +sg_unmap(sg, data, 0, SG_KMAP_ATOMIC);
>>  crypto_yield(desc->flags);
>>  } else
>>  err = crypto_shash_init(desc) ?:
> 
> Nack.  This is an optimisation for the special case of a single
> SG list entry.  In fact in the common case the kmap_atomic should
> disappear altogether in the no-highmem case.  So replacing it
> with sg_map is not acceptable.

What you seem to have missed is that sg_map is just a thin wrapper
around kmap_atomic. Perhaps with a future check for a mappable page.
This change should have zero impact on performance.
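
To be explicit, in the regular (non-iomem) case sg_map() reduces to roughly
the following. This is a simplification, not the exact code from patch 1,
but it shows that it's the same kmap_atomic the code does today plus the
offset handling:

    void *sg_map(struct scatterlist *sg, size_t offset, int flags)
    {
        unsigned int off = sg->offset + offset;
        struct page *pg = nth_page(sg_page(sg), off >> PAGE_SHIFT);

        if (flags & SG_KMAP_ATOMIC)
            return kmap_atomic(pg) + offset_in_page(off);

        return kmap(pg) + offset_in_page(off);
    }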

Logan


Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-27 Thread Logan Gunthorpe


On 27/04/17 12:53 AM, Christoph Hellwig wrote:
> I think you'll need to follow the existing kmap semantics and never
> fail the iomem version either.  Otherwise you'll have a special case
> that's almost never used that has a different error path.
>
> Again, wrong way.  Suddenly making things fail for your special case
> that normally don't fail is a receipe for bugs.

I don't disagree, but don't these restrictions make the problem impossible
to solve? If there is iomem behind a page in an SGL and someone tries to
map it, we either have to fail or we break iomem safety, which was your
original concern.

Logan



Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-26 Thread Logan Gunthorpe


On 26/04/17 02:59 AM, Christian König wrote:
> Good to know that somebody is working on this. Those problems troubled
> us as well.

Thanks Christian. It's a daunting problem and there's a lot of work to
do before we will ever be where we need to be, so any help, even an ack,
is greatly appreciated.

Logan



Re: [PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-26 Thread Logan Gunthorpe


On 26/04/17 01:44 AM, Christoph Hellwig wrote:
> I think we'll at least need a draft of those to make sense of these
> patches.  Otherwise they just look very clumsy.

Ok, I'll work up a draft proposal and send it in a couple days. But
without a lot of cleanup such as this series it's not going to even be
able to compile.

> I'm sorry but this API is just a trainwreck.  Right now we have the
> nice little kmap_atomic API, which never fails and has a very nice
> calling convention where we just pass back the return address, but does
> not support sleeping inside the critical section.
> 
> And kmap, which may fail and requires the original page to be passed
> back.  Anything that mixes these two concepts up is simply a non-starter.

Ok, well for starters I think you are mistaken about kmap being able to
fail. I'm having a hard time finding many users of that function that
bother to check for an error when calling it. The main difficulty we
have now is that neither of those functions is expected to fail, and we
need them to be able to in cases where the page doesn't map to system
RAM. This patch series is trying to address it for users of scatterlist.
I'm certainly open to other suggestions.

I also have to disagree that kmap and kmap_atomic are all that "nice".
Except for the sleeping restriction and performance, they effectively do
the same thing. And it was necessary to write a macro wrapper around
kunmap_atomic to ensure that users of that function don't screw it up.
(See 597781f3e5.) I'd say the kmap/kmap_atomic functions are the
trainwreck and I'm trying to do my best to cleanup a few cases.

There are a fair number of cases in the kernel that do something like:

    if (something)
            x = kmap(page);
    else
            x = kmap_atomic(page);
    ...
    if (something)
            kunmap(page);
    else
            kunmap_atomic(x);

Which just seems cumbersome to me.
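
With a flags-based interface the same thing collapses to:

    int flags = something ? SG_KMAP : SG_KMAP_ATOMIC;

    x = sg_map(sg, 0, flags);
    ...
    sg_unmap(sg, x, 0, flags);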

In any case, if you can accept an sg_kmap and sg_kmap_atomic api, just
say so and I'll make the change. But I'll still need a flags variable
for SG_MAP_MUST_NOT_FAIL to support legacy cases that have no fail path,
and both of those functions would end up being near replicas of each
other.

Logan




[PATCH v2 07/21] crypto: shash, caam: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Very straightforward conversion to the new function in the caam driver
and shash library.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Herbert Xu <herb...@gondor.apana.org.au>
Cc: "David S. Miller" <da...@davemloft.net>
---
 crypto/shash.c| 9 ++---
 drivers/crypto/caam/caamalg.c | 8 +++-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index 5e31c8d..5914881 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -283,10 +283,13 @@ int shash_ahash_digest(struct ahash_request *req, struct 
shash_desc *desc)
if (nbytes < min(sg->length, ((unsigned int)(PAGE_SIZE)) - offset)) {
void *data;
 
-   data = kmap_atomic(sg_page(sg));
-   err = crypto_shash_digest(desc, data + offset, nbytes,
+   data = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(data))
+   return PTR_ERR(data);
+
+   err = crypto_shash_digest(desc, data, nbytes,
  req->result);
-   kunmap_atomic(data);
+   sg_unmap(sg, data, 0, SG_KMAP_ATOMIC);
crypto_yield(desc->flags);
} else
err = crypto_shash_init(desc) ?:
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 398807d..62d2f5d 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -89,7 +89,6 @@ static void dbg_dump_sg(const char *level, const char 
*prefix_str,
struct scatterlist *sg, size_t tlen, bool ascii)
 {
struct scatterlist *it;
-   void *it_page;
size_t len;
void *buf;
 
@@ -98,19 +97,18 @@ static void dbg_dump_sg(const char *level, const char 
*prefix_str,
 * make sure the scatterlist's page
 * has a valid virtual memory mapping
 */
-   it_page = kmap_atomic(sg_page(it));
-   if (unlikely(!it_page)) {
+   buf = sg_map(it, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
printk(KERN_ERR "dbg_dump_sg: kmap failed\n");
return;
}
 
-   buf = it_page + it->offset;
len = min_t(size_t, tlen, it->length);
print_hex_dump(level, prefix_str, prefix_type, rowsize,
   groupsize, buf, len, ascii);
tlen -= len;
 
-   kunmap_atomic(it_page);
+   sg_unmap(it, buf, 0, SG_KMAP_ATOMIC);
}
 }
 #endif
-- 
2.1.4



[PATCH v2 00/21] Introduce common scatterlist map function

2017-04-25 Thread Logan Gunthorpe
Changes since v1:

* Rebased onto next-20170424
* Removed the _offset version of these functions per Christoph's
  suggestion
* Added an SG_MAP_MUST_NOT_FAIL flag which will BUG_ON in future cases
  that can't gracefully fail. This removes a bunch of the noise added
  in v1 to a couple of the drivers. (Per David Laight's suggestion)
  This flag is only meant for old code
* Split the libiscsi patch into two (per Christoph's suggestion)
  the prep patch (patch 2 in this series) has already been
  sent separately
* Fixed a locking mistake in the target patch (pointed out by a bot)
* Dropped the nvmet patch and handled it with a different patch
  that has been sent separately
* Dropped the chcr patch as they have already removed the code that
  needed to be changed

I'm still hoping to only get Patch 1 in the series merged. (Any
volunteers?) I'm willing to chase down the maintainers for the remaining
patches separately after the first patch is in.

The patchset is based on next-20170424 and can be found in the sg_map_v2
branch from this git tree:

https://github.com/sbates130272/linux-p2pmem.git

--

Hi Everyone,

As part of my effort to enable P2P DMA transactions with PCI cards,
we've identified the need to be able to safely put IO memory into
scatterlists (and eventually other spots). This probably involves a
conversion from struct page to pfn_t but that migration is a ways off
and those decisions are yet to be made.

As an initial step in that direction, I've started cleaning up some of the
scatterlist code by trying to carve out a better defined layer between it
and its users. The longer term goal would be to remove sg_page or replace
it with something that can potentially fail.

This patchset is the first step in that effort. I've introduced
a common function to map scatterlist memory and converted all the common
kmap(sg_page()) cases. This removes about 66 sg_page calls (of ~331).
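
The typical conversion looks like this (taking the error path only where
the caller has one):

    /* before */
    buf = kmap_atomic(sg_page(sg)) + sg->offset;
    ...
    kunmap_atomic(buf - sg->offset);

    /* after */
    buf = sg_map(sg, 0, SG_KMAP_ATOMIC);
    if (IS_ERR(buf))
        return PTR_ERR(buf);
    ...
    sg_unmap(sg, buf, 0, SG_KMAP_ATOMIC);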

Seeing this is a fairly large cleanup set that touches a wide swath of
the kernel, I have limited the people I've sent this to. I'd suggest we look
toward merging the first patch and then I can send the individual subsystem
patches on to their respective maintainers and get them merged
independently. (This is to avoid the conflicts I created with my last
cleanup set... Sorry) Though, I'm certainly open to other suggestions to get
it merged.

Logan Gunthorpe (21):
  scatterlist: Introduce sg_map helper functions
  libiscsi: Add an internal error code
  libiscsi: Make use of new the sg_map helper function
  target: Make use of the new sg_map function at 16 call sites
  drm/i915: Make use of the new sg_map helper function
  crypto: hifn_795x: Make use of the new sg_map helper function
  crypto: shash, caam: Make use of the new sg_map helper function
  dm-crypt: Make use of the new sg_map helper in 4 call sites
  staging: unisys: visorbus: Make use of the new sg_map helper function
  RDS: Make use of the new sg_map helper function
  scsi: ipr, pmcraid, isci: Make use of the new sg_map helper
  scsi: hisi_sas, mvsas, gdth: Make use of the new sg_map helper
function
  scsi: arcmsr, ips, megaraid: Make use of the new sg_map helper
function
  scsi: libfc, csiostor: Change to sg_copy_buffer in two drivers
  xen-blkfront: Make use of the new sg_map helper function
  mmc: sdhci: Make use of the new sg_map helper function
  mmc: spi: Make use of the new sg_map helper function
  mmc: tmio: Make use of the new sg_map helper function
  mmc: sdricoh_cs: Make use of the new sg_map helper function
  mmc: tifm_sd: Make use of the new sg_map helper function
  memstick: Make use of the new sg_map helper function

 crypto/shash.c  |   9 ++-
 drivers/block/xen-blkfront.c|  20 ++---
 drivers/crypto/caam/caamalg.c   |   8 +-
 drivers/crypto/hifn_795x.c  |  32 +---
 drivers/gpu/drm/i915/i915_gem.c |  27 ---
 drivers/md/dm-crypt.c   |  39 ++---
 drivers/memstick/host/jmb38x_ms.c   |  11 +--
 drivers/memstick/host/tifm_ms.c |  11 +--
 drivers/mmc/host/mmc_spi.c  |  26 --
 drivers/mmc/host/sdhci.c|  14 ++--
 drivers/mmc/host/sdricoh_cs.c   |  14 ++--
 drivers/mmc/host/tifm_sd.c  |  50 +++-
 drivers/mmc/host/tmio_mmc.h |   7 +-
 drivers/mmc/host/tmio_mmc_pio.c |  12 +++
 drivers/scsi/arcmsr/arcmsr_hba.c|  16 +++-
 drivers/scsi/csiostor/csio_scsi.c   |  54 +
 drivers/scsi/cxgbi/libcxgbi.c   |   5 ++
 drivers/scsi/gdth.c |   9 ++-
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c  |  14 ++--
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c  |  13 ++-
 drivers/scsi/ipr.c  |  27 ---
 drivers/scsi/ips.c  |   8 +-
 drivers

[PATCH v2 14/21] scsi: libfc, csiostor: Change to sg_copy_buffer in two drivers

2017-04-25 Thread Logan Gunthorpe
These two drivers appear to duplicate the functionality of
sg_copy_buffer. So we clean them up to use the common code.

This helps us remove a couple of instances that would otherwise be
slightly tricky sg_map usages.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Johannes Thumshirn <j...@kernel.org>
---
 drivers/scsi/csiostor/csio_scsi.c | 54 +++
 drivers/scsi/libfc/fc_libfc.c | 49 ---
 2 files changed, 14 insertions(+), 89 deletions(-)

diff --git a/drivers/scsi/csiostor/csio_scsi.c 
b/drivers/scsi/csiostor/csio_scsi.c
index a1ff75f..bd9d062 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -1489,60 +1489,14 @@ static inline uint32_t
 csio_scsi_copy_to_sgl(struct csio_hw *hw, struct csio_ioreq *req)
 {
struct scsi_cmnd *scmnd  = (struct scsi_cmnd *)csio_scsi_cmnd(req);
-   struct scatterlist *sg;
-   uint32_t bytes_left;
-   uint32_t bytes_copy;
-   uint32_t buf_off = 0;
-   uint32_t start_off = 0;
-   uint32_t sg_off = 0;
-   void *sg_addr;
-   void *buf_addr;
struct csio_dma_buf *dma_buf;
+   size_t copied;
 
-   bytes_left = scsi_bufflen(scmnd);
-   sg = scsi_sglist(scmnd);
	dma_buf = (struct csio_dma_buf *)csio_list_next(&req->gen_list);
+   copied = sg_copy_from_buffer(scsi_sglist(scmnd), scsi_sg_count(scmnd),
+dma_buf->vaddr, scsi_bufflen(scmnd));
 
-   /* Copy data from driver buffer to SGs of SCSI CMD */
-   while (bytes_left > 0 && sg && dma_buf) {
-   if (buf_off >= dma_buf->len) {
-   buf_off = 0;
-   dma_buf = (struct csio_dma_buf *)
-   csio_list_next(dma_buf);
-   continue;
-   }
-
-   if (start_off >= sg->length) {
-   start_off -= sg->length;
-   sg = sg_next(sg);
-   continue;
-   }
-
-   buf_addr = dma_buf->vaddr + buf_off;
-   sg_off = sg->offset + start_off;
-   bytes_copy = min((dma_buf->len - buf_off),
-   sg->length - start_off);
-   bytes_copy = min((uint32_t)(PAGE_SIZE - (sg_off & ~PAGE_MASK)),
-bytes_copy);
-
-   sg_addr = kmap_atomic(sg_page(sg) + (sg_off >> PAGE_SHIFT));
-   if (!sg_addr) {
-   csio_err(hw, "failed to kmap sg:%p of ioreq:%p\n",
-   sg, req);
-   break;
-   }
-
-   csio_dbg(hw, "copy_to_sgl:sg_addr %p sg_off %d buf %p len %d\n",
-   sg_addr, sg_off, buf_addr, bytes_copy);
-   memcpy(sg_addr + (sg_off & ~PAGE_MASK), buf_addr, bytes_copy);
-   kunmap_atomic(sg_addr);
-
-   start_off +=  bytes_copy;
-   buf_off += bytes_copy;
-   bytes_left -= bytes_copy;
-   }
-
-   if (bytes_left > 0)
+   if (copied != scsi_bufflen(scmnd))
return DID_ERROR;
else
return DID_OK;
diff --git a/drivers/scsi/libfc/fc_libfc.c b/drivers/scsi/libfc/fc_libfc.c
index d623d08..ce0805a 100644
--- a/drivers/scsi/libfc/fc_libfc.c
+++ b/drivers/scsi/libfc/fc_libfc.c
@@ -113,45 +113,16 @@ u32 fc_copy_buffer_to_sglist(void *buf, size_t len,
 u32 *nents, size_t *offset,
 u32 *crc)
 {
-   size_t remaining = len;
-   u32 copy_len = 0;
-
-   while (remaining > 0 && sg) {
-   size_t off, sg_bytes;
-   void *page_addr;
-
-   if (*offset >= sg->length) {
-   /*
-* Check for end and drop resources
-* from the last iteration.
-*/
-   if (!(*nents))
-   break;
-   --(*nents);
-   *offset -= sg->length;
-   sg = sg_next(sg);
-   continue;
-   }
-   sg_bytes = min(remaining, sg->length - *offset);
-
-   /*
-* The scatterlist item may be bigger than PAGE_SIZE,
-* but we are limited to mapping PAGE_SIZE at a time.
-*/
-   off = *offset + sg->offset;
-   sg_bytes = min(sg_bytes,
-  (size_t)(PAGE_SIZE - (off & ~PAGE_MASK)));
-   page_addr = kmap_atomic(sg_page(sg) + (off >> PAGE_SHIFT));
-   if (crc)
-   *crc = crc32(*crc, buf, sg_bytes);
-   memcpy((char *)page_addr + (off &

[PATCH v2 16/21] mmc: sdhci: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Straightforward conversion, except due to the lack of an error path we
have to use SG_MAP_MUST_NOT_FAIL which may BUG_ON in certain cases
in the future.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Adrian Hunter <adrian.hun...@intel.com>
Cc: Ulf Hansson <ulf.hans...@linaro.org>
---
 drivers/mmc/host/sdhci.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index ecd0d43..239507f 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -513,15 +513,19 @@ static int sdhci_pre_dma_transfer(struct sdhci_host *host,
return sg_count;
 }
 
+/*
+ * Note this function may return PTR_ERR and must be checked.
+ */
 static char *sdhci_kmap_atomic(struct scatterlist *sg, unsigned long *flags)
 {
local_irq_save(*flags);
-   return kmap_atomic(sg_page(sg)) + sg->offset;
+   return sg_map(sg, 0, SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
 }
 
-static void sdhci_kunmap_atomic(void *buffer, unsigned long *flags)
+static void sdhci_kunmap_atomic(struct scatterlist *sg, void *buffer,
+   unsigned long *flags)
 {
-   kunmap_atomic(buffer);
+   sg_unmap(sg, buffer, 0, SG_KMAP_ATOMIC);
local_irq_restore(*flags);
 }
 
@@ -585,7 +589,7 @@ static void sdhci_adma_table_pre(struct sdhci_host *host,
if (data->flags & MMC_DATA_WRITE) {
buffer = sdhci_kmap_atomic(sg, &flags);
memcpy(align, buffer, offset);
-   sdhci_kunmap_atomic(buffer, &flags);
+   sdhci_kunmap_atomic(sg, buffer, &flags);
}
 
/* tran, valid */
@@ -663,7 +667,7 @@ static void sdhci_adma_table_post(struct sdhci_host *host,
 
buffer = sdhci_kmap_atomic(sg, &flags);
memcpy(buffer, align, size);
-   sdhci_kunmap_atomic(buffer, &flags);
+   sdhci_kunmap_atomic(sg, buffer, &flags);
 
align += SDHCI_ADMA2_ALIGN;
}
-- 
2.1.4



[PATCH v2 18/21] mmc: tmio: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Straightforward conversion to the sg_map helper. Seeing there is no
clear error path, we use SG_MAP_MUST_NOT_FAIL, which may BUG_ON in certain
cases in the future.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Wolfram Sang <wsa+rene...@sang-engineering.com>
Cc: Ulf Hansson <ulf.hans...@linaro.org>
---
 drivers/mmc/host/tmio_mmc.h |  7 +--
 drivers/mmc/host/tmio_mmc_pio.c | 12 
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/tmio_mmc.h b/drivers/mmc/host/tmio_mmc.h
index d0edb57..bc43eb0 100644
--- a/drivers/mmc/host/tmio_mmc.h
+++ b/drivers/mmc/host/tmio_mmc.h
@@ -202,17 +202,20 @@ void tmio_mmc_enable_mmc_irqs(struct tmio_mmc_host *host, 
u32 i);
 void tmio_mmc_disable_mmc_irqs(struct tmio_mmc_host *host, u32 i);
 irqreturn_t tmio_mmc_irq(int irq, void *devid);
 
+/* Note: this function may return PTR_ERR and must be checked! */
 static inline char *tmio_mmc_kmap_atomic(struct scatterlist *sg,
 unsigned long *flags)
 {
+   void *ret;
+
local_irq_save(*flags);
-   return kmap_atomic(sg_page(sg)) + sg->offset;
+   return sg_map(sg, 0, SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
 }
 
 static inline void tmio_mmc_kunmap_atomic(struct scatterlist *sg,
  unsigned long *flags, void *virt)
 {
-   kunmap_atomic(virt - sg->offset);
+   sg_unmap(sg, virt, 0, SG_KMAP_ATOMIC);
local_irq_restore(*flags);
 }
 
diff --git a/drivers/mmc/host/tmio_mmc_pio.c b/drivers/mmc/host/tmio_mmc_pio.c
index a2d92f1..bbb4f19 100644
--- a/drivers/mmc/host/tmio_mmc_pio.c
+++ b/drivers/mmc/host/tmio_mmc_pio.c
@@ -506,6 +506,18 @@ static void tmio_mmc_check_bounce_buffer(struct 
tmio_mmc_host *host)
if (host->sg_ptr == &host->bounce_sg) {
unsigned long flags;
void *sg_vaddr = tmio_mmc_kmap_atomic(host->sg_orig, &flags);
+   if (IS_ERR(sg_vaddr)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return;
+   }
+
memcpy(sg_vaddr, host->bounce_buf, host->bounce_sg.length);
tmio_mmc_kunmap_atomic(host->sg_orig, &flags, sg_vaddr);
}
-- 
2.1.4



[PATCH v2 03/21] libiscsi: Make use of new the sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Convert the kmap and kmap_atomic uses to the sg_map function. We now
store the flags for the kmap instead of a boolean to indicate
atomicity. We use the ISCSI_TCP_INTERNAL_ERR error type that was prepared
earlier for this.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Lee Duncan <ldun...@suse.com>
Cc: Chris Leech <cle...@redhat.com>
---
 drivers/scsi/libiscsi_tcp.c | 32 
 include/scsi/libiscsi_tcp.h |  2 +-
 2 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 63a1d69..a34e25c 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -133,25 +133,23 @@ static void iscsi_tcp_segment_map(struct iscsi_segment 
*segment, int recv)
if (page_count(sg_page(sg)) >= 1 && !recv)
return;
 
-   if (recv) {
-   segment->atomic_mapped = true;
-   segment->sg_mapped = kmap_atomic(sg_page(sg));
-   } else {
-   segment->atomic_mapped = false;
-   /* the xmit path can sleep with the page mapped so use kmap */
-   segment->sg_mapped = kmap(sg_page(sg));
+   /* the xmit path can sleep with the page mapped so don't use atomic */
+   segment->sg_map_flags = recv ? SG_KMAP_ATOMIC : SG_KMAP;
+   segment->sg_mapped = sg_map(sg, 0, segment->sg_map_flags);
+
+   if (IS_ERR(segment->sg_mapped)) {
+   segment->sg_mapped = NULL;
+   return;
}
 
-   segment->data = segment->sg_mapped + sg->offset + segment->sg_offset;
+   segment->data = segment->sg_mapped + segment->sg_offset;
 }
 
 void iscsi_tcp_segment_unmap(struct iscsi_segment *segment)
 {
if (segment->sg_mapped) {
-   if (segment->atomic_mapped)
-   kunmap_atomic(segment->sg_mapped);
-   else
-   kunmap(sg_page(segment->sg));
+   sg_unmap(segment->sg, segment->sg_mapped, 0,
+segment->sg_map_flags);
segment->sg_mapped = NULL;
segment->data = NULL;
}
@@ -304,6 +302,9 @@ iscsi_tcp_segment_recv(struct iscsi_tcp_conn *tcp_conn,
break;
}
 
+   if (!segment->data)
+   return -EFAULT;
+
copy = min(len - copied, segment->size - segment->copied);
ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "copying %d\n", copy);
memcpy(segment->data + segment->copied, ptr + copied, copy);
@@ -927,6 +928,13 @@ int iscsi_tcp_recv_skb(struct iscsi_conn *conn, struct 
sk_buff *skb,
  avail);
rc = iscsi_tcp_segment_recv(tcp_conn, segment, ptr, avail);
BUG_ON(rc == 0);
+   if (rc < 0) {
+   ISCSI_DBG_TCP(conn, "memory fault. Consumed %d\n",
+ consumed);
+   *status = ISCSI_TCP_INTERNAL_ERR;
+   goto skb_done;
+   }
+
consumed += rc;
 
if (segment->total_copied >= segment->total_size) {
diff --git a/include/scsi/libiscsi_tcp.h b/include/scsi/libiscsi_tcp.h
index 90691ad..58c79af 100644
--- a/include/scsi/libiscsi_tcp.h
+++ b/include/scsi/libiscsi_tcp.h
@@ -47,7 +47,7 @@ struct iscsi_segment {
struct scatterlist  *sg;
void*sg_mapped;
unsigned intsg_offset;
-   boolatomic_mapped;
+   int sg_map_flags;
 
iscsi_segment_done_fn_t *done;
 };
-- 
2.1.4



[PATCH v2 09/21] staging: unisys: visorbus: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Straightforward conversion to the new function.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Acked-by: David Kershner <david.kersh...@unisys.com>
---
 drivers/staging/unisys/visorhba/visorhba_main.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/unisys/visorhba/visorhba_main.c 
b/drivers/staging/unisys/visorhba/visorhba_main.c
index d372115..c77426c 100644
--- a/drivers/staging/unisys/visorhba/visorhba_main.c
+++ b/drivers/staging/unisys/visorhba/visorhba_main.c
@@ -843,7 +843,6 @@ do_scsi_nolinuxstat(struct uiscmdrsp *cmdrsp, struct 
scsi_cmnd *scsicmd)
struct scatterlist *sg;
unsigned int i;
char *this_page;
-   char *this_page_orig;
int bufind = 0;
struct visordisk_info *vdisk;
struct visorhba_devdata *devdata;
@@ -870,11 +869,14 @@ do_scsi_nolinuxstat(struct uiscmdrsp *cmdrsp, struct 
scsi_cmnd *scsicmd)
 
sg = scsi_sglist(scsicmd);
for (i = 0; i < scsi_sg_count(scsicmd); i++) {
-   this_page_orig = kmap_atomic(sg_page(sg + i));
-   this_page = (void *)((unsigned long)this_page_orig |
-sg[i].offset);
+   this_page = sg_map(sg + i, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(this_page)) {
+   scsicmd->result = DID_ERROR << 16;
+   return;
+   }
+
memcpy(this_page, buf + bufind, sg[i].length);
-   kunmap_atomic(this_page_orig);
+   sg_unmap(sg + i, this_page, 0, SG_KMAP_ATOMIC);
}
} else {
devdata = (struct visorhba_devdata *)scsidev->host->hostdata;
-- 
2.1.4



[PATCH v2 21/21] memstick: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Straightforward conversion, but we have to make use of
SG_MAP_MUST_NOT_FAIL which may BUG_ON in certain cases
in the future.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Alex Dubov <oa...@yahoo.com>
---
 drivers/memstick/host/jmb38x_ms.c | 11 ++-
 drivers/memstick/host/tifm_ms.c   | 11 ++-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/memstick/host/jmb38x_ms.c 
b/drivers/memstick/host/jmb38x_ms.c
index 48db922..9019e37 100644
--- a/drivers/memstick/host/jmb38x_ms.c
+++ b/drivers/memstick/host/jmb38x_ms.c
@@ -303,7 +303,6 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host 
*host)
unsigned int off;
unsigned int t_size, p_cnt;
unsigned char *buf;
-   struct page *pg;
unsigned long flags = 0;
 
if (host->req->long_data) {
@@ -318,14 +317,14 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host 
*host)
unsigned int uninitialized_var(p_off);
 
if (host->req->long_data) {
-   pg = nth_page(sg_page(&host->req->sg),
- off >> PAGE_SHIFT);
p_off = offset_in_page(off);
p_cnt = PAGE_SIZE - p_off;
p_cnt = min(p_cnt, length);
 
local_irq_save(flags);
-   buf = kmap_atomic(pg) + p_off;
+   buf = sg_map(&host->req->sg,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
} else {
buf = host->req->data + host->block_pos;
p_cnt = host->req->data_len - host->block_pos;
@@ -341,7 +340,9 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host 
*host)
 : jmb38x_ms_read_reg_data(host, buf, p_cnt);
 
if (host->req->long_data) {
-   kunmap_atomic(buf - p_off);
+   sg_unmap(&host->req->sg, buf,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC);
local_irq_restore(flags);
}
 
diff --git a/drivers/memstick/host/tifm_ms.c b/drivers/memstick/host/tifm_ms.c
index 7bafa72..304985d 100644
--- a/drivers/memstick/host/tifm_ms.c
+++ b/drivers/memstick/host/tifm_ms.c
@@ -186,7 +186,6 @@ static unsigned int tifm_ms_transfer_data(struct tifm_ms 
*host)
unsigned int off;
unsigned int t_size, p_cnt;
unsigned char *buf;
-   struct page *pg;
unsigned long flags = 0;
 
if (host->req->long_data) {
@@ -203,14 +202,14 @@ static unsigned int tifm_ms_transfer_data(struct tifm_ms 
*host)
unsigned int uninitialized_var(p_off);
 
if (host->req->long_data) {
-   pg = nth_page(sg_page(&host->req->sg),
- off >> PAGE_SHIFT);
p_off = offset_in_page(off);
p_cnt = PAGE_SIZE - p_off;
p_cnt = min(p_cnt, length);
 
local_irq_save(flags);
-   buf = kmap_atomic(pg) + p_off;
+   buf = sg_map(&host->req->sg,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
} else {
buf = host->req->data + host->block_pos;
p_cnt = host->req->data_len - host->block_pos;
@@ -221,7 +220,9 @@ static unsigned int tifm_ms_transfer_data(struct tifm_ms 
*host)
 : tifm_ms_read_data(host, buf, p_cnt);
 
if (host->req->long_data) {
-   kunmap_atomic(buf - p_off);
+   sg_unmap(&host->req->sg, buf,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
local_irq_restore(flags);
}
 
-- 
2.1.4



[PATCH v2 17/21] mmc: spi: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
We use the sg_map helper, but it's slightly more complicated as we only
check for the error when the mapping actually gets used, such that if
the mapping failed but wasn't needed, then no error occurs.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Ulf Hansson <ulf.hans...@linaro.org>
---
 drivers/mmc/host/mmc_spi.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
index 476e53d..d614f36 100644
--- a/drivers/mmc/host/mmc_spi.c
+++ b/drivers/mmc/host/mmc_spi.c
@@ -676,9 +676,15 @@ mmc_spi_writeblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
struct scratch  *scratch = host->data;
u32 pattern;
 
-   if (host->mmc->use_spi_crc)
+   if (host->mmc->use_spi_crc) {
+   if (IS_ERR(t->tx_buf))
+   return PTR_ERR(t->tx_buf);
+
scratch->crc_val = cpu_to_be16(
crc_itu_t(0, t->tx_buf, t->len));
+   t->tx_buf += t->len;
+   }
+
if (host->dma_dev)
dma_sync_single_for_device(host->dma_dev,
host->data_dma, sizeof(*scratch),
@@ -743,7 +749,6 @@ mmc_spi_writeblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
return status;
}
 
-   t->tx_buf += t->len;
if (host->dma_dev)
t->tx_dma += t->len;
 
@@ -809,6 +814,11 @@ mmc_spi_readblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
}
leftover = status << 1;
 
+   if (bitshift || host->mmc->use_spi_crc) {
+   if (IS_ERR(t->rx_buf))
+   return PTR_ERR(t->rx_buf);
+   }
+
if (host->dma_dev) {
dma_sync_single_for_device(host->dma_dev,
host->data_dma, sizeof(*scratch),
@@ -860,9 +870,10 @@ mmc_spi_readblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
scratch->crc_val, crc, t->len);
return -EILSEQ;
}
+
+   t->rx_buf += t->len;
}
 
-   t->rx_buf += t->len;
if (host->dma_dev)
t->rx_dma += t->len;
 
@@ -933,11 +944,11 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct 
mmc_command *cmd,
}
 
/* allow pio too; we don't allow highmem */
-   kmap_addr = kmap(sg_page(sg));
+   kmap_addr = sg_map(sg, 0, SG_KMAP);
if (direction == DMA_TO_DEVICE)
-   t->tx_buf = kmap_addr + sg->offset;
+   t->tx_buf = kmap_addr;
else
-   t->rx_buf = kmap_addr + sg->offset;
+   t->rx_buf = kmap_addr;
 
/* transfer each block, and update request status */
while (length) {
@@ -967,7 +978,8 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct 
mmc_command *cmd,
/* discard mappings */
if (direction == DMA_FROM_DEVICE)
flush_kernel_dcache_page(sg_page(sg));
-   kunmap(sg_page(sg));
+   if (!IS_ERR(kmap_addr))
+   sg_unmap(sg, kmap_addr, 0, SG_KMAP);
if (dma_dev)
dma_unmap_page(dma_dev, dma_addr, PAGE_SIZE, dir);
 
-- 
2.1.4



[PATCH v2 11/21] scsi: ipr, pmcraid, isci: Make use of the new sg_map helper

2017-04-25 Thread Logan Gunthorpe
Very straightforward conversion of three scsi drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Brian King <brk...@us.ibm.com>
Cc: Artur Paszkiewicz <artur.paszkiew...@intel.com>
---
 drivers/scsi/ipr.c  | 27 ++-
 drivers/scsi/isci/request.c | 42 +-
 drivers/scsi/pmcraid.c  | 19 ---
 3 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index b0c68d2..b2324e1 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -3895,7 +3895,7 @@ static void ipr_free_ucode_buffer(struct ipr_sglist 
*sglist)
 static int ipr_copy_ucode_buffer(struct ipr_sglist *sglist,
 u8 *buffer, u32 len)
 {
-   int bsize_elem, i, result = 0;
+   int bsize_elem, i;
struct scatterlist *scatterlist;
void *kaddr;
 
@@ -3905,32 +3905,33 @@ static int ipr_copy_ucode_buffer(struct ipr_sglist 
*sglist,
scatterlist = sglist->scatterlist;
 
for (i = 0; i < (len / bsize_elem); i++, buffer += bsize_elem) {
-   struct page *page = sg_page(&scatterlist[i]);
+   kaddr = sg_map(&scatterlist[i], 0, SG_KMAP);
+   if (IS_ERR(kaddr)) {
+   ipr_trace;
+   return PTR_ERR(kaddr);
+   }
 
-   kaddr = kmap(page);
memcpy(kaddr, buffer, bsize_elem);
-   kunmap(page);
+   sg_unmap(&scatterlist[i], kaddr, 0, SG_KMAP);
 
scatterlist[i].length = bsize_elem;
-
-   if (result != 0) {
-   ipr_trace;
-   return result;
-   }
}
 
if (len % bsize_elem) {
-   struct page *page = sg_page(&scatterlist[i]);
+   kaddr = sg_map(&scatterlist[i], 0, SG_KMAP);
+   if (IS_ERR(kaddr)) {
+   ipr_trace;
+   return PTR_ERR(kaddr);
+   }
 
-   kaddr = kmap(page);
memcpy(kaddr, buffer, len % bsize_elem);
-   kunmap(page);
+   sg_unmap(&scatterlist[i], kaddr, 0, SG_KMAP);
 
scatterlist[i].length = len % bsize_elem;
}
 
sglist->buffer_len = len;
-   return result;
+   return 0;
 }
 
 /**
diff --git a/drivers/scsi/isci/request.c b/drivers/scsi/isci/request.c
index 47f66e9..6f5521b 100644
--- a/drivers/scsi/isci/request.c
+++ b/drivers/scsi/isci/request.c
@@ -1424,12 +1424,14 @@ sci_stp_request_pio_data_in_copy_data_buffer(struct 
isci_stp_request *stp_req,
sg = task->scatter;
 
while (total_len > 0) {
-   struct page *page = sg_page(sg);
-
copy_len = min_t(int, total_len, sg_dma_len(sg));
-   kaddr = kmap_atomic(page);
-   memcpy(kaddr + sg->offset, src_addr, copy_len);
-   kunmap_atomic(kaddr);
+   kaddr = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(kaddr))
+   return SCI_FAILURE;
+
+   memcpy(kaddr, src_addr, copy_len);
+   sg_unmap(sg, kaddr, 0, SG_KMAP_ATOMIC);
+
total_len -= copy_len;
src_addr += copy_len;
sg = sg_next(sg);
@@ -1771,14 +1773,16 @@ sci_io_request_frame_handler(struct isci_request *ireq,
case SCI_REQ_SMP_WAIT_RESP: {
struct sas_task *task = isci_request_access_task(ireq);
struct scatterlist *sg = &ireq->smp_task.smp_resp;
-   void *frame_header, *kaddr;
+   void *frame_header;
u8 *rsp;
 
sci_unsolicited_frame_control_get_header(&ihost->uf_control,
 frame_index,
 &frame_header);
-   kaddr = kmap_atomic(sg_page(sg));
-   rsp = kaddr + sg->offset;
+   rsp = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(rsp))
+   return SCI_FAILURE;
+
sci_swab32_cpy(rsp, frame_header, 1);
 
if (rsp[0] == SMP_RESPONSE) {
@@ -1814,7 +1818,7 @@ sci_io_request_frame_handler(struct isci_request *ireq,
ireq->sci_status = SCI_FAILURE_CONTROLLER_SPECIFIC_IO_ERR;
sci_change_state(&ireq->sm, SCI_REQ_COMPLETED);
}
-   kunmap_atomic(kaddr);
+   sg_unmap(sg, rsp, 0, SG_KMAP_ATOMIC);
 
sci_controller_release_frame(ihost, frame_index);
 
@@ -2919,15 +2923,18 @@ static void isci_request_io_request_complete(struct 
isci_host *ihost,
case SAS_PROTOCOL_SMP: {
struct scatterlist *sg = &ireq->smp_task.smp_req;
struct smp_req *smp_req;
-   void *k

[PATCH v2 06/21] crypto: hifn_795x: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Conversion of a couple kmap_atomic instances to the sg_map helper
function.

However, it looks like there was a bug in the original code: the source
scatterlist's offset (t->offset) was passed to ablkcipher_get, which
added it to the destination address. This doesn't make a lot of
sense, but t->offset is likely always zero anyway. So, this patch cleans
that brokenness up.

Also, a change to the error path: if ablkcipher_get failed, everything
seemed to proceed as if it hadn't. Setting 'error' should hopefully
clear that up.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Herbert Xu <herb...@gondor.apana.org.au>
Cc: "David S. Miller" <da...@davemloft.net>
---
 drivers/crypto/hifn_795x.c | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index e09d405..34b1870 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -1619,7 +1619,7 @@ static int hifn_start_device(struct hifn_device *dev)
return 0;
 }
 
-static int ablkcipher_get(void *saddr, unsigned int *srestp, unsigned int 
offset,
+static int ablkcipher_get(void *saddr, unsigned int *srestp,
struct scatterlist *dst, unsigned int size, unsigned int 
*nbytesp)
 {
unsigned int srest = *srestp, nbytes = *nbytesp, copy;
@@ -1632,15 +1632,17 @@ static int ablkcipher_get(void *saddr, unsigned int 
*srestp, unsigned int offset
while (size) {
copy = min3(srest, dst->length, size);
 
-   daddr = kmap_atomic(sg_page(dst));
-   memcpy(daddr + dst->offset + offset, saddr, copy);
-   kunmap_atomic(daddr);
+   daddr = sg_map(dst, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(daddr))
+   return PTR_ERR(daddr);
+
+   memcpy(daddr, saddr, copy);
+   sg_unmap(dst, daddr, 0, SG_KMAP_ATOMIC);
 
nbytes -= copy;
size -= copy;
srest -= copy;
saddr += copy;
-   offset = 0;
 
pr_debug("%s: copy: %u, size: %u, srest: %u, nbytes: %u.\n",
 __func__, copy, size, srest, nbytes);
@@ -1671,11 +1673,12 @@ static inline void hifn_complete_sa(struct hifn_device 
*dev, int i)
 
 static void hifn_process_ready(struct ablkcipher_request *req, int error)
 {
+   int err;
struct hifn_request_context *rctx = ablkcipher_request_ctx(req);
 
if (rctx->walk.flags & ASYNC_FLAGS_MISALIGNED) {
unsigned int nbytes = req->nbytes;
-   int idx = 0, err;
+   int idx = 0;
struct scatterlist *dst, *t;
void *saddr;
 
@@ -1695,17 +1698,24 @@ static void hifn_process_ready(struct 
ablkcipher_request *req, int error)
continue;
}
 
-   saddr = kmap_atomic(sg_page(t));
+   saddr = sg_map(t, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(saddr)) {
+   if (!error)
+   error = PTR_ERR(saddr);
+   break;
+   }
+
+   err = ablkcipher_get(saddr, &t->length,
+dst, nbytes, &nbytes);
+   sg_unmap(t, saddr, 0, SG_KMAP_ATOMIC);
 
-   err = ablkcipher_get(saddr, &t->length, t->offset,
-   dst, nbytes, &nbytes);
if (err < 0) {
-   kunmap_atomic(saddr);
+   if (!error)
+   error = err;
break;
}
 
idx += err;
-   kunmap_atomic(saddr);
}
 
hifn_cipher_walk_exit(&rctx->walk);
-- 
2.1.4



[PATCH v2 15/21] xen-blkfront: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Straightforward conversion to the new helper, except due to the lack
of an error path, we have to use SG_MAP_MUST_NOT_FAIL which may BUG_ON in
certain cases in the future.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Boris Ostrovsky <boris.ostrov...@oracle.com>
Cc: Juergen Gross <jgr...@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.w...@oracle.com>
Cc: "Roger Pau Monné" <roger@citrix.com>
---
 drivers/block/xen-blkfront.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 3945963..ed62175 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -816,8 +816,9 @@ static int blkif_queue_rw_req(struct request *req, struct 
blkfront_ring_info *ri
BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
if (setup.need_copy) {
-   setup.bvec_off = sg->offset;
-   setup.bvec_data = kmap_atomic(sg_page(sg));
+   setup.bvec_off = 0;
+   setup.bvec_data = sg_map(sg, 0, SG_KMAP_ATOMIC |
+SG_MAP_MUST_NOT_FAIL);
}
 
gnttab_foreach_grant_in_range(sg_page(sg),
@@ -827,7 +828,7 @@ static int blkif_queue_rw_req(struct request *req, struct 
blkfront_ring_info *ri
  &setup);
 
if (setup.need_copy)
-   kunmap_atomic(setup.bvec_data);
+   sg_unmap(sg, setup.bvec_data, 0, SG_KMAP_ATOMIC);
}
if (setup.segments)
kunmap_atomic(setup.segments);
@@ -1053,7 +1054,7 @@ static int xen_translate_vdev(int vdevice, int *minor, 
unsigned int *offset)
case XEN_SCSI_DISK5_MAJOR:
case XEN_SCSI_DISK6_MAJOR:
case XEN_SCSI_DISK7_MAJOR:
-   *offset = (*minor / PARTS_PER_DISK) + 
+   *offset = (*minor / PARTS_PER_DISK) +
((major - XEN_SCSI_DISK1_MAJOR + 1) * 16) +
EMULATED_SD_DISK_NAME_OFFSET;
*minor = *minor +
@@ -1068,7 +1069,7 @@ static int xen_translate_vdev(int vdevice, int *minor, 
unsigned int *offset)
case XEN_SCSI_DISK13_MAJOR:
case XEN_SCSI_DISK14_MAJOR:
case XEN_SCSI_DISK15_MAJOR:
-   *offset = (*minor / PARTS_PER_DISK) + 
+   *offset = (*minor / PARTS_PER_DISK) +
((major - XEN_SCSI_DISK8_MAJOR + 8) * 16) +
EMULATED_SD_DISK_NAME_OFFSET;
*minor = *minor +
@@ -1119,7 +1120,7 @@ static int xlvbd_alloc_gendisk(blkif_sector_t capacity,
if (!VDEV_IS_EXTENDED(info->vdevice)) {
err = xen_translate_vdev(info->vdevice, , );
if (err)
-   return err; 
+   return err;
nr_parts = PARTS_PER_DISK;
} else {
minor = BLKIF_MINOR_EXT(info->vdevice);
@@ -1483,8 +1484,9 @@ static bool blkif_completion(unsigned long *id,
for_each_sg(s->sg, sg, num_sg, i) {
BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
-   data.bvec_offset = sg->offset;
-   data.bvec_data = kmap_atomic(sg_page(sg));
+   data.bvec_offset = 0;
+   data.bvec_data = sg_map(sg, 0, SG_KMAP_ATOMIC |
+   SG_MAP_MUST_NOT_FAIL);
 
gnttab_foreach_grant_in_range(sg_page(sg),
  sg->offset,
@@ -1492,7 +1494,7 @@ static bool blkif_completion(unsigned long *id,
  blkif_copy_from_grant,
   &data);
 
-   kunmap_atomic(data.bvec_data);
+   sg_unmap(sg, data.bvec_data, 0, SG_KMAP_ATOMIC);
}
}
/* Add the persistent grant into the list of free grants */
-- 
2.1.4



[PATCH v2 12/21] scsi: hisi_sas, mvsas, gdth: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Very straightforward conversion of three scsi drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Achim Leubner <achim_leub...@adaptec.com>
Cc: John Garry <john.ga...@huawei.com>
---
 drivers/scsi/gdth.c|  9 +++--
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 14 +-
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 13 +
 drivers/scsi/mvsas/mv_sas.c| 10 +-
 4 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index d020a13..c70248a2 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -2301,10 +2301,15 @@ static void gdth_copy_internal_data(gdth_ha_str *ha, 
Scsi_Cmnd *scp,
 return;
 }
 local_irq_save(flags);
-address = kmap_atomic(sg_page(sl)) + sl->offset;
+address = sg_map(sl, 0, SG_KMAP_ATOMIC);
+if (IS_ERR(address)) {
+scp->result = DID_ERROR << 16;
+return;
+   }
+
 memcpy(address, buffer, cpnow);
 flush_dcache_page(sg_page(sl));
-kunmap_atomic(address);
+sg_unmap(sl, address, 0, SG_KMAP_ATOMIC);
 local_irq_restore(flags);
 if (cpsum == cpcount)
 break;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index fc1c1b2..b3953e3 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -1381,18 +1381,22 @@ static int slot_complete_v1_hw(struct hisi_hba 
*hisi_hba,
void *to;
struct scatterlist *sg_resp = &task->smp_task.smp_resp;
 
-   ts->stat = SAM_STAT_GOOD;
-   to = kmap_atomic(sg_page(sg_resp));
+   to = sg_map(sg_resp, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(to)) {
+   dev_err(dev, "slot complete: error mapping memory");
+   ts->stat = SAS_SG_ERR;
+   break;
+   }
 
+   ts->stat = SAM_STAT_GOOD;
dma_unmap_sg(dev, &task->smp_task.smp_resp, 1,
 DMA_FROM_DEVICE);
dma_unmap_sg(dev, &task->smp_task.smp_req, 1,
 DMA_TO_DEVICE);
-   memcpy(to + sg_resp->offset,
-  slot->status_buffer +
+   memcpy(to, slot->status_buffer +
   sizeof(struct hisi_sas_err_record),
   sg_dma_len(sg_resp));
-   kunmap_atomic(to);
+   sg_unmap(sg_resp, to, 0, SG_KMAP_ATOMIC);
break;
}
case SAS_PROTOCOL_SATA:
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index e241921..3e674a4 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -2307,18 +2307,23 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct 
hisi_sas_slot *slot)
struct scatterlist *sg_resp = &task->smp_task.smp_resp;
void *to;
 
+   to = sg_map(sg_resp, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(to)) {
+   dev_err(dev, "slot complete: error mapping memory");
+   ts->stat = SAS_SG_ERR;
+   break;
+   }
+
ts->stat = SAM_STAT_GOOD;
-   to = kmap_atomic(sg_page(sg_resp));
 
dma_unmap_sg(dev, &task->smp_task.smp_resp, 1,
 DMA_FROM_DEVICE);
dma_unmap_sg(dev, &task->smp_task.smp_req, 1,
 DMA_TO_DEVICE);
-   memcpy(to + sg_resp->offset,
-  slot->status_buffer +
+   memcpy(to, slot->status_buffer +
   sizeof(struct hisi_sas_err_record),
   sg_dma_len(sg_resp));
-   kunmap_atomic(to);
+   sg_unmap(sg_resp, to, 0, SG_KMAP_ATOMIC);
break;
}
case SAS_PROTOCOL_SATA:
diff --git a/drivers/scsi/mvsas/mv_sas.c b/drivers/scsi/mvsas/mv_sas.c
index c7cc803..a72e0ce 100644
--- a/drivers/scsi/mvsas/mv_sas.c
+++ b/drivers/scsi/mvsas/mv_sas.c
@@ -1798,11 +1798,11 @@ int mvs_slot_complete(struct mvs_info *mvi, u32 
rx_desc, u32 flags)
case SAS_PROTOCOL_SMP: {
struct scatterlist *sg_resp = &task->smp_task.smp_resp;
tstat->stat = SAM_STAT_GOOD;
-   to = kmap_atomic(sg_page(sg_resp));
-   memcpy(to + sg_resp->offset,
-   slot->response + sizeof(struct mvs_err_info),
-   sg_dma_len(sg_resp));
-   kunmap_atomic(to);
+   to = sg_map(sg_resp, 0, SG_KMAP_ATOMI

[PATCH v2 13/21] scsi: arcmsr, ips, megaraid: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Very straightforward conversion of three scsi drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Adaptec OEM Raid Solutions <aacr...@adaptec.com>
Cc: Kashyap Desai <kashyap.de...@broadcom.com>
Cc: Sumit Saxena <sumit.sax...@broadcom.com>
Cc: Shivasharan S <shivasharan.srikanteshw...@broadcom.com>
---
 drivers/scsi/arcmsr/arcmsr_hba.c | 16 
 drivers/scsi/ips.c   |  8 
 drivers/scsi/megaraid.c  |  9 +++--
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index af032c4..8c2de17 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2306,7 +2306,10 @@ static int arcmsr_iop_message_xfer(struct 
AdapterControlBlock *acb,
 
use_sg = scsi_sg_count(cmd);
sg = scsi_sglist(cmd);
-   buffer = kmap_atomic(sg_page(sg)) + sg->offset;
+   buffer = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(buffer))
+   return ARCMSR_MESSAGE_FAIL;
+
if (use_sg > 1) {
retvalue = ARCMSR_MESSAGE_FAIL;
goto message_out;
@@ -2539,7 +2542,7 @@ static int arcmsr_iop_message_xfer(struct 
AdapterControlBlock *acb,
 message_out:
if (use_sg) {
struct scatterlist *sg = scsi_sglist(cmd);
-   kunmap_atomic(buffer - sg->offset);
+   sg_unmap(sg, buffer, 0, SG_KMAP_ATOMIC);
}
return retvalue;
 }
@@ -2590,11 +2593,16 @@ static void arcmsr_handle_virtual_command(struct 
AdapterControlBlock *acb,
strncpy([32], "R001", 4); /* Product Revision */
 
sg = scsi_sglist(cmd);
-   buffer = kmap_atomic(sg_page(sg)) + sg->offset;
+   buffer = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(buffer)) {
+   cmd->result = (DID_ERROR << 16);
+   cmd->scsi_done(cmd);
+   return;
+   }
 
memcpy(buffer, inqdata, sizeof(inqdata));
sg = scsi_sglist(cmd);
-   kunmap_atomic(buffer - sg->offset);
+   sg_unmap(sg, buffer, 0, SG_KMAP_ATOMIC);
 
cmd->scsi_done(cmd);
}
diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 3419e1b..6e91729 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -1506,14 +1506,14 @@ static int ips_is_passthru(struct scsi_cmnd *SC)
 /* kmap_atomic() ensures addressability of the user buffer.*/
 /* local_irq_save() protects the KM_IRQ0 address slot. */
 local_irq_save(flags);
-buffer = kmap_atomic(sg_page(sg)) + sg->offset;
-if (buffer && buffer[0] == 'C' && buffer[1] == 'O' &&
+buffer = sg_map(sg, 0, SG_KMAP_ATOMIC);
+if (!IS_ERR(buffer) && buffer[0] == 'C' && buffer[1] == 'O' &&
 buffer[2] == 'P' && buffer[3] == 'P') {
-kunmap_atomic(buffer - sg->offset);
+sg_unmap(sg, buffer, 0, SG_KMAP_ATOMIC);
 local_irq_restore(flags);
 return 1;
 }
-kunmap_atomic(buffer - sg->offset);
+sg_unmap(sg, buffer, 0, SG_KMAP_ATOMIC);
 local_irq_restore(flags);
}
return 0;
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 3c63c29..f8aee59 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -663,10 +663,15 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int 
*busy)
struct scatterlist *sg;
 
sg = scsi_sglist(cmd);
-   buf = kmap_atomic(sg_page(sg)) + sg->offset;
+   buf = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
+cmd->result = (DID_ERROR << 16);
+   cmd->scsi_done(cmd);
+   return NULL;
+   }
 
memset(buf, 0, cmd->cmnd[4]);
-   kunmap_atomic(buf - sg->offset);
+   sg_unmap(sg, buf, 0, SG_KMAP_ATOMIC);
 
cmd->result = (DID_OK << 16);
cmd->scsi_done(cmd);
-- 
2.1.4



[PATCH v2 19/21] mmc: sdricoh_cs: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
This is a straightforward conversion to the new function.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Sascha Sommer <saschasom...@freenet.de>
Cc: Ulf Hansson <ulf.hans...@linaro.org>
---
 drivers/mmc/host/sdricoh_cs.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdricoh_cs.c b/drivers/mmc/host/sdricoh_cs.c
index 5ff26ab..03225c3 100644
--- a/drivers/mmc/host/sdricoh_cs.c
+++ b/drivers/mmc/host/sdricoh_cs.c
@@ -319,16 +319,20 @@ static void sdricoh_request(struct mmc_host *mmc, struct 
mmc_request *mrq)
for (i = 0; i < data->blocks; i++) {
size_t len = data->blksz;
u8 *buf;
-   struct page *page;
int result;
-   page = sg_page(data->sg);
 
-   buf = kmap(page) + data->sg->offset + (len * i);
+   buf = sg_map(data->sg, (len * i), SG_KMAP);
+   if (IS_ERR(buf)) {
+   cmd->error = PTR_ERR(buf);
+   break;
+   }
+
result =
sdricoh_blockio(host,
data->flags & MMC_DATA_READ, buf, len);
-   kunmap(page);
-   flush_dcache_page(page);
+   sg_unmap(data->sg, buf, (len * i), SG_KMAP);
+
+   flush_dcache_page(sg_page(data->sg));
if (result) {
dev_err(dev, "sdricoh_request: cmd %i "
"block transfer failed\n", cmd->opcode);
-- 
2.1.4



[PATCH v2 20/21] mmc: tifm_sd: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
This conversion is a bit complicated. We modify the read_fifo,
write_fifo and copy_page functions to take a scatterlist instead of a
page so that we can use sg_map instead of kmap_atomic. A bit of offset
accounting is needed for this to work: sg_map adds sg->offset itself,
but the offset has already been added and used earlier in the code, so
the callers now subtract it back out.

There's also no error path, so we use SG_MAP_MUST_NOT_FAIL which may
BUG_ON in certain cases in the future.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Alex Dubov <oa...@yahoo.com>
Cc: Ulf Hansson <ulf.hans...@linaro.org>
---
 drivers/mmc/host/tifm_sd.c | 50 +++---
 1 file changed, 29 insertions(+), 21 deletions(-)

diff --git a/drivers/mmc/host/tifm_sd.c b/drivers/mmc/host/tifm_sd.c
index 93c4b40..e64345a 100644
--- a/drivers/mmc/host/tifm_sd.c
+++ b/drivers/mmc/host/tifm_sd.c
@@ -111,14 +111,16 @@ struct tifm_sd {
 };
 
 /* for some reason, host won't respond correctly to readw/writew */
-static void tifm_sd_read_fifo(struct tifm_sd *host, struct page *pg,
+static void tifm_sd_read_fifo(struct tifm_sd *host, struct scatterlist *sg,
  unsigned int off, unsigned int cnt)
 {
struct tifm_dev *sock = host->dev;
unsigned char *buf;
unsigned int pos = 0, val;
 
-   buf = kmap_atomic(pg) + off;
+   buf = sg_map(sg, off - sg->offset,
+SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
+
if (host->cmd_flags & DATA_CARRY) {
buf[pos++] = host->bounce_buf_data[0];
host->cmd_flags &= ~DATA_CARRY;
@@ -134,17 +136,19 @@ static void tifm_sd_read_fifo(struct tifm_sd *host, 
struct page *pg,
}
buf[pos++] = (val >> 8) & 0xff;
}
-   kunmap_atomic(buf - off);
+   sg_unmap(sg, buf, off - sg->offset, SG_KMAP_ATOMIC);
 }
 
-static void tifm_sd_write_fifo(struct tifm_sd *host, struct page *pg,
+static void tifm_sd_write_fifo(struct tifm_sd *host, struct scatterlist *sg,
   unsigned int off, unsigned int cnt)
 {
struct tifm_dev *sock = host->dev;
unsigned char *buf;
unsigned int pos = 0, val;
 
-   buf = kmap_atomic(pg) + off;
+   buf = sg_map(sg, off - sg->offset,
+SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
+
if (host->cmd_flags & DATA_CARRY) {
val = host->bounce_buf_data[0] | ((buf[pos++] << 8) & 0xff00);
writel(val, sock->addr + SOCK_MMCSD_DATA);
@@ -161,7 +165,7 @@ static void tifm_sd_write_fifo(struct tifm_sd *host, struct 
page *pg,
val |= (buf[pos++] << 8) & 0xff00;
writel(val, sock->addr + SOCK_MMCSD_DATA);
}
-   kunmap_atomic(buf - off);
+   sg_unmap(sg, buf, off - sg->offset, SG_KMAP_ATOMIC);
 }
 
 static void tifm_sd_transfer_data(struct tifm_sd *host)
@@ -170,7 +174,6 @@ static void tifm_sd_transfer_data(struct tifm_sd *host)
struct scatterlist *sg = r_data->sg;
unsigned int off, cnt, t_size = TIFM_MMCSD_FIFO_SIZE * 2;
unsigned int p_off, p_cnt;
-   struct page *pg;
 
if (host->sg_pos == host->sg_len)
return;
@@ -192,33 +195,39 @@ static void tifm_sd_transfer_data(struct tifm_sd *host)
}
off = sg[host->sg_pos].offset + host->block_pos;
 
-   pg = nth_page(sg_page([host->sg_pos]), off >> PAGE_SHIFT);
p_off = offset_in_page(off);
p_cnt = PAGE_SIZE - p_off;
p_cnt = min(p_cnt, cnt);
p_cnt = min(p_cnt, t_size);
 
if (r_data->flags & MMC_DATA_READ)
-   tifm_sd_read_fifo(host, pg, p_off, p_cnt);
+   tifm_sd_read_fifo(host, [host->sg_pos], p_off,
+ p_cnt);
else if (r_data->flags & MMC_DATA_WRITE)
-   tifm_sd_write_fifo(host, pg, p_off, p_cnt);
+   tifm_sd_write_fifo(host, [host->sg_pos], p_off,
+  p_cnt);
 
t_size -= p_cnt;
host->block_pos += p_cnt;
}
 }
 
-static void tifm_sd_copy_page(struct page *dst, unsigned int dst_off,
- struct page *src, unsigned int src_off,
+static void tifm_sd_copy_page(struct scatterlist *dst, unsigned int dst_off,
+ struct scatterlist *src, unsigned int src_off,
  unsigned int count)
 {
-   unsigned char *src_buf = kmap_atomic(src) + src_off;
-   unsigned char *dst_buf = kmap_atomic(dst) + dst_off;
+   unsigned char *src_buf, *dst_buf;
+
+   src_off -= src->offset;
+   dst_off -= dst->offset;
+
+   src_buf = sg_map(src, src_off, 

[PATCH v2 10/21] RDS: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
Straightforward conversion except there's no error path, so we
make use of SG_MAP_MUST_NOT_FAIL which may BUG_ON in certain cases
in the future.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Santosh Shilimkar <santosh.shilim...@oracle.com>
Cc: "David S. Miller" <da...@davemloft.net>
---
 net/rds/ib_recv.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index e10624a..c665689 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -800,10 +800,10 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
 
to_copy = min(RDS_FRAG_SIZE - frag_off, PAGE_SIZE - map_off);
BUG_ON(to_copy & 7); /* Must be 64bit aligned. */
+   addr = sg_map(>f_sg, 0,
+ SG_KMAP_ATOMIC | SG_MAP_MUST_NOT_FAIL);
 
-   addr = kmap_atomic(sg_page(>f_sg));
-
-   src = addr + frag->f_sg.offset + frag_off;
+   src = addr + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
@@ -811,7 +811,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
uncongested |= ~(*src) & *dst;
*dst++ = *src++;
}
-   kunmap_atomic(addr);
+   sg_unmap(>f_sg, addr, 0, SG_KMAP_ATOMIC);
 
copied += to_copy;
 
-- 
2.1.4



[PATCH v2 01/21] scatterlist: Introduce sg_map helper functions

2017-04-25 Thread Logan Gunthorpe
This patch introduces functions which kmap the pages inside an sgl.
These functions replace a common pattern of kmap(sg_page(sg)) that is
used in more than 50 places within the kernel.

The motivation for this work is to eventually safely support sgls that
contain io memory. In order for that to work, any access to the contents
of an iomem SGL will need to be done with iomemcpy or hit some warning.
(The exact details of how this will work have yet to be worked out.)
Having all the kmaps in one place is just a first step in that
direction. Additionally, since this cuts down on the number of sg_page
users, it should make any effort to go to struct-page-less DMAs a
little easier (should that idea ever swing back into favour again).

A flags option is added to select between a regular or atomic mapping so
these functions can replace both kmap(sg_page(...)) and
kmap_atomic(sg_page(...)) call sites.
Future work may expand this to have flags for using page_address or
vmap. We include a flag to require the function not to fail to
support legacy code that has no easy error path. Much further in the
future, there may be a flag to allocate memory and copy the data
from/to iomem.

We also add the semantic that sg_map can fail to create a mapping, even
though the code it replaces is assumed never to fail and the current
implementation of these functions cannot fail. This is to support iomem,
which may either have to fail to create the mapping or allocate memory
as a bounce buffer, and that allocation itself can fail.

Also, in terms of cleanup, a few of the existing kmap(sg_page) users
play things a bit loose in terms of whether they apply sg->offset, so
using these helper functions should help avoid such issues.
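
For illustration, a typical call-site conversion ends up looking roughly
like this (sketch only, not taken from any particular driver):

	/* Before: */
	buf = kmap_atomic(sg_page(sg)) + sg->offset;
	memcpy(buf, data, len);
	kunmap_atomic(buf - sg->offset);

	/* After: */
	buf = sg_map(sg, 0, SG_KMAP_ATOMIC);
	if (IS_ERR(buf))
		return PTR_ERR(buf);
	memcpy(buf, data, len);
	sg_unmap(sg, buf, 0, SG_KMAP_ATOMIC);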

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 include/linux/scatterlist.h | 85 +
 1 file changed, 85 insertions(+)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe..fad170b 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct scatterlist {
@@ -126,6 +127,90 @@ static inline struct page *sg_page(struct scatterlist *sg)
return (struct page *)((sg)->page_link & ~0x3);
 }
 
+#define SG_KMAP (1 << 0)   /* create a mapping with kmap */
+#define SG_KMAP_ATOMIC  (1 << 1)   /* create a mapping with kmap_atomic */
+#define SG_MAP_MUST_NOT_FAIL (1 << 2)  /* indicate sg_map should not fail */
+
+/**
+ * sg_map - kmap a page inside an sgl
+ * @sg:SG entry
+ * @offset:Offset into entry
+ * @flags: Flags for creating the mapping
+ *
+ * Description:
+ *   Use this function to map a page in the scatterlist at the specified
+ *   offset. sg->offset is already added for you. Note: the semantics of
+ *   this function are that it may fail. Thus, its output should be checked
+ *   with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
+ *   in the mapped page is returned.
+ *
+ *   Flags can be any of:
+ * * SG_KMAP   - Use kmap to create the mapping
+ * * SG_KMAP_ATOMIC- Use kmap_atomic to map the page atomically.
+ *   Thus, the rules of that function apply: the
+ *   cpu may not sleep until it is unmapped.
+ * * SG_MAP_MUST_NOT_FAIL  - Indicate that sg_map must not fail.
+ *   If it does, it will issue a BUG_ON instead.
+ *   This is intended for legacy code only, it
+ *   is not to be used in new code.
+ *
+ *   Also, consider carefully whether this function is appropriate. It is
+ *   largely not recommended for new code and if the sgl came from another
+ *   subsystem and you don't know what kind of memory might be in the list
+ *   then you definitely should not call it. Non-mappable memory may be in
+ *   the sgl and thus this function may fail unexpectedly. Consider using
+ *   sg_copy_to_buffer instead.
+ **/
+static inline void *sg_map(struct scatterlist *sg, size_t offset, int flags)
+{
+   struct page *pg;
+   unsigned int pg_off;
+   void *ret;
+
+   offset += sg->offset;
+   pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+   pg_off = offset_in_page(offset);
+
+   if (flags & SG_KMAP_ATOMIC)
+   ret = kmap_atomic(pg) + pg_off;
+   else if (flags & SG_KMAP)
+   ret = kmap(pg) + pg_off;
+   else
+   ret = ERR_PTR(-EINVAL);
+
+   /*
+* In theory, this can't happen yet. Once we start adding
+* unmappable memory, it also shouldn't happen unless developers
+* start putting unmappable struct pages in sgls and passing
+* it to code that doesn't support it.
+*/
+   BUG_ON(flags & SG_MAP_MUST_NOT_FAIL && IS_ERR(ret));
+
+  

[PATCH v2 08/21] dm-crypt: Make use of the new sg_map helper in 4 call sites

2017-04-25 Thread Logan Gunthorpe
Very straightforward conversion to the new function in all four spots.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Cc: Alasdair Kergon <a...@redhat.com>
Cc: Mike Snitzer <snit...@redhat.com>
---
 drivers/md/dm-crypt.c | 39 ++-
 1 file changed, 26 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 8dbecf1..841f1fc 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -635,9 +635,12 @@ static int crypt_iv_lmk_gen(struct crypt_config *cc, u8 
*iv,
 
if (bio_data_dir(dmreq->ctx->bio_in) == WRITE) {
sg = crypt_get_sg_data(cc, dmreq->sg_in);
-   src = kmap_atomic(sg_page(sg));
-   r = crypt_iv_lmk_one(cc, iv, dmreq, src + sg->offset);
-   kunmap_atomic(src);
+   src = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(src))
+   return PTR_ERR(src);
+
+   r = crypt_iv_lmk_one(cc, iv, dmreq, src);
+   sg_unmap(sg, src, 0, SG_KMAP_ATOMIC);
} else
memset(iv, 0, cc->iv_size);
 
@@ -655,14 +658,18 @@ static int crypt_iv_lmk_post(struct crypt_config *cc, u8 
*iv,
return 0;
 
sg = crypt_get_sg_data(cc, dmreq->sg_out);
-   dst = kmap_atomic(sg_page(sg));
-   r = crypt_iv_lmk_one(cc, iv, dmreq, dst + sg->offset);
+   dst = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(dst))
+   return PTR_ERR(dst);
+
+   r = crypt_iv_lmk_one(cc, iv, dmreq, dst);
 
/* Tweak the first block of plaintext sector */
if (!r)
-   crypto_xor(dst + sg->offset, iv, cc->iv_size);
+   crypto_xor(dst, iv, cc->iv_size);
+
+   sg_unmap(sg, dst, 0, SG_KMAP_ATOMIC);
 
-   kunmap_atomic(dst);
return r;
 }
 
@@ -786,9 +793,12 @@ static int crypt_iv_tcw_gen(struct crypt_config *cc, u8 
*iv,
/* Remove whitening from ciphertext */
if (bio_data_dir(dmreq->ctx->bio_in) != WRITE) {
sg = crypt_get_sg_data(cc, dmreq->sg_in);
-   src = kmap_atomic(sg_page(sg));
-   r = crypt_iv_tcw_whitening(cc, dmreq, src + sg->offset);
-   kunmap_atomic(src);
+   src = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(src))
+   return PTR_ERR(src);
+
+   r = crypt_iv_tcw_whitening(cc, dmreq, src);
+   sg_unmap(sg, src, 0, SG_KMAP_ATOMIC);
}
 
/* Calculate IV */
@@ -812,9 +822,12 @@ static int crypt_iv_tcw_post(struct crypt_config *cc, u8 
*iv,
 
/* Apply whitening on ciphertext */
sg = crypt_get_sg_data(cc, dmreq->sg_out);
-   dst = kmap_atomic(sg_page(sg));
-   r = crypt_iv_tcw_whitening(cc, dmreq, dst + sg->offset);
-   kunmap_atomic(dst);
+   dst = sg_map(sg, 0, SG_KMAP_ATOMIC);
+   if (IS_ERR(dst))
+   return PTR_ERR(dst);
+
+   r = crypt_iv_tcw_whitening(cc, dmreq, dst);
+   sg_unmap(sg, dst, 0, SG_KMAP_ATOMIC);
 
return r;
 }
-- 
2.1.4



[PATCH v2 05/21] drm/i915: Make use of the new sg_map helper function

2017-04-25 Thread Logan Gunthorpe
This is a single straightforward conversion from kmap to sg_map.

We also create the i915_gem_object_unmap function to common up the
unmap code.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
Acked-by: Daniel Vetter <daniel.vet...@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 07e9b27..2c33000 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2202,6 +2202,15 @@ static void __i915_gem_object_reset_page_iter(struct 
drm_i915_gem_object *obj)
radix_tree_delete(>mm.get_page.radix, iter.index);
 }
 
+static void i915_gem_object_unmap(const struct drm_i915_gem_object *obj,
+ void *ptr)
+{
+   if (is_vmalloc_addr(ptr))
+   vunmap(ptr);
+   else
+   sg_unmap(obj->mm.pages->sgl, ptr, 0, SG_KMAP);
+}
+
 void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 enum i915_mm_subclass subclass)
 {
@@ -2229,10 +2238,7 @@ void __i915_gem_object_put_pages(struct 
drm_i915_gem_object *obj,
void *ptr;
 
ptr = ptr_mask_bits(obj->mm.mapping);
-   if (is_vmalloc_addr(ptr))
-   vunmap(ptr);
-   else
-   kunmap(kmap_to_page(ptr));
+   i915_gem_object_unmap(obj, ptr);
 
obj->mm.mapping = NULL;
}
@@ -2499,8 +2505,11 @@ static void *i915_gem_object_map(const struct 
drm_i915_gem_object *obj,
void *addr;
 
/* A single page can always be kmapped */
-   if (n_pages == 1 && type == I915_MAP_WB)
-   return kmap(sg_page(sgt->sgl));
+   if (n_pages == 1 && type == I915_MAP_WB) {
+   addr = sg_map(sgt->sgl, 0, SG_KMAP);
+   if (IS_ERR(addr))
+   return NULL;
+   }
 
if (n_pages > ARRAY_SIZE(stack_pages)) {
/* Too big for stack -- allocate temporary array instead */
@@ -2567,11 +2576,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object 
*obj,
goto err_unpin;
}
 
-   if (is_vmalloc_addr(ptr))
-   vunmap(ptr);
-   else
-   kunmap(kmap_to_page(ptr));
-
+   i915_gem_object_unmap(obj, ptr);
ptr = obj->mm.mapping = NULL;
}
 
-- 
2.1.4



[PATCH v2 02/21] libiscsi: Add an internal error code

2017-04-25 Thread Logan Gunthorpe
This is a prep patch to add a new error code to libiscsi. We want to
rework some kmap calls to be able to fail. When we do, we'd like to
use this error code.

This patch simply introduces ISCSI_TCP_INTERNAL_ERR and prints
"Internal Error." when it gets hit.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/scsi/cxgbi/libcxgbi.c | 5 +
 include/scsi/libiscsi_tcp.h   | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index bd7d39e..e38d0c1 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -1556,6 +1556,11 @@ static inline int read_pdu_skb(struct iscsi_conn *conn,
 */
iscsi_conn_printk(KERN_ERR, conn, "Invalid pdu or skb.");
return -EFAULT;
+   case ISCSI_TCP_INTERNAL_ERR:
+   pr_info("skb 0x%p, off %u, %d, TCP_INTERNAL_ERR.\n",
+   skb, offset, offloaded);
+   iscsi_conn_printk(KERN_ERR, conn, "Internal error.");
+   return -EFAULT;
case ISCSI_TCP_SEGMENT_DONE:
log_debug(1 << CXGBI_DBG_PDU_RX,
"skb 0x%p, off %u, %d, TCP_SEG_DONE, rc %d.\n",
diff --git a/include/scsi/libiscsi_tcp.h b/include/scsi/libiscsi_tcp.h
index 30520d5..90691ad 100644
--- a/include/scsi/libiscsi_tcp.h
+++ b/include/scsi/libiscsi_tcp.h
@@ -92,6 +92,7 @@ enum {
ISCSI_TCP_SKB_DONE, /* skb is out of data */
ISCSI_TCP_CONN_ERR, /* iscsi layer has fired a conn err */
ISCSI_TCP_SUSPENDED,/* conn is suspended */
+   ISCSI_TCP_INTERNAL_ERR, /* an internal error occurred */
 };
 
 extern void iscsi_tcp_hdr_recv_prep(struct iscsi_tcp_conn *tcp_conn);
-- 
2.1.4



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-25 Thread Logan Gunthorpe


On 25/04/17 12:30 AM, Knut Omang wrote:
> Yes, that's why I used 'significant'. One good thing is that given resources 
> it can easily be done in parallel with other development, and will give 
> additional
> insight of some form.

Yup, well if someone wants to start working on an emulated RDMA device
that actually simulates proper DMA transfers that would be great!

>> I also imagine it would be quite difficult to develop those models
>> given the array of hardware that needs to be supported and the deep
>> functional knowledge required to figure out appropriate restrictions.
> 
> From my naive perspective it seems it need not even be a full model to get 
> some benefits,
> just low level functionality tests with some instances of a
> device that offers some MMIO space 'playground'.

Yes, the nvme device in qemu has a CMB buffer, which is a good choice to
test with, but we don't have code to use it for p2p transfers in the
kernel, so it is a bit awkward. We also posted [1] to qemu, which was
handy to test with using a small out-of-tree module. But I don't see a
lot of value in it if you completely ignore any hardware quirks.

Logan

[1] https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg04331.html


Re: [RFC 1/8] Introduce Peer-to-Peer memory (p2pmem) device

2017-04-25 Thread Logan Gunthorpe


On 25/04/17 05:58 AM, Marta Rybczynska wrote:
> I would add one issue that doesn't seem to be addressed: in my experience
> P2P doesn't work when IOMMU activated. It works best with deactivation at
> the BIOS level, even the kernel options are not enough in some cases.

Well this would likely be addressed by 'arch_p2p_cross_segment' as
proposed by Jason.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-24 Thread Logan Gunthorpe


On 24/04/17 01:36 AM, Knut Omang wrote:
> My first reflex when reading this thread was to think that this whole domain
> lends it self excellently to testing via Qemu. Could it be that doing this in 
> the opposite direction might be a safer approach in the long run even though 
> (significant) more work up-front?

That's an interesting idea. We did do some very limited testing on qemu
with one iteration of our work. However, it's difficult because there is
no support for any RDMA devices which are a part of our primary use
case. I also imagine it would be quite difficult to develop those models
given the array of hardware that needs to be supported and the deep
functional knowledge required to figure out appropriate restrictions.

Logan


[PATCH] libiscsi: Add an internal error code

2017-04-21 Thread Logan Gunthorpe
This is a prep patch to add a new error code to libiscsi. We want to
rework some kmap calls to be able to fail. When we do, we'd like to
use this error code.

This patch simply introduces ISCSI_TCP_INTERNAL_ERR and prints
"Internal Error." when it gets hit.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---

Please see these links for additional context.

https://patchwork.kernel.org/patch/9680329/
https://patchwork.kernel.org/patch/9680091/

Thanks,

Logan

 drivers/scsi/cxgbi/libcxgbi.c | 5 +
 include/scsi/libiscsi_tcp.h   | 1 +
 2 files changed, 6 insertions(+)

diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index bd7d39e..e38d0c1 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -1556,6 +1556,11 @@ static inline int read_pdu_skb(struct iscsi_conn *conn,
 */
iscsi_conn_printk(KERN_ERR, conn, "Invalid pdu or skb.");
return -EFAULT;
+   case ISCSI_TCP_INTERNAL_ERR:
+   pr_info("skb 0x%p, off %u, %d, TCP_INTERNAL_ERR.\n",
+   skb, offset, offloaded);
+   iscsi_conn_printk(KERN_ERR, conn, "Internal error.");
+   return -EFAULT;
case ISCSI_TCP_SEGMENT_DONE:
log_debug(1 << CXGBI_DBG_PDU_RX,
"skb 0x%p, off %u, %d, TCP_SEG_DONE, rc %d.\n",
diff --git a/include/scsi/libiscsi_tcp.h b/include/scsi/libiscsi_tcp.h
index 30520d5..90691ad 100644
--- a/include/scsi/libiscsi_tcp.h
+++ b/include/scsi/libiscsi_tcp.h
@@ -92,6 +92,7 @@ enum {
ISCSI_TCP_SKB_DONE, /* skb is out of data */
ISCSI_TCP_CONN_ERR, /* iscsi layer has fired a conn err */
ISCSI_TCP_SUSPENDED,/* conn is suspended */
+   ISCSI_TCP_INTERNAL_ERR, /* an internal error occurred */
 };

 extern void iscsi_tcp_hdr_recv_prep(struct iscsi_tcp_conn *tcp_conn);
--
2.1.4


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 02:48 PM, Jason Gunthorpe wrote:
> On Wed, Apr 19, 2017 at 01:41:49PM -0600, Logan Gunthorpe wrote:
> 
>>> But.. it could point to a GPU and the GPU struct device could have a
>>> proxy dma_ops like Dan pointed out.
>>
>> Seems a bit awkward to me that in order for the intended use case, you
>> have to proxy the dma_ops. I'd probably still suggest throwing a couple
>> ops for things like this in the dev_pagemap.
> 
> Another option is adding a new 'struct completer_dma_ops *' to struct
> device for this use case.
> 
> Seems like a waste to expand dev_pagemap when we only need a unique
> value per struct device?

I feel like expanding dev_pagemap has a much lower impact than expanding
struct device... dev_pagemap is only one instance per zone device region
so expanding it shouldn't be a huge issue. Expanding struct device means
every device struct in the system gets bigger.

Logan



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 01:31 PM, Jason Gunthorpe wrote:
> Try it with VT-D turned on. It shouldn't work or there is a notable
> security hole in your platform..

Ah, ok.

>>>  const struct dma_map_ops *comp_ops = get_dma_ops(completer);
>>>  const struct dma_map_ops *init_ops = get_dma_ops(initiator);
>>
>> So, in this case, what device does the completer point to? The PCI
>> device or a more specific GPU device? If it's the former, who's
>> responsible for setting the new dma_ops? Typically the dma_ops are arch
>> specific but now you'd be adding ones that are tied to hmm or the gpu.
> 
> Donno, that is for GPU folks to figure out :)
> 
> But.. it could point to a GPU and the GPU struct device could have a
> proxy dma_ops like Dan pointed out.

Seems a bit awkward to me that in order for the intended use case, you
have to proxy the dma_ops. I'd probably still suggest throwing a couple
ops for things like this in the dev_pagemap.

>> It appears to me like it's calculating the DMA address, and the check is
>> just a side requirement. It reads as though it's only doing the check.
> 
> pci_p2p_same_segment_get_pa() then?

Ok, I think that's a bit clearer.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 12:32 PM, Jason Gunthorpe wrote:
> On Wed, Apr 19, 2017 at 12:01:39PM -0600, Logan Gunthorpe wrote:
> Not entirely, it would have to call through the whole process
> including the arch_p2p_cross_segment()..

Hmm, yes. Though it's still not clear what, if anything,
arch_p2p_cross_segment would be doing. In my experience, if you are
going between host bridges, the CPU address (or PCI address -- I'm not
sure which seeing they are the same on my system) would still work fine
-- it just _may_ be a bad idea because of performance.

Similarly if you are crossing via a QPI bus or similar, I'd expect the
CPU address to work fine. But here the performance is even less likely
to be any good.


>  // Try the arch specific helper
>const struct dma_map_ops *comp_ops = get_dma_ops(completer);
>const struct dma_map_ops *init_ops = get_dma_ops(initiator);

So, in this case, what device does the completer point to? The PCI
device or a more specific GPU device? If it's the former, who's
responsible for setting the new dma_ops? Typically the dma_ops are arch
specific but now you'd be adding ones that are tied to hmm or the gpu.

>> I'm not sure I like the name pci_p2p_same_segment. It reads as though
>> it's only checking if the devices are not the same segment.
> 
> Well, that is exactly what it is doing. If it succeeds then the caller
> knows the DMA will not flow outside the segment and no iommu setup/etc
> is required.

It appears to me like it's calculating the DMA address, and the check is
just a side requirement. It reads as though it's only doing the check.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 12:30 PM, Dan Williams wrote:
> Letting others users do the container_of() arrangement means that
> struct page_map needs to become public and move into struct
> dev_pagemap directly.

Ah, yes, I got a bit turned around by that and failed to notice that
page_map and dev_pagemap are different. Why is it that dev_pagemap
contains pretty much the exact same information as page_map? The only
thing gained that I can see is that the struct resource gains const
protection...

> ...I think that encapsulation loss is worth it for the gain of clearly
> separating the HMM-case from the base case.

Agreed.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 12:11 PM, Logan Gunthorpe wrote:
> 
> 
> On 19/04/17 11:41 AM, Dan Williams wrote:
>> No, not quite ;-). I still don't think we should require the non-HMM
>> to pass NULL for all the HMM arguments. What I like about Logan's
>> proposal is to have a separate create and register steps dev_pagemap.
>> That way call paths that don't care about HMM specifics can just turn
>> around and register the vanilla dev_pagemap.
> 
> Would you necessarily even need a create step? I was thinking more along
> the lines that struct dev_pagemap _could_ just be a member in another
> structure. The caller would set the attributes they needed and pass it
> to devm_memremap. (Similar to how we commonly do things with struct
> device, et al). Potentially, that could also get rid of the need for the
> *data pointer HMM is using to get back the struct hmm_devmem seeing
> container_of could be used instead.

Also, now that I've thought about it a little more, it _may_ be that
many or all of the hmm specific fields in dev_pagemap could move to a
containing struct too...

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 11:41 AM, Dan Williams wrote:
> No, not quite ;-). I still don't think we should require the non-HMM
> to pass NULL for all the HMM arguments. What I like about Logan's
> proposal is to have a separate create and register steps dev_pagemap.
> That way call paths that don't care about HMM specifics can just turn
> around and register the vanilla dev_pagemap.

Would you necessarily even need a create step? I was thinking more along
the lines that struct dev_pagemap _could_ just be a member in another
structure. The caller would set the attributes they needed and pass it
to devm_memremap. (Similar to how we commonly do things with struct
device, et al). Potentially, that could also get rid of the need for the
*data pointer HMM is using to get back the struct hmm_devmem seeing
container_of could be used instead.

>> Note i won't make any change now on that front but if it make sense
>> i am happy to do it as separate patchset on top of HMM.
>>
>> Also i don't want p2pmem to be an exclusive or with HMM, we will want
>> GPU to do peer to peer DMA and thus HMM ZONE_DEVICE pages to support
>> this too.
> 
> Yes, this makes sense I think we really just want to distinguish host
> memory or not in terms of the dev_pagemap type.

That depends though. If unaddressable memory needs different steps to
get to the DMA address (ie if it has to setup a gpu window) then we may
still need a way to distinguish the two types of non-host memory.

Logan



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 11:14 AM, Jason Gunthorpe wrote:
> I don't see a use for the dma_map function pointer at this point..

Yes, it is kind of like designing for the future. I just find it a
little odd calling the pci functions in the iommu.


> It doesn't make alot of sense for the completor of the DMA to provide
> a mapping op, the mapping process is *path* specific, not specific to
> a completer/initiator.

I'm just spitballing here, but if HMM wanted to use unaddressable memory
as a DMA target, it could set that function to create a window in gpu
memory, then call pci_p2p_same_segment and return the result as the
dma address.


> dma_addr_t pci_p2p_same_segment(struct device *initator,
> struct device *completer,
>   struct page *page)

I'm not sure I like the name pci_p2p_same_segment. It reads as though
it's only checking if the devices are not the same segment. It also may
be that, in the future, it supports devices on different segments. I'd
call it more like pci_p2p_dma_map.


> for (each sgl) {

Thanks, this code fleshes things out nicely


> To me it looks like the code duplication across the iommu stuff comes
> from just duplicating the basic iommu algorithm in every driver.

Yes, this is true.

> To clean that up I think someone would need to hoist the overall sgl
> loop and use more ops callbacks eg allocate_iommu_range,
> assign_page_to_rage, dealloc_range, etc. This is a problem p2p makes
> worse, but isn't directly causing :\

Yup.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-19 Thread Logan Gunthorpe


On 19/04/17 09:55 AM, Jason Gunthorpe wrote:
> I was thinking only this one would be supported with a core code
> helper..

Pivoting slightly: I was looking at how HMM uses ZONE_DEVICE. They add a
type flag to the dev_pagemap structure which would be very useful to us.
We could add another MEMORY_DEVICE_P2P type to distinguish p2p pages.
Then, potentially, we could add a dma_map callback to the structure
(possibly unioned with an hmm field). The dev_ops providers would then
just need to do something like this (enclosed in a helper):

if (is_zone_device_page(page)) {
pgmap = get_dev_pagemap(page_to_pfn(page));
if (!pgmap || pgmap->type !=  MEMORY_DEVICE_P2P ||
!pgmap->dma_map)
return 0;

dma_addr = pgmap->dma_map(dev, pgmap->dev, page);
put_dev_pagemap(pgmap);
if (!dma_addr)
return 0;
...
}

The pci_enable_p2p_bar function would then just need to call
devm_memremap_pages with the dma_map callback set to a function that
does the segment check and the offset calculation.
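
As a rough sketch, such a callback might look something like this
(purely illustrative: the dma_map hook itself is only a proposal and the
function below is made up):

	static dma_addr_t pci_p2p_dma_map_page(struct device *initiator,
					       struct device *completer,
					       struct page *page)
	{
		struct pci_dev *pdev = to_pci_dev(completer);
		int bar = 2; /* whichever BAR backs the p2p memory */

		/*
		 * The segment check would go here: verify the initiator
		 * and completer share an upstream switch/host bridge and
		 * bail out (return 0) if they don't.
		 */

		/* Translate the CPU physical address to a PCI bus address */
		return page_to_phys(page) - pci_resource_start(pdev, bar) +
		       pci_bus_address(pdev, bar);
	}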

Thoughts?

@Jerome: my feedback to you would be that your patch assumes all users
of devm_memremap_pages are MEMORY_DEVICE_PERSISTENT. It would be more
useful if it was generic. My suggestion would be to have the caller
allocate the dev_pagemap structure, populate it and pass it into
devm_memremap_pages. Given that pretty much everything in that structure
is already an argument to that function, I feel like this makes sense.
This should also help to unify hmm_devmem_pages_create and
devm_memremap_pages which look very similar to each other.
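
Roughly what I'm picturing is something like this (hypothetical sketch;
the two-argument devm_memremap_pages is the proposed signature, not the
current one):

	struct p2pmem_dev {
		struct device dev;
		struct resource res;
		struct percpu_ref ref;
		struct dev_pagemap pgmap;	/* embedded, not core-allocated */
	};

	/* the caller populates only the attributes it needs... */
	p->pgmap.res = &p->res;
	p->pgmap.ref = &p->ref;
	addr = devm_memremap_pages(&p->dev, &p->pgmap);

	/* ...and later gets back to its own structure without a *data
	 * pointer: */
	struct p2pmem_dev *p2p = container_of(pgmap, struct p2pmem_dev, pgmap);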

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 04:24 PM, Jason Gunthorpe wrote:
> Try and write a stacked map_sg function like you describe and you will
> see how horrible it quickly becomes.

Yes, unfortunately, I have to agree with this statement completely.

> Since dma mapping is a performance path we must be careful not to
> create intrinsic inefficiencies with otherwise nice layering :)

Yeah, I'm also personally thinking your proposal is the way to go as
well. Dan's injected ops suggestion is interesting but I can't see how
it solves the issue completely. Your proposal is the only one that seems
to be complete to me. It just has a few minor pain points which I've
already described but are likely manageable and less than the pain
stacked dma_ops creates.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 04:50 PM, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 3:48 PM, Logan Gunthorpe <log...@deltatee.com> wrote:
>>
>>
>> On 18/04/17 04:28 PM, Dan Williams wrote:
>>> Unlike the pci bus address offset case which I think is fundamental to
>>> support since shipping archs do this today, I think it is ok to say
>>> p2p is restricted to a single sgl that gets to talk to host memory or
>>> a single device. That said, what's wrong with a p2p aware map_sg
>>> implementation calling up to the host memory map_sg implementation on
>>> a per sgl basis?
>>
>> I think Ben said they need mixed sgls and that is where this gets messy.
>> I think I'd prefer this too given trying to enforce all sgs in a list to
>> be one type or another could be quite difficult given the state of the
>> scatterlist code.
>>
>>>> Also, what happens if p2p pages end up getting passed to a device that
>>>> doesn't have the injected dma_ops?
>>>
>>> This goes back to limiting p2p to a single pci host bridge. If the p2p
>>> capability is coordinated with the bridge rather than between the
>>> individual devices then we have a central point to catch this case.
>>
>> Not really relevant. If these pages get to userspace (as people seem
>> keen on doing) or a less than careful kernel driver they could easily
>> get into the dma_map calls of devices that aren't even pci related (via
>> an O_DIRECT operation on an incorrect file or something). The common
>> code must reject these and can't rely on an injected dma op.
> 
> No, we can't do that at get_user_pages() time, it will always need to
> be up to the device driver to fail dma that it can't perform.

I'm not sure I follow -- are you agreeing with me? The dma_map_* needs
to fail for any dma it cannot perform. Which means either all dma_ops
providers need to be p2p aware or this logic has to be in dma_map_*
itself. My point being: you can't rely on an injected dma_op for some
devices to handle the fail case globally.




Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 04:28 PM, Dan Williams wrote:
> Unlike the pci bus address offset case which I think is fundamental to
> support since shipping archs do this today, I think it is ok to say
> p2p is restricted to a single sgl that gets to talk to host memory or
> a single device. That said, what's wrong with a p2p aware map_sg
> implementation calling up to the host memory map_sg implementation on
> a per sgl basis?

I think Ben said they need mixed sgls and that is where this gets messy.
I think I'd prefer this too, since trying to enforce that all sgs in a
list be one type or another could be quite difficult given the state of
the scatterlist code.

>> Also, what happens if p2p pages end up getting passed to a device that
>> doesn't have the injected dma_ops?
> 
> This goes back to limiting p2p to a single pci host bridge. If the p2p
> capability is coordinated with the bridge rather than between the
> individual devices then we have a central point to catch this case.

Not really relevant. If these pages get to userspace (as people seem
keen on doing) or a less than careful kernel driver they could easily
get into the dma_map calls of devices that aren't even pci related (via
an O_DIRECT operation on an incorrect file or something). The common
code must reject these and can't rely on an injected dma op.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 03:36 PM, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 2:22 PM, Jason Gunthorpe
>  wrote:
>> On Tue, Apr 18, 2017 at 02:11:33PM -0700, Dan Williams wrote:
 I think this opens an even bigger can of worms..
>>>
>>> No, I don't think it does. You'd only shim when the target page is
>>> backed by a device, not host memory, and you can figure this out by a
>>> is_zone_device_page()-style lookup.
>>
>> The bigger can of worms is how do you meaningfully stack dma_ops.
> 
> This goes back to my original comment to make this capability a
> function of the pci bridge itself. The kernel has an implementation of
> a dynamically created bridge device that injects its own dma_ops for
> the devices behind the bridge. See vmd_setup_dma_ops() in
> drivers/pci/host/vmd.c.

Well the issue I think Jason is pointing out is that the ops don't
stack. The map_* function in the injected dma_ops needs to be able to
call the original map_* for any page that is not p2p memory. This is
especially annoying in the map_sg function which may need to call a
different op based on the contents of the sgl. (And please correct me if
I'm not seeing how this can be done in the vmd example.)

Also, what happens if p2p pages end up getting passed to a device that
doesn't have the injected dma_ops?

However, the concept of replacing the dma_ops for all devices behind a
supporting bridge is interesting and may be a good piece of the final
solution.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 03:03 PM, Jason Gunthorpe wrote:
> What about something more incremental like this instead:
> - dma_ops will set map_sg_p2p == map_sg when they are updated to
>   support p2p, otherwise DMA on P2P pages will fail for those ops.
> - When all ops support p2p we remove the if and ops->map_sg then
>   just call map_sg_p2p
> - For now the scatterlist maintains a bit when pages are added indicating if
>   p2p memory might be present in the list.
> - Unmap for p2p and non-p2p is the same, the underlying ops driver has
>   to make it work.
> 
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 0977317c6835c2..505ed7d502053d 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -103,6 +103,9 @@ struct dma_map_ops {
>   int (*map_sg)(struct device *dev, struct scatterlist *sg,
> int nents, enum dma_data_direction dir,
> unsigned long attrs);
> + int (*map_sg_p2p)(struct device *dev, struct scatterlist *sg,
> +   int nents, enum dma_data_direction dir,
> +   unsigned long attrs);
>   void (*unmap_sg)(struct device *dev,
>struct scatterlist *sg, int nents,
>enum dma_data_direction dir,
> @@ -244,7 +247,15 @@ static inline int dma_map_sg_attrs(struct device *dev, 
> struct scatterlist *sg,
>   for_each_sg(sg, s, nents, i)
>   kmemcheck_mark_initialized(sg_virt(s), s->length);
>   BUG_ON(!valid_dma_direction(dir));
> - ents = ops->map_sg(dev, sg, nents, dir, attrs);
> +
> + if (sg_has_p2p(sg)) {
> + if (ops->map_sg_p2p)
> + ents = ops->map_sg_p2p(dev, sg, nents, dir, attrs);
> + else
> + return 0;
> + } else
> + ents = ops->map_sg(dev, sg, nents, dir, attrs);
> +
>   BUG_ON(ents < 0);
>   debug_dma_map_sg(dev, sg, nents, ents, dir);

I could get behind this. Though a couple of points:

1) It means that sg_has_p2p has to walk the entire sg and check every
page. Then map_sg_p2p/map_sg has to walk it again and repeat the check
then do some operation per page. If anyone is concerned about the
dma_map performance this could be an issue.

2) Without knowing exactly what the arch specific code may need to do
it's hard to say that this is exactly the right approach. If every
dma_ops provider has to do exactly this on every page it may lead to a
lot of duplicate code:

foreach_sg page:
   if (pci_page_is_p2p(page)) {
dma_addr = pci_p2p_map_page(page)
if (!dma_addr)
return 0;
continue
   }
   ...

The only thing I'm presently aware of is the segment check and applying
the offset to the physical address -- neither of which has much to do
with the specific dma_ops providers. It _may_ be that this needs to be
bus specific and not arch specific which I think is what Dan may be
getting at. So it may make sense to just have a pci_map_sg_p2p() which
takes a dma_ops struct it would use for any page that isn't a p2p page.
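
Something along these lines (purely illustrative -- pci_page_is_p2p and
pci_p2p_map_page below are made-up names, and this obviously gives up
any entry merging the provider's map_sg might have done):

	static int pci_map_sg_p2p(struct device *dev,
				  const struct dma_map_ops *ops,
				  struct scatterlist *sgl, int nents,
				  enum dma_data_direction dir,
				  unsigned long attrs)
	{
		struct scatterlist *sg;
		dma_addr_t addr;
		int i;

		for_each_sg(sgl, sg, nents, i) {
			if (pci_page_is_p2p(sg_page(sg)))
				/* segment check plus bus offset */
				addr = pci_p2p_map_page(dev, sg_page(sg),
							sg->offset, sg->length);
			else
				/* regular memory: punt to the provider */
				addr = ops->map_page(dev, sg_page(sg),
						     sg->offset, sg->length,
						     dir, attrs);

			if (dma_mapping_error(dev, addr))
				return 0;

			sg->dma_address = addr;
			sg_dma_len(sg) = sg->length;
		}

		return nents;
	}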

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 02:31 PM, Dan Williams wrote:
> On Tue, Apr 18, 2017 at 1:29 PM, Jerome Glisse <jgli...@redhat.com> wrote:
>>> On Tue, Apr 18, 2017 at 12:35 PM, Logan Gunthorpe <log...@deltatee.com>
>>> wrote:
>>>>
>>>>
>>>> On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
>>>>> Ultimately every dma_ops will need special code to support P2P with
>>>>> the special hardware that ops is controlling, so it makes some sense
>>>>> to start by pushing the check down there in the first place. This
>>>>> advice is partially motivated by how dma_map_sg is just a small
>>>>> wrapper around the function pointer call...
>>>>
>>>> Yes, I noticed this problem too and that makes sense. It just means
>>>> every dma_ops will probably need to be modified to either support p2p
>>>> pages or fail on them. Though, the only real difficulty there is that it
>>>> will be a lot of work.
>>>
>>> I don't think you need to go touch all dma_ops, I think you can just
>>> arrange for devices that are going to do dma to get redirected to a
>>> p2p aware provider of operations that overrides the system default
>>> dma_ops. I.e. just touch get_dma_ops().
>>
>> This would not work well for everyone, for instance on GPU we usualy
>> have buffer object with a mix of device memory and regular system
>> memory but call dma sg map once for the list.
>>
> 
> ...and that dma_map goes through get_dma_ops(), so I don't see the conflict?

The main conflict is in dma_map_sg which only does get_dma_ops once but
the sg may contain memory of different types.

Logan



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 01:48 PM, Jason Gunthorpe wrote:
> I think this is why progress on this keeps getting stuck - every
> solution is a lot of work.

Yup! There's also a ton of work just to get the iomem safety issues
addressed. Let alone the dma mapping issues.

> You could try to do a dummy mapping / create a MR early on to detect
> this.

Ok, that could be a workable solution.

> FWIW, I wonder if from a RDMA perspective we have another
> problem.. Should we allow P2P memory to be used with the local DMA
> lkey? There are potential designs around virtualization that would not
> allow that. Should we mandate that P2P memory be in its own MR?

I can't say I understand these issues...

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 01:01 PM, Jason Gunthorpe wrote:
> Ultimately every dma_ops will need special code to support P2P with
> the special hardware that ops is controlling, so it makes some sense
> to start by pushing the check down there in the first place. This
> advice is partially motivated by how dma_map_sg is just a small
> wrapper around the function pointer call...

Yes, I noticed this problem too and that makes sense. It just means
every dma_ops will probably need to be modified to either support p2p
pages or fail on them. Though, the only real difficulty there is that it
will be a lot of work.

> Where p2p_same_segment_map_page checks if the two devices are on the
> 'same switch' and if so returns the address translated to match the
> bus address programmed into the BAR or fails. We knows this case is
> required to work by the PCI spec, so it makes sense to use it as the
> first canned helper.

I've also suggested that this check should probably be done (or perhaps
duplicated) before we even get to the map stage. In the case of
nvme-fabrics we'd probably want to let the user know when they try to
configure it or at least fall back to allocating regular memory instead.
It would be a difficult situation to have already copied a block of data
from a NIC to p2p memory only to have it be deemed unmappable on the
NVMe device it's destined for. (Or vice-versa.) This was another issue
p2pmem was attempting to solve.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 10:45 AM, Jason Gunthorpe wrote:
> From Ben's comments, I would think that the 'first class' support that
> is needed here is simply a function to return the 'struct device'
> backing a CPU address range.

Yes, and Dan's get_dev_pagemap suggestion gets us 90% of the way there.
It's just a disagreement as to what struct device is inside the pagemap.
Care needs to be taken to ensure that struct device doesn't conflict
with hmm and doesn't limit other potential future users of ZONE_DEVICE.

> If there is going to be more core support for this stuff I think it
> will be under the topic of more robustly describing the fabric to the
> core and core helpers to extract data from the description: eg compute
> the path, check if the path crosses translation, etc

Agreed, those helpers would be useful to everyone.

> I think the key agreement to get out of Logan's series is that P2P DMA
> means:
>  - The BAR will be backed by struct pages
>  - Passing the CPU __iomem address of the BAR to the DMA API is
>valid and, long term, dma ops providers are expected to fail
>or return the right DMA address

Well, yes but we have a _lot_ of work to do to make it safe to pass
around struct pages backed with __iomem. That's where our next focus
will be. I've already taken very initial steps toward this with my
scatterlist map patchset.

>  - Mapping BAR memory into userspace and back to the kernel via
>get_user_pages works transparently, and with the DMA API above

Again, we've had a lot of push back for the memory to go to userspace at
all. It does work, but people expect userspace to screw it up in a lot
of ways. Among the people pushing back on that: Christoph Hellwig has
specifically said he wants to see this stay with in-kernel users only
until the apis can be worked out. This is one of the reasons we decided
to go with enabling nvme-fabrics as everything remains in the kernel.
And with that decision we needed a common in-kernel allocation
infrastructure: this is what p2pmem really is at this point.

>  - The dma ops provider must be able to tell if source memory is bar
>mapped and recover the pci device backing the mapping.

Do you mean to say that every dma-ops provider needs to be taught about
p2p backed pages? I was hoping we could have dma_map_* just use special
p2p dma-ops if it was passed p2p pages (though there are some
complications to this too).

> At least this is what we'd like in RDMA :)
> 
> FWIW, RDMA probably wouldn't want to use a p2mem device either, we
> already have APIs that map BAR memory to user space, and would like to
> keep using them. A 'enable P2P for bar' helper function sounds better
> to me.

Well, in the end that will likely come down to just devm_memremap_pages
with some (presently undecided) struct device that can be used to get
special p2p dma-ops for the bus.

Logan


Re: [PATCH 16/22] xen-blkfront: Make use of the new sg_map helper function

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 09:50 AM, Konrad Rzeszutek Wilk wrote:
> I am not sure if you know, but you can add on each patch the respective
> maintainer via 'CC'. That way you can have certain maintainers CCed only
> on the subsystems they cover. You put it after (or before) your SoB and
> git send-email happilly picks it up.

Yes, but I've seen some maintainers complain when they receive a patch
with no context (i.e. without the cover letter and first patch). So I
chose to do it
this way. I expect in this situation, no matter what you do, someone is
going to complain about the approach chosen.

Thanks anyway for the tip.

Logan


Re: [PATCH 05/22] drm/i915: Make use of the new sg_map helper function

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 12:44 AM, Daniel Vetter wrote:
> On Thu, Apr 13, 2017 at 04:05:18PM -0600, Logan Gunthorpe wrote:
>> This is a single straightforward conversion from kmap to sg_map.
>>
>> Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
> 
> Acked-by: Daniel Vetter <daniel.vet...@ffwll.ch>
> 
> Probably makes sense to merge through some other tree, but please be aware
> of the considerable churn rate in i915 (i.e. make sure your tree is in
> linux-next before you send a pull request for this). Plane B would be to
> get the prep patch in first and then merge the i915 conversion one kernel
> release later.

Yes, as per what I said in my cover letter, I was leaning towards a
"Plan B" style approach.

Logan


Re: [PATCH 16/22] xen-blkfront: Make use of the new sg_map helper function

2017-04-18 Thread Logan Gunthorpe


On 18/04/17 08:27 AM, Konrad Rzeszutek Wilk wrote:
> Interesting that you didn't CC any of the maintainers. Could you 
> do that in the future please?

Please read the cover letter. The distribution list for the patchset
would have been way too large to cc every maintainer (even as limited as
it was, I had mailing lists yelling at me). My plan was to get buy in
for the first patch, get it merged and resend the rest independently to
their respective maintainers. Of course, though, I'd be open to other
suggestions.

>>>
>>> Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
>>> ---
>>>  drivers/block/xen-blkfront.c | 33 +++--
>>>  1 file changed, 27 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
>>> index 5067a0a..7dcf41d 100644
>>> --- a/drivers/block/xen-blkfront.c
>>> +++ b/drivers/block/xen-blkfront.c
>>> @@ -807,8 +807,19 @@ static int blkif_queue_rw_req(struct request *req, 
>>> struct blkfront_ring_info *ri
>>> BUG_ON(sg->offset + sg->length > PAGE_SIZE);
>>>
>>> if (setup.need_copy) {
>>> -   setup.bvec_off = sg->offset;
>>> -   setup.bvec_data = kmap_atomic(sg_page(sg));
>>> +   setup.bvec_off = 0;
>>> +   setup.bvec_data = sg_map(sg, SG_KMAP_ATOMIC);
>>> +   if (IS_ERR(setup.bvec_data)) {
>>> +   /*
>>> +* This should really never happen unless
>>> +* the code is changed to use memory that is
>>> +* not mappable in the sg. Seeing there is a
>>> +* questionable error path out of here,
>>> +* we WARN.
>>> +*/
>>> +   WARN(1, "Non-mappable memory used in sg!");
>>> +   return 1;
>>> +   }
>> ...
>>
>> Perhaps add a flag to mark failure as 'unexpected' and trace (and panic?)
>> inside sg_map().

Thanks, that's a good suggestion. I'll make the change for v2.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-18 Thread Logan Gunthorpe


On 17/04/17 12:04 PM, Jerome Glisse wrote:
> I disagree here. I would rather see Peer-to-Peer mapping as a form
> of helper so that device driver can opt-in for multiple mecanisms
> concurrently. Like HMM and p2p.

I'm not against moving some of the common stuff into a library. It
sounds like the problems p2pmem solves don't overlap much with the
problems of the GPU and moving the stuff we have in common somewhere
else seems sensible.

> Also it seems you are having a static vision for p2p. For GPU and
> network the use case is you move some buffer into the device memory
> and then you create mapping for some network adapter while the buffer
> is in device memory. But this is only temporary and buffer might
> move to different device memory. So usecase is highly dynamic (well
> mapping lifetime is still probably few second/minutes).

I feel like you will need to pin the memory while it's the target of a
DMA transaction. If some network peer is sending you data and you just
invalidated the memory it is headed to then you are just going to break
applications. But really this isn't our concern: the memory we are using
with this work will be static and not prone to disappearing.

> I see no reason for exposing sysfs tree to userspace for all this.
> This isn't too dynamic, either 2 devices can access each others
> memory, either they can't. This can be hidden through the device
> kernel API. Again for GPU the idea is that it is always do-able
> in the sense that when it is not you fallback to using system
> memory.

The user has to make a decision to use it or not. This is an
optimization with significant trade-offs that may differ significantly
based on system design.

> Yes this could conflict and that's why i would rather see this as a set
> of helper like HMM is doing. So device driver can opt-in HMM and p2pmem
> at the same time.

I don't understand how that addresses the conflict. We need to each be
using unique and identifiable struct devices in the ZONE_DEVICE
dev_pagemap so we don't apply p2p dma mappings to hmm memory and vice-versa.

> This seems to duplicate things that already exist in each individual
> driver. If a device has memory than device driver already have some
> form of memory management and most likely expose some API to userspace
> to allow program to use that memory.

> Peer to peer DMA mapping is orthogonal to memory management, it is
> an optimization ie you have some buffer allocated through some device
> driver specific IOCTL and now you want some other device to directly
> access it. Having to first do another allocation in a different device
> driver for that to happen seems overkill.

The devices we are working with are adding memory specifically for
enabling p2p applications. The memory is new and there are no allocators
for any of it yet.

Also note: we've gotten _significant_ push back against exposing any of
this memory to userspace. Letting the user unknowingly have to deal with
the issues of iomem is not anything anyone wants to see. Thus we are
dealing with in-kernel users only and they need a common interface to
get the memory from.

> What you really want from device driver point of view is an helper to
> first tell you if 2 device can access each other and second an helper
> that allow the second device to import the other device memory to allow
> direct access.

Well, actually it's a bit more complicated than that but essentially
correct: There can be N devices in the mix and quite likely another
driver completely separate from all N devices. (eg. for our main use
case we have N nvme cards being talked to through an RDMA NIC with it
all being coordinated by the nvme-target driver).


> Discovering possible peer is a onetime only thing and designing around
> that is wrong in my view. There is already existing hierarchy in kernel
> for that in the form of the bus hierarchy (i am thinking pci bus here).
> So there is already existing way to discover this and you are just
> duplicating informations here.

I really don't see the solution you are proposing here. Have the user
specify a pci device name and just have them guess which ones have
suitable memory? Or do they have to walk the entire pci tree to find
ones that have such memory? There was no "duplicate" information created
by our patch set.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-17 Thread Logan Gunthorpe


On 17/04/17 03:11 PM, Benjamin Herrenschmidt wrote:
> Is it ? Again, you create a "concept" the user may have no idea about,
> "p2pmem memory". So now any kind of memory buffer on a device can could
> be use for p2p but also potentially a bunch of other things becomes
> special and called "p2pmem" ...

The user is going to have to have an idea about it if they are designing
systems to make use of it. I've said it before many times: this is an
optimization with significant trade-offs so the user does have to make
decisions regarding when to enable it.


> But what do you have in p2pmem that somebody benefits from. Again I
> don't understand what that "p2pmem" device buys you in term of
> functionality vs. having the device just instanciate the pages.

Well thanks for just taking a big shit on all of our work without even
reading the patches. Bravo.

> Now having some kind of way to override the dma_ops, yes I do get that,
> and it could be that this "p2pmem" is typically the way to do it, but
> at the moment you don't even have that. So I'm a bit at a loss here.

Yes, we've already said many times that this is something we will need
to add.

> But it doesn't *have* to be. Again, take my GPU example. The fact that
> a NIC might be able to DMA into it doesn't make it specifically "p2p
> memory".

Just because you use it for other things doesn't mean it can't also
provide the service of a "p2pmem" device.

> So now your "p2pmem" device needs to also be laid out on top of those
> MMIO registers ? It's becoming weird.

Yes, Max Gurtovoy has also expressed an interest in expanding this work
to cover things other than memory. He's suggested simply calling it a
p2p device, but until we figure out what exactly that all means we can't
really finalize a name.

> See, basically, doing peer 2 peer between devices has 3 main challenges
> today: The DMA API needing struct pages, the MMIO translation issues
> and the IOMMU translation issues.
> 
> You seem to create that added device as some kind of "owner" for the
> struct pages, solving #1, but leave #2 and #3 alone.

Well, there are other challenges too: figuring out when it's
appropriate to use it, tying together the device that provides the memory
with the driver trying to use it in DMA transactions, and so on. Our patch
set tackles these latter issues.

> If we go down that path, though, rather than calling it p2pmem I would
> call it something like dma_target which I find much clearer especially
> since it doesn't have to be just memory.

I'm not set on the name. My arguments have been specifically for the
existence of an independent struct device. But I'm not really interested
in getting into bike shedding arguments over what to call it at this
time when we don't even really know what it's going to end up doing in
the end.

> The memory allocation should be a completely orthogonal and separate
> thing yes. You are conflating two completely different things now into
> a single concept.

Well we need a uniform way for a driver trying to coordinate a p2p dma
to find and obtain memory from devices that supply it. We are not
dealing with GPUs that already have complicated allocators. We are
dealing with people adding memory to their devices for the _sole_
purpose of enabling p2p transfers. So having a common allocation setup
is seen as a benefit to us.

Logan




Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-17 Thread Logan Gunthorpe


On 17/04/17 11:04 AM, Dan Williams wrote:
>> Yes, in this scheme, it needs an additional p2pmem child. Why is that an
>> issue? It certainly makes it a lot easier for the user to understand the
>> p2pmem memory in the system (through the sysfs tree) and reason about
>> the topology and when to use it. This is important.
> 
> I think you want to go the other way in the hierarchy and find a
> shared *parent* to land the p2pmem capability. Because that same agent
> is going to be responsible handling address translation for the peers.
>
> Peer-dma is always going to be a property of the bus and not the end
> devices. Requiring each bus implementation to explicitly enable
> peer-to-peer support is a feature not a bug.
> 
> We shouldn't design for some future possible use case. Solve it for
> pci and when / if another bus comes along then look at a more generic
> abstraction.

Thanks Dan, these are some good points. Wedding it closer to the PCI
code makes more sense to me now. I'd still think you'd want some struct
device though to appear in the device hierarchy and allow reasoning
about topology.

Logan



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-17 Thread Logan Gunthorpe


On 17/04/17 01:20 AM, Benjamin Herrenschmidt wrote:
> But is it ? For example take a GPU, does it, in your scheme, need an
> additional "p2pmem" child ? Why can't the GPU driver just use some
> helper to instantiate the necessary struct pages ? What does having an
> actual "struct device" child buys you ?

Yes, in this scheme, it needs an additional p2pmem child. Why is that an
issue? It certainly makes it a lot easier for the user to understand the
p2pmem memory in the system (through the sysfs tree) and reason about
the topology and when to use it. This is important.

> 
>> 2) In order to create the struct pages we use the ZONE_DEVICE
>> infrastructure which requires a struct device. (See
>> devm_memremap_pages.)
> 
> Yup, but you already have one in the actual pci_dev ... What is the
> benefit of adding a second one ?

But that would tie all of this very tightly to pci only and may get
hard to differentiate if more users of ZONE_DEVICE crop up who happen to
be using a pci device. Having a specific class for this makes it very
clear how this memory would be handled. For example, although I haven't
looked into it, this could very well be a point of conflict with HMM. If
they were to use the pci device to populate the dev_pagemap then we
couldn't also use the pci device. I feel it's much better for users of
dev_pagemap to have their own struct devices to avoid such conflicts.

> 
>>  This amazingly gets us the get_dev_pagemap
>> architecture which also uses a struct device. So by using a p2pmem
>> device we can go from struct page to struct device to p2pmem device
>> quickly and effortlessly.
> 
> Which isn't terribly useful in itself right ? What you care about is
> the "enclosing" pci_dev no ? Or am I missing something ?

Sure it is. What if we want to someday support p2pmem that's on another bus?

>> 3) You wouldn't want to use the pci's struct device because it doesn't
>> really describe what's going on. For example, there may be multiple
>> devices on the pci device in question: eg. an NVME card and some p2pmem.
> 
> What is "some p2pmem" ?
>> Or it could be a NIC with some p2pmem.
> 
> Again what is "some p2pmem" ?

Some device local memory intended for use as a DMA target from a
neighbour device or itself. On a PCI device, this would be a BAR, or a
portion of a BAR with memory behind it.

Keep in mind device classes tend to carve out common use cases and don't
have a one to one mapping with a physical pci card.

> That a device might have some memory-like buffer space is all well and
> good but does it need to be specifically distinguished at the device
> level ? It could be inherent to what the device is... for example again
> take the GPU example, why would you call the FB memory "p2pmem" ? 

Well if you are using it for p2p transactions why wouldn't you call it
p2pmem? There's no technical downside here except some vague argument
over naming. Once registered as p2pmem, that device will handle all the
dma map stuff for you and have a central obvious place to put code which
helps decide whether to use it or not based on topology.

I can certainly see an issue you'd have with the current RFC in that the
p2pmem device currently also handles memory allocation which a GPU would
want to do itself. There are plenty of solutions to this though: we
could provide hooks for the parent device to override allocation or
something like that. However, the use cases I'm concerned with don't do
their own allocation so that is an important feature for them.

> Again I'm not sure why it needs to "instanciate a p2pmem" device. Maybe
> it's the term "p2pmem" that offputs me. If p2pmem allowed to have a
> standard way to lookup the various offsets etc... I mentioned earlier,
> then yes, it would make sense to have it as a staging point. As-is, I
> don't know. 

Well of course, at some point it would have a standard way to lookup
offsets and figure out what's necessary for a mapping. We wouldn't make
that separate from this, that would make no sense.

I also forgot:

4) We need some way in the kernel to configure drivers that use p2pmem.
That means it needs a unique name that the user can understand, look up
and pass to other drivers. Then a way for those drivers to find it in
the system. A specific device class gets that for us in a very simple
fashion. We also don't want to have drivers like nvmet having to walk
every pci device to figure out where the p2p memory is and whether it
can use it.
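
For example, with a device class the lookup a consumer like nvmet would
do could be as simple as something along these lines (hand-wavy sketch;
the class pointer and helper names are illustrative, not code from the
RFC):

#include <linux/device.h>
#include <linux/string.h>

static int p2pmem_match_name(struct device *dev, const void *data)
{
        return sysfs_streq(dev_name(dev), data);
}

/* find the p2pmem device the user named in the target's configuration */
static struct device *p2pmem_find_by_name(struct class *p2pmem_class,
                                          const char *name)
{
        return class_find_device(p2pmem_class, NULL, name,
                                 p2pmem_match_name);
}

The point being that the name in sysfs and the handle the driver gets
back both fall out of the device class for free.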

IMO there are many clear benefits here and you haven't really offered an
alternative that provides the same features and potential for future use
cases.

Logan



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-16 Thread Logan Gunthorpe


On 16/04/17 04:32 PM, Benjamin Herrenschmidt wrote:
>> I'll consider this. Given the fact I can use your existing
>> get_dev_pagemap infrastructure to look up the p2pmem device this
>> probably isn't as hard as I thought it would be anyway (we probably
>> don't even need a page flag). We'd just have lookup the dev_pagemap,
>> test if it's a p2pmem device, and if so, call a p2pmem_dma_map function
>> which could apply the offset or do any other arch specific logic (if
>> necessary).
> 
> I'm still not 100% why do you need a "p2mem device" mind you ...

Well, you don't "need" it but it is a design choice that I think makes a
lot of sense for the following reasons:

1) p2pmem is in fact a device on the pci bus. A pci driver will need to
set it up and create the device and thus it will have a natural parent
pci device. Instantiating a struct device for it means it will appear in
the device hierarchy and one can use that to reason about its position
in the topology.

2) In order to create the struct pages we use the ZONE_DEVICE
infrastructure which requires a struct device. (See
devm_memremap_pages.) This amazingly gets us the get_dev_pagemap
architecture which also uses a struct device. So by using a p2pmem
device we can go from struct page to struct device to p2pmem device
quickly and effortlessly.

3) You wouldn't want to use the pci's struct device because it doesn't
really describe what's going on. For example, there may be multiple
devices on the pci device in question: eg. an NVME card and some p2pmem.
Or it could be a NIC with some p2pmem. Or it could just be p2pmem by
itself. And the logic to figure out what memory is available and where
the address is will be non-standard so it's really straightforward to
have any pci driver just instantiate a p2pmem device.

It is probably worth you reading the RFC patches at this point to get a
better feel for this.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-16 Thread Logan Gunthorpe


On 16/04/17 09:44 AM, Dan Williams wrote:
> I think we very much want the dma mapping layer to be in the way.
> It's the only sane semantic we have to communicate this translation.

Yes, I wasn't proposing bypassing that layer, per say. I just meant that
the layer would, in the end, have to return the address without any
translations.

> The difference is that there was nothing fundamental in the core
> design of pmem + DAX that prevented other archs from growing pmem
> support. THP and memory hotplug existed on other architectures and
> they just need to plug in their arch-specific enabling. p2p support
> needs the same starting point of something more than one architecture
> can plug into, and handling the bus address offset case needs to be
> incorporated into the design.

I don't think there's a difference there either. There'd have been
nothing fundamental in our core design that says offsets couldn't have
been added later.

> pmem + dax did not change the meaning of what a dma_addr_t is, p2p does.

I don't think p2p actually really changes the meaning of dma_addr_t
either. We are just putting addresses in there that weren't used
previously. Our RFC makes no changes to anything even remotely related
to dma_addr_t.

> I think you need to give other archs a chance to support this with a
> design that considers the offset case as a first class citizen rather
> than an afterthought.

I'll consider this. Given the fact I can use your existing
get_dev_pagemap infrastructure to look up the p2pmem device this
probably isn't as hard as I thought it would be anyway (we probably
don't even need a page flag). We'd just have to look up the dev_pagemap,
test if it's a p2pmem device, and if so, call a p2pmem_dma_map function
which could apply the offset or do any other arch specific logic (if
necessary).
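
Roughly, I'm picturing something along these lines (only a sketch:
is_p2pmem_pagemap() and p2pmem_dma_map() are hypothetical names, not
anything in the posted RFC):

#include <linux/memremap.h>
#include <linux/dma-mapping.h>

static dma_addr_t map_page_maybe_p2p(struct device *dev, struct page *page,
                                     unsigned long offset, size_t size,
                                     enum dma_data_direction dir)
{
        struct dev_pagemap *pgmap;
        bool is_p2p = false;

        /* ZONE_DEVICE pages let us find the owning pagemap from the pfn */
        pgmap = get_dev_pagemap(page_to_pfn(page), NULL);
        if (pgmap) {
                is_p2p = is_p2pmem_pagemap(pgmap);      /* hypothetical */
                put_dev_pagemap(pgmap);
        }

        if (is_p2p)
                /* apply the bus offset or any arch specific logic here */
                return p2pmem_dma_map(dev, page, offset, size, dir);

        return dma_map_page(dev, page, offset, size, dir);
}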

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-16 Thread Logan Gunthorpe


On 16/04/17 09:53 AM, Dan Williams wrote:
> ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> context about the physical address in question. I'm thinking you can
> hang bus address translation data off of that structure. This seems
> vaguely similar to what HMM is doing.

Thanks! I didn't realize you had the infrastructure to look up a device
from a pfn/page. That would really come in handy for us.

Logan



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-15 Thread Logan Gunthorpe


On 15/04/17 04:17 PM, Benjamin Herrenschmidt wrote:
> You can't. If the iommu is on, everything is remapped. Or do you mean
> to have dma_map_* not do a remapping ?

Well, yes, you'd have to change the code so that iomem pages do not get
remapped and the raw BAR address is passed to the DMA engine. I said
specifically we haven't done this at this time but it really doesn't
seem like an unsolvable problem. It is something we will need to address
before a proper patch set is posted though.

> That's the problem again, same as before, for that to work, the
> dma_map_* ops would have to do something special that depends on *both*
> the source and target device.

No, I don't think you have to do things different based on the source.
Have the p2pmem device layer restrict allocating p2pmem based on the
devices in use (similar to how the RFC code works now) and when the dma
mapping code sees iomem pages it just needs to leave the address alone
so it's used directly by the dma in question.

It's much better to make the decision on which memory to use when you
allocate it. If you wait until you map it, it would be a pain to fall
back to system memory if it doesn't look like it will work. So if, when
you allocate it, you know everything will work, you just need the dma
mapping layer to stay out of the way.

> The dma_ops today are architecture specific and have no way to
> differenciate between normal and those special P2P DMA pages.

Correct, unless Dan's idea works (which will need some investigation),
we'd need a flag in struct page or some other similar method to
determine that these are special iomem pages.

>> Though if it does, I'd expect
>> everything would still work you just wouldn't get the performance or
>> traffic flow you are looking for. We've been testing with the software
>> iommu which doesn't have this problem.
> 
> So first, no, it's more than "you wouldn't get the performance". On
> some systems it may also just not work. Also what do you mean by "the
> SW iommu doesn't have this problem" ? It catches the fact that
> addresses don't point to RAM and maps differently ?

I haven't tested it but I can't imagine why an iommu would not correctly
map the memory in the bar. But that's _way_ beside the point. We
_really_ want to avoid that situation anyway. If the iommu maps the
memory it defeats what we are trying to accomplish.

I believe the software iommu only uses bounce buffers if the DMA engine
in use cannot address the memory. So in most cases, with modern
hardware, it just passes the BAR's address to the DMA engine and
everything works. The code posted in the RFC does in fact work without
needing to do any of this fussing.

>>> The problem is that the latter while seemingly easier, is also slower
>>> and not supported by all platforms and architectures (for example,
>>> POWER currently won't allow it, or rather only allows a store-only
>>> subset of it under special circumstances).
>>
>> Yes, I think situations where we have to cross host bridges will remain
>> unsupported by this work for a long time. There are two many cases where
>> it just doesn't work or it performs too poorly to be useful.
> 
> And the situation where you don't cross bridges is the one where you
> need to also take into account the offsets.

I think for the first incarnation we will just not support systems that
have offsets. This makes things much easier and still supports all the
use cases we are interested in.

> So you are designing something that is built from scratch to only work
> on a specific limited category of systems and is also incompatible with
> virtualization.

Yes, we are starting with support for specific use cases. Almost all
technology starts that way. Dax has been in the kernel for years and
only recently has someone submitted patches for it to support pmem on
powerpc. This is not unusual. If you had forced the pmem developers to
support all architectures in existence before allowing them upstream
they couldn't possibly be as far as they are today.

Virtualization specifically would be a _lot_ more difficult than simply
supporting offsets. The actual topology of the bus will probably be lost
on the guest OS and it would therefore have a difficult time figuring out
when it's acceptable to use p2pmem. I also have a difficult time seeing
a use case for it, and thus I have a hard time with the argument that we
can't support the use cases that do want it just because the use cases
that don't want it (perhaps yet) won't work.

> This is an interesting experiement to look at I suppose, but if you
> ever want this upstream I would like at least for you to develop a
> strategy to support the wider case, if not an actual implementation.

I think there are plenty of avenues forward to support offsets, etc.
It's just work. Nothing we'd be proposing would be incompatible with it.
We just don't want to have to do it all upfront, especially when no one
really knows how well various architectures' hardware supports this or
if 

Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-15 Thread Logan Gunthorpe


On 15/04/17 09:01 PM, Benjamin Herrenschmidt wrote:
> Are ZONE_DEVICE pages identifiable based on the struct page alone ? (a
> flag ?)

Well, you can't use ZONE_DEVICE as an indicator. They may be regular RAM
(eg. pmem). It would need a separate flag indicating it is backed by iomem.

Logan


Re: [PATCH 09/22] dm-crypt: Make use of the new sg_map helper in 4 call sites

2017-04-15 Thread Logan Gunthorpe
Thanks for the information Milan.

On 15/04/17 06:10 AM, Milan Broz wrote:
> I think your patch is ok (if it is just plain conversion), if it is
> really needed, we can switch to ahash later in follow-up patch.

Sounds good to me.

> p.s.
> there is a lot of lists on cc, but for this patch is missing dm-devel, 
> dmcrypt changes
> need to go through Mike's tree (I added dm-devel to cc:)

Oh, sorry, I thought I had included all the lists. My hope however would
be to get the first patch merged and then re-send the remaining patches
to their respective maintainers. So that would have happened later. It's
hard to manage patches otherwise with such large distribution lists.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-15 Thread Logan Gunthorpe
Thanks, Benjamin, for the summary of some of the issues.

On 14/04/17 04:07 PM, Benjamin Herrenschmidt wrote:
> So I assume the p2p code provides a way to address that too via special
> dma_ops ? Or wrappers ?

Not at this time. We will probably need a way to ensure the iommus do
not attempt to remap these addresses. Though if one does, I'd expect
everything would still work; you just wouldn't get the performance or
traffic flow you are looking for. We've been testing with the software
iommu, which doesn't have this problem.

> The problem is that the latter while seemingly easier, is also slower
> and not supported by all platforms and architectures (for example,
> POWER currently won't allow it, or rather only allows a store-only
> subset of it under special circumstances).

Yes, I think situations where we have to cross host bridges will remain
unsupported by this work for a long time. There are too many cases where
it just doesn't work or it performs too poorly to be useful.

> I don't fully understand how p2pmem "solves" that by creating struct
> pages. The offset problem is one issue. But there's the iommu issue as
> well, the driver cannot just use the normal dma_map ops.

We are not using a proper iommu and we are dealing with systems that
have zero offset. This case is also easily supported. I expect fixing
the iommus to not map these addresses would also be reasonably achievable.

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-14 Thread Logan Gunthorpe


On 14/04/17 05:37 AM, Benjamin Herrenschmidt wrote:
> I object to designing a subsystem that by design cannot work on whole
> categories of architectures out there.

Hardly. That's extreme. We'd design a subsystem that works for the easy
cases and needs more work to support the offset cases. It would not be
designed in such a way that it could _never_ support those
architectures. It would simply be such that it only permits use by the
cases that are known to work. Then those cases could be expanded as time
goes on and people work on adding more support.

There's tons of stuff that needs to be done to get this upstream. I'd
rather not require it to work for every possible architecture from the
start. The testing alone would be impossible. Many subsystems start by
working for x86 first and then adding support in other architectures
later. (Often with that work done by the people who care about those
systems and actually have the hardware to test with.)

Logan


Re: [PATCH 10/22] staging: unisys: visorbus: Make use of the new sg_map helper function

2017-04-14 Thread Logan Gunthorpe
Great, thanks!

Logan

On 14/04/17 10:07 AM, Kershner, David A wrote:
> Can you add Acked-by for this patch? 
> 
> Acked-by: David Kershner 
> 
> Tested on s-Par and no problems. 
> 
> Thanks,
> David Kershner
> 
>> ---
>>  drivers/staging/unisys/visorhba/visorhba_main.c | 12 +++-
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/staging/unisys/visorhba/visorhba_main.c
>> b/drivers/staging/unisys/visorhba/visorhba_main.c
>> index 0ce92c8..2d8c8bc 100644
>> --- a/drivers/staging/unisys/visorhba/visorhba_main.c
>> +++ b/drivers/staging/unisys/visorhba/visorhba_main.c
>> @@ -842,7 +842,6 @@ do_scsi_nolinuxstat(struct uiscmdrsp *cmdrsp, struct
>> scsi_cmnd *scsicmd)
>>  struct scatterlist *sg;
>>  unsigned int i;
>>  char *this_page;
>> -char *this_page_orig;
>>  int bufind = 0;
>>  struct visordisk_info *vdisk;
>>  struct visorhba_devdata *devdata;
>> @@ -869,11 +868,14 @@ do_scsi_nolinuxstat(struct uiscmdrsp *cmdrsp,
>> struct scsi_cmnd *scsicmd)
>>
>>  sg = scsi_sglist(scsicmd);
>>  for (i = 0; i < scsi_sg_count(scsicmd); i++) {
>> -this_page_orig = kmap_atomic(sg_page(sg + i));
>> -this_page = (void *)((unsigned long)this_page_orig |
>> - sg[i].offset);
>> +this_page = sg_map(sg + i, SG_KMAP_ATOMIC);
>> +if (IS_ERR(this_page)) {
>> +scsicmd->result = DID_ERROR << 16;
>> +return;
>> +}
>> +
>>  memcpy(this_page, buf + bufind, sg[i].length);
>> -kunmap_atomic(this_page_orig);
>> +sg_unmap(sg + i, this_page, SG_KMAP_ATOMIC);
>>  }
>>  } else {
>>  devdata = (struct visorhba_devdata *)scsidev->host-
>>> hostdata;
>> --
>> 2.1.4
> 


Re: [PATCH 04/22] target: Make use of the new sg_map function at 16 call sites (fwd)

2017-04-14 Thread Logan Gunthorpe
Thanks Julia. I missed that and I'll fix it in my series.

Logan

On 14/04/17 09:19 AM, Julia Lawall wrote:
> It looks like >cmdr_lock should be released at line 512 if it has
> not been released otherwise.  The lock was taken at line 438.
> 
> julia
> 
> -- Forwarded message --
> Date: Fri, 14 Apr 2017 22:21:44 +0800
> From: kbuild test robot <fengguang...@intel.com>
> To: kbu...@01.org
> Cc: Julia Lawall <julia.law...@lip6.fr>
> Subject: Re: [PATCH 04/22] target: Make use of the new sg_map function at 16
> call sites
> 
> Hi Logan,
> 
> [auto build test WARNING on scsi/for-next]
> [also build test WARNING on v4.11-rc6]
> [cannot apply to next-20170413]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
> 
> url:
> https://github.com/0day-ci/linux/commits/Logan-Gunthorpe/Introduce-common-scatterlist-map-function/20170414-142518
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git for-next
> :: branch date: 8 hours ago
> :: commit date: 8 hours ago
> 
>>> drivers/target/target_core_user.c:512:2-8: preceding lock on line 438
>drivers/target/target_core_user.c:512:2-8: preceding lock on line 471
> 
> git remote add linux-review https://github.com/0day-ci/linux
> git remote update linux-review
> git checkout 78082134e7afdc59d744eb8d2def5c588e89c378
> vim +512 drivers/target/target_core_user.c
> 
> 7c9e7a6f Andy Grover  2014-10-01  432 
> sizeof(struct tcmu_cmd_entry));
> 7c9e7a6f Andy Grover  2014-10-01  433 command_size = base_command_size
> 7c9e7a6f Andy Grover  2014-10-01  434 + 
> round_up(scsi_command_size(se_cmd->t_task_cdb), TCMU_OP_ALIGN_SIZE);
> 7c9e7a6f Andy Grover  2014-10-01  435
> 7c9e7a6f Andy Grover  2014-10-01  436 WARN_ON(command_size & 
> (TCMU_OP_ALIGN_SIZE-1));
> 7c9e7a6f Andy Grover  2014-10-01  437
> 7c9e7a6f Andy Grover  2014-10-01 @438 spin_lock_irq(&udev->cmdr_lock);
> 7c9e7a6f Andy Grover  2014-10-01  439
> 7c9e7a6f Andy Grover  2014-10-01  440 mb = udev->mb_addr;
> 7c9e7a6f Andy Grover  2014-10-01  441 cmd_head = mb->cmd_head % 
> udev->cmdr_size; /* UAM */
> 26418649 Sheng Yang   2016-02-26  442 data_length = 
> se_cmd->data_length;
> 26418649 Sheng Yang   2016-02-26  443 if (se_cmd->se_cmd_flags & 
> SCF_BIDI) {
> 26418649 Sheng Yang   2016-02-26  444 
> BUG_ON(!(se_cmd->t_bidi_data_sg && se_cmd->t_bidi_data_nents));
> 26418649 Sheng Yang   2016-02-26  445 data_length += 
> se_cmd->t_bidi_data_sg->length;
> 26418649 Sheng Yang   2016-02-26  446 }
> 554617b2 Andy Grover  2016-08-25  447 if ((command_size > 
> (udev->cmdr_size / 2)) ||
> 554617b2 Andy Grover  2016-08-25  448 data_length > 
> udev->data_size) {
> 554617b2 Andy Grover  2016-08-25  449 pr_warn("TCMU: Request 
> of size %zu/%zu is too big for %u/%zu "
> 3d9b9555 Andy Grover  2016-08-25  450 "cmd ring/data 
> area\n", command_size, data_length,
> 7c9e7a6f Andy Grover  2014-10-01  451 
> udev->cmdr_size, udev->data_size);
> 554617b2 Andy Grover  2016-08-25  452 
> spin_unlock_irq(&udev->cmdr_lock);
> 554617b2 Andy Grover  2016-08-25  453 return 
> TCM_INVALID_CDB_FIELD;
> 554617b2 Andy Grover  2016-08-25  454 }
> 7c9e7a6f Andy Grover  2014-10-01  455
> 26418649 Sheng Yang   2016-02-26  456 while 
> (!is_ring_space_avail(udev, command_size, data_length)) {
> 7c9e7a6f Andy Grover  2014-10-01  457 int ret;
> 7c9e7a6f Andy Grover  2014-10-01  458 DEFINE_WAIT(__wait);
> 7c9e7a6f Andy Grover  2014-10-01  459
> 7c9e7a6f Andy Grover  2014-10-01  460 
> prepare_to_wait(&udev->wait_cmdr, &__wait, TASK_INTERRUPTIBLE);
> 7c9e7a6f Andy Grover  2014-10-01  461
> 7c9e7a6f Andy Grover  2014-10-01  462 pr_debug("sleeping for 
> ring space\n");
> 7c9e7a6f Andy Grover  2014-10-01  463 
> spin_unlock_irq(&udev->cmdr_lock);
> 7c9e7a6f Andy Grover  2014-10-01  464 ret = 
> schedule_timeout(msecs_to_jiffies(TCMU_TIME_OUT));
> 7c9e7a6f Andy Grover  2014-10-01  465 
> finish_wait(&udev->wait_cmdr, &__wait);
> 7c9e7a6f Andy Grover  2014-10-01  466 if (!ret) {
> 7c9e7a6f Andy Grover  2014-10-01  467 pr_warn("tcmu: 
> command timed out\n");
> 02eb924f Andy Grover  2016-10-06  468 

Re: [PATCH 09/22] dm-crypt: Make use of the new sg_map helper in 4 call sites

2017-04-14 Thread Logan Gunthorpe


On 14/04/17 02:39 AM, Christoph Hellwig wrote:
> On Thu, Apr 13, 2017 at 04:05:22PM -0600, Logan Gunthorpe wrote:
>> Very straightforward conversion to the new function in all four spots.
> 
> I think the right fix here is to switch dm-crypt to the ahash API
> that takes a scatterlist.

Hmm, well I'm not sure I understand the code enough to make that
conversion. But I was looking at it. One tricky bit seems to be that
crypt_iv_lmk_one adds a seed, skips the first 16 bytes in the page and
then hashes another 16 bytes from other data. What would you do:
construct a new sgl for it and pass it to the ahash API?

The other thing is crypt_iv_lmk_post also seems to modify the page after
the hash with a crypto_xor, so you'd still need at least one kmap in there.
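
For reference, my rough understanding of the ahash path is below (a
simplified sketch assuming a synchronous tfm; it leaves out the lmk seed
and xor handling, which is exactly the part I don't know how to express
as an sgl):

#include <crypto/hash.h>
#include <linux/scatterlist.h>

static int digest_sg(struct crypto_ahash *tfm, struct scatterlist *sg,
                     unsigned int nbytes, u8 *out)
{
        struct ahash_request *req;
        int err;

        req = ahash_request_alloc(tfm, GFP_KERNEL);
        if (!req)
                return -ENOMEM;

        /* no completion callback: assumes the tfm is synchronous-only */
        ahash_request_set_callback(req, 0, NULL, NULL);
        ahash_request_set_crypt(req, sg, out, nbytes);

        err = crypto_ahash_digest(req);

        ahash_request_free(req);
        return err;
}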

Logan


Re: [PATCH 03/22] libiscsi: Make use of the new sg_map helper function

2017-04-14 Thread Logan Gunthorpe


On 14/04/17 02:36 AM, Christoph Hellwig wrote:
> On Thu, Apr 13, 2017 at 04:05:16PM -0600, Logan Gunthorpe wrote:
>> Convert the kmap and kmap_atomic uses to the sg_map function. We now
>> store the flags for the kmap instead of a boolean to indicate
>> atomicitiy. We also propogate a possible kmap error down and create
>> a new ISCSI_TCP_INTERNAL_ERR error type for this.
> 
> Can you split out the new error handling into a separate prep patch
> which should go to the iscsi maintainers ASAP?
> 

Yes, I can do that. I'd just have thought they'd want to see the use
case for the new error before accepting a patch like that...

Logan


Re: [PATCH 01/22] scatterlist: Introduce sg_map helper functions

2017-04-14 Thread Logan Gunthorpe


On 14/04/17 02:35 AM, Christoph Hellwig wrote:
>> +
>>  static inline int is_dma_buf_file(struct file *);
>>  
>>  struct dma_buf_list {
> 
> I think the right fix here is to rename the operation to unmap_atomic
> and send out a little patch for that ASAP.

Ok, I can do that next week.

> I'd rather have separate functions for kmap vs kmap_atomic instead of
> the flags parameter.  And while you're at it just always pass the 0
> offset parameter instead of adding a wrapper..
> 
> Otherwise this looks good to me.

I settled on the flags because I thought the interface could be expanded
to do more things like automatically copy iomem to a bounce buffer (with
a flag). It'd also be possible to add things like vmap and
physical_address to the interface which would cover even more sg_page
users. All the implementations would then share the common offset
calculations, and switching between them becomes a matter of changing a
couple of flags.
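
To make that concrete, the core of the flag-switching idea is roughly
this (a simplified sketch, not the actual patch; it also ignores offsets
that cross a page boundary, which a real version would need to handle):

#include <linux/highmem.h>
#include <linux/scatterlist.h>

static void *sg_map_sketch(struct scatterlist *sg, int flags)
{
        struct page *pg = sg_page(sg);
        unsigned int off = sg->offset;

        /* a bounce-buffer or vmap variant would just be another branch here */
        if (flags & SG_KMAP_ATOMIC)
                return kmap_atomic(pg) + off;

        return kmap(pg) + off;
}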

If you're still not convinced by the above arguments  then I'll change
it but I did have reasons for choosing to do it this way.

I am fine with removing the offset versions. I will make that change.

Thanks,

Logan



Re: [PATCH 02/22] nvmet: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe


On 13/04/17 10:59 PM, Christoph Hellwig wrote:
> On Thu, Apr 13, 2017 at 04:05:15PM -0600, Logan Gunthorpe wrote:
>> This is a straight forward conversion in two places. Should kmap fail,
>> the code will return an INVALD_DATA error in the completion.
> 
> It really should be using nvmet_copy_from_sgl to make things safer,
> as we don't want to rely on any particular SG list layout.  In fact
> I'm pretty sure I did the conversion at some point, but it must never
> have made it upstream.

Ha, I did the conversion too a couple of times for my RFC series. I can
change this patch to do that. Or maybe I'll just send a patch for that
separately, seeing as it doesn't depend on anything and is pretty simple. I
can do that next week.

Thanks,

Logan


Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-13 Thread Logan Gunthorpe


On 13/04/17 10:16 PM, Jason Gunthorpe wrote:
> I'd suggest just detecting if there is any translation in bus
> addresses anywhere and just hard disabling P2P on such systems.

That's a fantastic suggestion. It simplifies things significantly.
Unless there are any strong objections, I think I will plan on doing that.

> On modern hardware with 64 bit BARs there is very little reason to
> have translation, so I think this is a legacy feature.

Yes, p2pmem users are likely to be designing systems around it (ie
JBOFs) and not trying to shoehorn it onto legacy architectures.

At the very least, it makes sense to leave it out, and if someone comes
along who cares, they can put in the effort to support the address
translation.

Thanks,

Logan


[PATCH 15/22] scsi: libfc, csiostor: Change to sg_copy_buffer in two drivers

2017-04-13 Thread Logan Gunthorpe
These two drivers appear to duplicate the functionality of
sg_copy_buffer. So we clean them up to use the common code.

This helps us remove a couple of instances that would otherwise be
slightly tricky sg_map usages.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/scsi/csiostor/csio_scsi.c | 54 +++
 drivers/scsi/libfc/fc_libfc.c | 49 ---
 2 files changed, 14 insertions(+), 89 deletions(-)

diff --git a/drivers/scsi/csiostor/csio_scsi.c 
b/drivers/scsi/csiostor/csio_scsi.c
index a1ff75f..bd9d062 100644
--- a/drivers/scsi/csiostor/csio_scsi.c
+++ b/drivers/scsi/csiostor/csio_scsi.c
@@ -1489,60 +1489,14 @@ static inline uint32_t
 csio_scsi_copy_to_sgl(struct csio_hw *hw, struct csio_ioreq *req)
 {
struct scsi_cmnd *scmnd  = (struct scsi_cmnd *)csio_scsi_cmnd(req);
-   struct scatterlist *sg;
-   uint32_t bytes_left;
-   uint32_t bytes_copy;
-   uint32_t buf_off = 0;
-   uint32_t start_off = 0;
-   uint32_t sg_off = 0;
-   void *sg_addr;
-   void *buf_addr;
struct csio_dma_buf *dma_buf;
+   size_t copied;
 
-   bytes_left = scsi_bufflen(scmnd);
-   sg = scsi_sglist(scmnd);
dma_buf = (struct csio_dma_buf *)csio_list_next(&req->gen_list);
+   copied = sg_copy_from_buffer(scsi_sglist(scmnd), scsi_sg_count(scmnd),
+dma_buf->vaddr, scsi_bufflen(scmnd));
 
-   /* Copy data from driver buffer to SGs of SCSI CMD */
-   while (bytes_left > 0 && sg && dma_buf) {
-   if (buf_off >= dma_buf->len) {
-   buf_off = 0;
-   dma_buf = (struct csio_dma_buf *)
-   csio_list_next(dma_buf);
-   continue;
-   }
-
-   if (start_off >= sg->length) {
-   start_off -= sg->length;
-   sg = sg_next(sg);
-   continue;
-   }
-
-   buf_addr = dma_buf->vaddr + buf_off;
-   sg_off = sg->offset + start_off;
-   bytes_copy = min((dma_buf->len - buf_off),
-   sg->length - start_off);
-   bytes_copy = min((uint32_t)(PAGE_SIZE - (sg_off & ~PAGE_MASK)),
-bytes_copy);
-
-   sg_addr = kmap_atomic(sg_page(sg) + (sg_off >> PAGE_SHIFT));
-   if (!sg_addr) {
-   csio_err(hw, "failed to kmap sg:%p of ioreq:%p\n",
-   sg, req);
-   break;
-   }
-
-   csio_dbg(hw, "copy_to_sgl:sg_addr %p sg_off %d buf %p len %d\n",
-   sg_addr, sg_off, buf_addr, bytes_copy);
-   memcpy(sg_addr + (sg_off & ~PAGE_MASK), buf_addr, bytes_copy);
-   kunmap_atomic(sg_addr);
-
-   start_off +=  bytes_copy;
-   buf_off += bytes_copy;
-   bytes_left -= bytes_copy;
-   }
-
-   if (bytes_left > 0)
+   if (copied != scsi_bufflen(scmnd))
return DID_ERROR;
else
return DID_OK;
diff --git a/drivers/scsi/libfc/fc_libfc.c b/drivers/scsi/libfc/fc_libfc.c
index d623d08..ce0805a 100644
--- a/drivers/scsi/libfc/fc_libfc.c
+++ b/drivers/scsi/libfc/fc_libfc.c
@@ -113,45 +113,16 @@ u32 fc_copy_buffer_to_sglist(void *buf, size_t len,
 u32 *nents, size_t *offset,
 u32 *crc)
 {
-   size_t remaining = len;
-   u32 copy_len = 0;
-
-   while (remaining > 0 && sg) {
-   size_t off, sg_bytes;
-   void *page_addr;
-
-   if (*offset >= sg->length) {
-   /*
-* Check for end and drop resources
-* from the last iteration.
-*/
-   if (!(*nents))
-   break;
-   --(*nents);
-   *offset -= sg->length;
-   sg = sg_next(sg);
-   continue;
-   }
-   sg_bytes = min(remaining, sg->length - *offset);
-
-   /*
-* The scatterlist item may be bigger than PAGE_SIZE,
-* but we are limited to mapping PAGE_SIZE at a time.
-*/
-   off = *offset + sg->offset;
-   sg_bytes = min(sg_bytes,
-  (size_t)(PAGE_SIZE - (off & ~PAGE_MASK)));
-   page_addr = kmap_atomic(sg_page(sg) + (off >> PAGE_SHIFT));
-   if (crc)
-   *crc = crc32(*crc, buf, sg_bytes);
-   memcpy((char *)page_addr + (off & ~PAGE_MASK), buf, sg_bytes);
- 

[PATCH 12/22] scsi: ipr, pmcraid, isci: Make use of the new sg_map helper in 4 call sites

2017-04-13 Thread Logan Gunthorpe
Very straightforward conversion of three scsi drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/scsi/ipr.c  | 27 ++-
 drivers/scsi/isci/request.c | 42 +-
 drivers/scsi/pmcraid.c  | 19 ---
 3 files changed, 51 insertions(+), 37 deletions(-)

diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index b29afaf..f98f251 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -3853,7 +3853,7 @@ static void ipr_free_ucode_buffer(struct ipr_sglist 
*sglist)
 static int ipr_copy_ucode_buffer(struct ipr_sglist *sglist,
 u8 *buffer, u32 len)
 {
-   int bsize_elem, i, result = 0;
+   int bsize_elem, i;
struct scatterlist *scatterlist;
void *kaddr;
 
@@ -3863,32 +3863,33 @@ static int ipr_copy_ucode_buffer(struct ipr_sglist 
*sglist,
scatterlist = sglist->scatterlist;
 
for (i = 0; i < (len / bsize_elem); i++, buffer += bsize_elem) {
-   struct page *page = sg_page(&scatterlist[i]);
+   kaddr = sg_map(&scatterlist[i], SG_KMAP);
+   if (IS_ERR(kaddr)) {
+   ipr_trace;
+   return PTR_ERR(kaddr);
+   }
 
-   kaddr = kmap(page);
memcpy(kaddr, buffer, bsize_elem);
-   kunmap(page);
+   sg_unmap(&scatterlist[i], kaddr, SG_KMAP);
 
scatterlist[i].length = bsize_elem;
-
-   if (result != 0) {
-   ipr_trace;
-   return result;
-   }
}
 
if (len % bsize_elem) {
-   struct page *page = sg_page(&scatterlist[i]);
+   kaddr = sg_map(&scatterlist[i], SG_KMAP);
+   if (IS_ERR(kaddr)) {
+   ipr_trace;
+   return PTR_ERR(kaddr);
+   }
 
-   kaddr = kmap(page);
memcpy(kaddr, buffer, len % bsize_elem);
-   kunmap(page);
+   sg_unmap(&scatterlist[i], kaddr, SG_KMAP);
 
scatterlist[i].length = len % bsize_elem;
}
 
sglist->buffer_len = len;
-   return result;
+   return 0;
 }
 
 /**
diff --git a/drivers/scsi/isci/request.c b/drivers/scsi/isci/request.c
index 47f66e9..66d6596 100644
--- a/drivers/scsi/isci/request.c
+++ b/drivers/scsi/isci/request.c
@@ -1424,12 +1424,14 @@ sci_stp_request_pio_data_in_copy_data_buffer(struct 
isci_stp_request *stp_req,
sg = task->scatter;
 
while (total_len > 0) {
-   struct page *page = sg_page(sg);
-
copy_len = min_t(int, total_len, sg_dma_len(sg));
-   kaddr = kmap_atomic(page);
-   memcpy(kaddr + sg->offset, src_addr, copy_len);
-   kunmap_atomic(kaddr);
+   kaddr = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(kaddr))
+   return SCI_FAILURE;
+
+   memcpy(kaddr, src_addr, copy_len);
+   sg_unmap(sg, kaddr, SG_KMAP_ATOMIC);
+
total_len -= copy_len;
src_addr += copy_len;
sg = sg_next(sg);
@@ -1771,14 +1773,16 @@ sci_io_request_frame_handler(struct isci_request *ireq,
case SCI_REQ_SMP_WAIT_RESP: {
struct sas_task *task = isci_request_access_task(ireq);
struct scatterlist *sg = &task->smp_task.smp_resp;
-   void *frame_header, *kaddr;
+   void *frame_header;
u8 *rsp;
 
sci_unsolicited_frame_control_get_header(&ihost->uf_control,
 frame_index,
&frame_header);
-   kaddr = kmap_atomic(sg_page(sg));
-   rsp = kaddr + sg->offset;
+   rsp = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(rsp))
+   return SCI_FAILURE;
+
sci_swab32_cpy(rsp, frame_header, 1);
 
if (rsp[0] == SMP_RESPONSE) {
@@ -1814,7 +1818,7 @@ sci_io_request_frame_handler(struct isci_request *ireq,
ireq->sci_status = 
SCI_FAILURE_CONTROLLER_SPECIFIC_IO_ERR;
sci_change_state(>sm, SCI_REQ_COMPLETED);
}
-   kunmap_atomic(kaddr);
+   sg_unmap(sg, rsp, SG_KMAP_ATOMIC);
 
sci_controller_release_frame(ihost, frame_index);
 
@@ -2919,15 +2923,18 @@ static void isci_request_io_request_complete(struct 
isci_host *ihost,
case SAS_PROTOCOL_SMP: {
struct scatterlist *sg = &task->smp_task.smp_req;
struct smp_req *smp_req;
-   void *kaddr;
 
dma_unmap_sg(&ihost->pdev->dev, sg, 1, DMA_TO_DEVICE);
 
/* need to sw

[PATCH 07/22] crypto: shash, caam: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Very straightforward conversion to the new function in two crypto
drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 crypto/shash.c| 9 ++---
 drivers/crypto/caam/caamalg.c | 8 +++-
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/crypto/shash.c b/crypto/shash.c
index 5e31c8d..2b7de94 100644
--- a/crypto/shash.c
+++ b/crypto/shash.c
@@ -283,10 +283,13 @@ int shash_ahash_digest(struct ahash_request *req, struct 
shash_desc *desc)
if (nbytes < min(sg->length, ((unsigned int)(PAGE_SIZE)) - offset)) {
void *data;
 
-   data = kmap_atomic(sg_page(sg));
-   err = crypto_shash_digest(desc, data + offset, nbytes,
+   data = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(data))
+   return PTR_ERR(data);
+
+   err = crypto_shash_digest(desc, data, nbytes,
  req->result);
-   kunmap_atomic(data);
+   sg_unmap(sg, data, SG_KMAP_ATOMIC);
crypto_yield(desc->flags);
} else
err = crypto_shash_init(desc) ?:
diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 9bc80eb..76b97de 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -89,7 +89,6 @@ static void dbg_dump_sg(const char *level, const char 
*prefix_str,
struct scatterlist *sg, size_t tlen, bool ascii)
 {
struct scatterlist *it;
-   void *it_page;
size_t len;
void *buf;
 
@@ -98,19 +97,18 @@ static void dbg_dump_sg(const char *level, const char 
*prefix_str,
 * make sure the scatterlist's page
 * has a valid virtual memory mapping
 */
-   it_page = kmap_atomic(sg_page(it));
-   if (unlikely(!it_page)) {
+   buf = sg_map(it, SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
printk(KERN_ERR "dbg_dump_sg: kmap failed\n");
return;
}
 
-   buf = it_page + it->offset;
len = min_t(size_t, tlen, it->length);
print_hex_dump(level, prefix_str, prefix_type, rowsize,
   groupsize, buf, len, ascii);
tlen -= len;
 
-   kunmap_atomic(it_page);
+   sg_unmap(it, buf, SG_KMAP_ATOMIC);
}
 }
 #endif
-- 
2.1.4



[PATCH 03/22] libiscsi: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Convert the kmap and kmap_atomic uses to the sg_map function. We now
store the flags for the kmap instead of a boolean to indicate
atomicity. We also propagate a possible kmap error down and create
a new ISCSI_TCP_INTERNAL_ERR error type for this.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/scsi/cxgbi/libcxgbi.c |  5 +
 drivers/scsi/libiscsi_tcp.c   | 32 
 include/scsi/libiscsi_tcp.h   |  3 ++-
 3 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index bd7d39e..e38d0c1 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -1556,6 +1556,11 @@ static inline int read_pdu_skb(struct iscsi_conn *conn,
 */
iscsi_conn_printk(KERN_ERR, conn, "Invalid pdu or skb.");
return -EFAULT;
+   case ISCSI_TCP_INTERNAL_ERR:
+   pr_info("skb 0x%p, off %u, %d, TCP_INTERNAL_ERR.\n",
+   skb, offset, offloaded);
+   iscsi_conn_printk(KERN_ERR, conn, "Internal error.");
+   return -EFAULT;
case ISCSI_TCP_SEGMENT_DONE:
log_debug(1 << CXGBI_DBG_PDU_RX,
"skb 0x%p, off %u, %d, TCP_SEG_DONE, rc %d.\n",
diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c
index 63a1d69..a2427699 100644
--- a/drivers/scsi/libiscsi_tcp.c
+++ b/drivers/scsi/libiscsi_tcp.c
@@ -133,25 +133,23 @@ static void iscsi_tcp_segment_map(struct iscsi_segment 
*segment, int recv)
if (page_count(sg_page(sg)) >= 1 && !recv)
return;
 
-   if (recv) {
-   segment->atomic_mapped = true;
-   segment->sg_mapped = kmap_atomic(sg_page(sg));
-   } else {
-   segment->atomic_mapped = false;
-   /* the xmit path can sleep with the page mapped so use kmap */
-   segment->sg_mapped = kmap(sg_page(sg));
+   /* the xmit path can sleep with the page mapped so don't use atomic */
+   segment->sg_map_flags = recv ? SG_KMAP_ATOMIC : SG_KMAP;
+   segment->sg_mapped = sg_map(sg, segment->sg_map_flags);
+
+   if (IS_ERR(segment->sg_mapped)) {
+   segment->sg_mapped = NULL;
+   return;
}
 
-   segment->data = segment->sg_mapped + sg->offset + segment->sg_offset;
+   segment->data = segment->sg_mapped + segment->sg_offset;
 }
 
 void iscsi_tcp_segment_unmap(struct iscsi_segment *segment)
 {
if (segment->sg_mapped) {
-   if (segment->atomic_mapped)
-   kunmap_atomic(segment->sg_mapped);
-   else
-   kunmap(sg_page(segment->sg));
+   sg_unmap(segment->sg, segment->sg_mapped,
+ segment->sg_map_flags);
segment->sg_mapped = NULL;
segment->data = NULL;
}
@@ -304,6 +302,9 @@ iscsi_tcp_segment_recv(struct iscsi_tcp_conn *tcp_conn,
break;
}
 
+   if (segment->data)
+   return -EFAULT;
+
copy = min(len - copied, segment->size - segment->copied);
ISCSI_DBG_TCP(tcp_conn->iscsi_conn, "copying %d\n", copy);
memcpy(segment->data + segment->copied, ptr + copied, copy);
@@ -927,6 +928,13 @@ int iscsi_tcp_recv_skb(struct iscsi_conn *conn, struct 
sk_buff *skb,
  avail);
rc = iscsi_tcp_segment_recv(tcp_conn, segment, ptr, avail);
BUG_ON(rc == 0);
+   if (rc < 0) {
+   ISCSI_DBG_TCP(conn, "memory fault. Consumed %d\n",
+ consumed);
+   *status = ISCSI_TCP_INTERNAL_ERR;
+   goto skb_done;
+   }
+
consumed += rc;
 
if (segment->total_copied >= segment->total_size) {
diff --git a/include/scsi/libiscsi_tcp.h b/include/scsi/libiscsi_tcp.h
index 30520d5..58c79af 100644
--- a/include/scsi/libiscsi_tcp.h
+++ b/include/scsi/libiscsi_tcp.h
@@ -47,7 +47,7 @@ struct iscsi_segment {
struct scatterlist  *sg;
void*sg_mapped;
unsigned intsg_offset;
-   boolatomic_mapped;
+   int sg_map_flags;
 
iscsi_segment_done_fn_t *done;
 };
@@ -92,6 +92,7 @@ enum {
ISCSI_TCP_SKB_DONE, /* skb is out of data */
ISCSI_TCP_CONN_ERR, /* iscsi layer has fired a conn err */
ISCSI_TCP_SUSPENDED,/* conn is suspended */
+   ISCSI_TCP_INTERNAL_ERR, /* an internal error occurred */
 };
 
 extern void iscsi_tcp_hdr_recv_prep(struct iscsi_tcp_conn *tcp_conn);
-- 
2.1.4



[PATCH 10/22] staging: unisys: visorbus: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Straightforward conversion to the new function.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/staging/unisys/visorhba/visorhba_main.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/unisys/visorhba/visorhba_main.c 
b/drivers/staging/unisys/visorhba/visorhba_main.c
index 0ce92c8..2d8c8bc 100644
--- a/drivers/staging/unisys/visorhba/visorhba_main.c
+++ b/drivers/staging/unisys/visorhba/visorhba_main.c
@@ -842,7 +842,6 @@ do_scsi_nolinuxstat(struct uiscmdrsp *cmdrsp, struct 
scsi_cmnd *scsicmd)
struct scatterlist *sg;
unsigned int i;
char *this_page;
-   char *this_page_orig;
int bufind = 0;
struct visordisk_info *vdisk;
struct visorhba_devdata *devdata;
@@ -869,11 +868,14 @@ do_scsi_nolinuxstat(struct uiscmdrsp *cmdrsp, struct 
scsi_cmnd *scsicmd)
 
sg = scsi_sglist(scsicmd);
for (i = 0; i < scsi_sg_count(scsicmd); i++) {
-   this_page_orig = kmap_atomic(sg_page(sg + i));
-   this_page = (void *)((unsigned long)this_page_orig |
-sg[i].offset);
+   this_page = sg_map(sg + i, SG_KMAP_ATOMIC);
+   if (IS_ERR(this_page)) {
+   scsicmd->result = DID_ERROR << 16;
+   return;
+   }
+
memcpy(this_page, buf + bufind, sg[i].length);
-   kunmap_atomic(this_page_orig);
+   sg_unmap(sg + i, this_page, SG_KMAP_ATOMIC);
}
} else {
devdata = (struct visorhba_devdata *)scsidev->host->hostdata;
-- 
2.1.4



[PATCH 22/22] memstick: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Straightforward conversion, but we have to WARN if unmappable
memory finds its way into the sgl.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/memstick/host/jmb38x_ms.c | 23 ++-
 drivers/memstick/host/tifm_ms.c   | 22 +-
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/drivers/memstick/host/jmb38x_ms.c 
b/drivers/memstick/host/jmb38x_ms.c
index 48db922..256cf41 100644
--- a/drivers/memstick/host/jmb38x_ms.c
+++ b/drivers/memstick/host/jmb38x_ms.c
@@ -303,7 +303,6 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host 
*host)
unsigned int off;
unsigned int t_size, p_cnt;
unsigned char *buf;
-   struct page *pg;
unsigned long flags = 0;
 
if (host->req->long_data) {
@@ -318,14 +317,26 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host 
*host)
unsigned int uninitialized_var(p_off);
 
if (host->req->long_data) {
-   pg = nth_page(sg_page(&host->req->sg),
- off >> PAGE_SHIFT);
p_off = offset_in_page(off);
p_cnt = PAGE_SIZE - p_off;
p_cnt = min(p_cnt, length);
 
local_irq_save(flags);
-   buf = kmap_atomic(pg) + p_off;
+   buf = sg_map_offset(&host->req->sg,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   break;
+   }
+
} else {
buf = host->req->data + host->block_pos;
p_cnt = host->req->data_len - host->block_pos;
@@ -341,7 +352,9 @@ static int jmb38x_ms_transfer_data(struct jmb38x_ms_host 
*host)
 : jmb38x_ms_read_reg_data(host, buf, p_cnt);
 
if (host->req->long_data) {
-   kunmap_atomic(buf - p_off);
+   sg_unmap_offset(&host->req->sg, buf,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC);
local_irq_restore(flags);
}
 
diff --git a/drivers/memstick/host/tifm_ms.c b/drivers/memstick/host/tifm_ms.c
index 7bafa72..c0bc40e 100644
--- a/drivers/memstick/host/tifm_ms.c
+++ b/drivers/memstick/host/tifm_ms.c
@@ -186,7 +186,6 @@ static unsigned int tifm_ms_transfer_data(struct tifm_ms 
*host)
unsigned int off;
unsigned int t_size, p_cnt;
unsigned char *buf;
-   struct page *pg;
unsigned long flags = 0;
 
if (host->req->long_data) {
@@ -203,14 +202,25 @@ static unsigned int tifm_ms_transfer_data(struct tifm_ms 
*host)
unsigned int uninitialized_var(p_off);
 
if (host->req->long_data) {
-   pg = nth_page(sg_page(&host->req->sg),
- off >> PAGE_SHIFT);
p_off = offset_in_page(off);
p_cnt = PAGE_SIZE - p_off;
p_cnt = min(p_cnt, length);
 
local_irq_save(flags);
-   buf = kmap_atomic(pg) + p_off;
+   buf = sg_map_offset(&host->req->sg,
+off - host->req->sg.offset,
+SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   break;
+   }
} else {
buf = host->req->data + host->block_pos;
p_cnt = host->req->data_len - host->block_pos;
@@ -221,7 +231,9 @@ sta

[PATCH 04/22] target: Make use of the new sg_map function at 16 call sites

2017-04-13 Thread Logan Gunthorpe
Fairly straightforward conversions in all spots. In a couple of cases
any error gets propagated up should sg_map fail. In other
cases a warning is issued if the kmap fails, seeing as there's no
clear error path. This should not be an issue until someone tries to
use unmappable memory in the sgl with this driver.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/target/iscsi/iscsi_target.c|  27 +---
 drivers/target/target_core_rd.c|   3 +-
 drivers/target/target_core_sbc.c   | 122 +++--
 drivers/target/target_core_transport.c |  18 +++--
 drivers/target/target_core_user.c  |  43 
 include/target/target_core_backend.h   |   4 +-
 6 files changed, 149 insertions(+), 68 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target.c 
b/drivers/target/iscsi/iscsi_target.c
index a918024..e3e0d8f 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -579,7 +579,7 @@ iscsit_xmit_nondatain_pdu(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd,
 }
 
 static int iscsit_map_iovec(struct iscsi_cmd *, struct kvec *, u32, u32);
-static void iscsit_unmap_iovec(struct iscsi_cmd *);
+static void iscsit_unmap_iovec(struct iscsi_cmd *, struct kvec *);
 static u32 iscsit_do_crypto_hash_sg(struct ahash_request *, struct iscsi_cmd *,
u32, u32, u32, u8 *);
 static int
@@ -646,7 +646,7 @@ iscsit_xmit_datain_pdu(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd,
 
ret = iscsit_fe_sendpage_sg(cmd, conn);
 
-   iscsit_unmap_iovec(cmd);
+   iscsit_unmap_iovec(cmd, &cmd->iov_data[1]);
 
if (ret < 0) {
iscsit_tx_thread_wait_for_tcp(conn);
@@ -925,7 +925,10 @@ static int iscsit_map_iovec(
while (data_length) {
u32 cur_len = min_t(u32, data_length, sg->length - page_off);
 
-   iov[i].iov_base = kmap(sg_page(sg)) + sg->offset + page_off;
+   iov[i].iov_base = sg_map_offset(sg, page_off, SG_KMAP);
+   if (IS_ERR(iov[i].iov_base))
+   goto map_err;
+
iov[i].iov_len = cur_len;
 
data_length -= cur_len;
@@ -937,17 +940,25 @@ static int iscsit_map_iovec(
cmd->kmapped_nents = i;
 
return i;
+
+map_err:
+   cmd->kmapped_nents = i - 1;
+   iscsit_unmap_iovec(cmd, iov);
+   return -1;
 }
 
-static void iscsit_unmap_iovec(struct iscsi_cmd *cmd)
+static void iscsit_unmap_iovec(struct iscsi_cmd *cmd, struct kvec *iov)
 {
u32 i;
struct scatterlist *sg;
+   unsigned int page_off = cmd->first_data_sg_off;
 
sg = cmd->first_data_sg;
 
-   for (i = 0; i < cmd->kmapped_nents; i++)
-   kunmap(sg_page(&sg[i]));
+   for (i = 0; i < cmd->kmapped_nents; i++) {
+   sg_unmap_offset(&sg[i], iov[i].iov_base, page_off, SG_KMAP);
+   page_off = 0;
+   }
 }
 
 static void iscsit_ack_from_expstatsn(struct iscsi_conn *conn, u32 exp_statsn)
@@ -1610,7 +1621,7 @@ iscsit_get_dataout(struct iscsi_conn *conn, struct 
iscsi_cmd *cmd,
 
rx_got = rx_data(conn, &cmd->iov_data[0], iov_count, rx_size);
 
-   iscsit_unmap_iovec(cmd);
+   iscsit_unmap_iovec(cmd, iov);
 
if (rx_got != rx_size)
return -1;
@@ -2626,7 +2637,7 @@ static int iscsit_handle_immediate_data(
 
rx_got = rx_data(conn, &cmd->iov_data[0], iov_count, rx_size);
 
-   iscsit_unmap_iovec(cmd);
+   iscsit_unmap_iovec(cmd, cmd->iov_data);
 
if (rx_got != rx_size) {
iscsit_rx_thread_wait_for_tcp(conn);
diff --git a/drivers/target/target_core_rd.c b/drivers/target/target_core_rd.c
index ddc216c..22c5ad5 100644
--- a/drivers/target/target_core_rd.c
+++ b/drivers/target/target_core_rd.c
@@ -431,7 +431,8 @@ static sense_reason_t rd_do_prot_rw(struct se_cmd *cmd, 
bool is_read)
cmd->t_prot_sg, 0);
 
if (!rc)
-   sbc_dif_copy_prot(cmd, sectors, is_read, prot_sg, prot_offset);
+   rc = sbc_dif_copy_prot(cmd, sectors, is_read, prot_sg,
+  prot_offset);
 
return rc;
 }
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c
index c194063..67cb420 100644
--- a/drivers/target/target_core_sbc.c
+++ b/drivers/target/target_core_sbc.c
@@ -420,17 +420,17 @@ static sense_reason_t xdreadwrite_callback(struct se_cmd 
*cmd, bool success,
 
offset = 0;
for_each_sg(cmd->t_bidi_data_sg, sg, cmd->t_bidi_data_nents, count) {
-   addr = kmap_atomic(sg_page(sg));
-   if (!addr) {
+   addr = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(addr)) {
ret = TCM_OUT_OF_RESOURCES;
goto out;
}
 
for (i = 0; i < sg->length; i++)
-   

[PATCH 18/22] mmc: spi: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
We use the sg_map helper, but it's slightly more complicated here:
we only check for the error when the mapping actually gets used,
so if the mapping failed but was never needed, no error
occurs.
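
Roughly, the deferred-check pattern looks like this (just a sketch;
'need_cpu_access' and 'process' are illustrative names, not the
driver's own):

	buf = sg_map(sg, SG_KMAP);	/* may be an ERR_PTR; not checked yet */

	if (need_cpu_access) {		/* the mapping only matters here */
		if (IS_ERR(buf))
			return PTR_ERR(buf);
		process(buf, len);
	}

	if (!IS_ERR(buf))
		sg_unmap(sg, buf, SG_KMAP);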

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/mmc/host/mmc_spi.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/mmc/host/mmc_spi.c b/drivers/mmc/host/mmc_spi.c
index e77d79c..82f786d 100644
--- a/drivers/mmc/host/mmc_spi.c
+++ b/drivers/mmc/host/mmc_spi.c
@@ -676,9 +676,15 @@ mmc_spi_writeblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
struct scratch  *scratch = host->data;
u32 pattern;
 
-   if (host->mmc->use_spi_crc)
+   if (host->mmc->use_spi_crc) {
+   if (IS_ERR(t->tx_buf))
+   return PTR_ERR(t->tx_buf);
+
scratch->crc_val = cpu_to_be16(
crc_itu_t(0, t->tx_buf, t->len));
+   t->tx_buf += t->len;
+   }
+
if (host->dma_dev)
dma_sync_single_for_device(host->dma_dev,
host->data_dma, sizeof(*scratch),
@@ -743,7 +749,6 @@ mmc_spi_writeblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
return status;
}
 
-   t->tx_buf += t->len;
if (host->dma_dev)
t->tx_dma += t->len;
 
@@ -809,6 +814,11 @@ mmc_spi_readblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
}
leftover = status << 1;
 
+   if (bitshift || host->mmc->use_spi_crc) {
+   if (IS_ERR(t->rx_buf))
+   return PTR_ERR(t->rx_buf);
+   }
+
if (host->dma_dev) {
dma_sync_single_for_device(host->dma_dev,
host->data_dma, sizeof(*scratch),
@@ -860,9 +870,10 @@ mmc_spi_readblock(struct mmc_spi_host *host, struct 
spi_transfer *t,
scratch->crc_val, crc, t->len);
return -EILSEQ;
}
+
+   t->rx_buf += t->len;
}
 
-   t->rx_buf += t->len;
if (host->dma_dev)
t->rx_dma += t->len;
 
@@ -936,11 +947,11 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct 
mmc_command *cmd,
}
 
/* allow pio too; we don't allow highmem */
-   kmap_addr = kmap(sg_page(sg));
+   kmap_addr = sg_map(sg, SG_KMAP);
if (direction == DMA_TO_DEVICE)
-   t->tx_buf = kmap_addr + sg->offset;
+   t->tx_buf = kmap_addr;
else
-   t->rx_buf = kmap_addr + sg->offset;
+   t->rx_buf = kmap_addr;
 
/* transfer each block, and update request status */
while (length) {
@@ -970,7 +981,8 @@ mmc_spi_data_do(struct mmc_spi_host *host, struct 
mmc_command *cmd,
/* discard mappings */
if (direction == DMA_FROM_DEVICE)
flush_kernel_dcache_page(sg_page(sg));
-   kunmap(sg_page(sg));
+   if (!IS_ERR(kmap_addr))
+   sg_unmap(sg, kmap_addr, SG_KMAP);
if (dma_dev)
dma_unmap_page(dma_dev, dma_addr, PAGE_SIZE, dir);
 
-- 
2.1.4



[PATCH 19/22] mmc: tmio: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Straightforward conversion to the sg_map helper. A couple of paths will
WARN if the memory does not end up being mappable.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/mmc/host/tmio_mmc.h | 12 ++--
 drivers/mmc/host/tmio_mmc_dma.c |  5 +
 drivers/mmc/host/tmio_mmc_pio.c | 24 
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/tmio_mmc.h b/drivers/mmc/host/tmio_mmc.h
index 2b349d4..ba68c9fed 100644
--- a/drivers/mmc/host/tmio_mmc.h
+++ b/drivers/mmc/host/tmio_mmc.h
@@ -198,17 +198,25 @@ void tmio_mmc_enable_mmc_irqs(struct tmio_mmc_host *host, 
u32 i);
 void tmio_mmc_disable_mmc_irqs(struct tmio_mmc_host *host, u32 i);
 irqreturn_t tmio_mmc_irq(int irq, void *devid);
 
+/* Note: this function may return PTR_ERR and must be checked! */
 static inline char *tmio_mmc_kmap_atomic(struct scatterlist *sg,
 unsigned long *flags)
 {
+   void *ret;
+
local_irq_save(*flags);
-   return kmap_atomic(sg_page(sg)) + sg->offset;
+   ret = sg_map(sg, SG_KMAP_ATOMIC);
+
+   if (IS_ERR(ret))
+   local_irq_restore(*flags);
+
+   return ret;
 }
 
 static inline void tmio_mmc_kunmap_atomic(struct scatterlist *sg,
  unsigned long *flags, void *virt)
 {
-   kunmap_atomic(virt - sg->offset);
+   sg_unmap(sg, virt, SG_KMAP_ATOMIC);
local_irq_restore(*flags);
 }
 
diff --git a/drivers/mmc/host/tmio_mmc_dma.c b/drivers/mmc/host/tmio_mmc_dma.c
index fa8a936..07531f7 100644
--- a/drivers/mmc/host/tmio_mmc_dma.c
+++ b/drivers/mmc/host/tmio_mmc_dma.c
@@ -149,6 +149,11 @@ static void tmio_mmc_start_dma_tx(struct tmio_mmc_host 
*host)
if (!aligned) {
unsigned long flags;
void *sg_vaddr = tmio_mmc_kmap_atomic(sg, &flags);
+   if (IS_ERR(sg_vaddr)) {
+   ret = PTR_ERR(sg_vaddr);
+   goto pio;
+   }
+
sg_init_one(&host->bounce_sg, host->bounce_buf, sg->length);
memcpy(host->bounce_buf, sg_vaddr, host->bounce_sg.length);
tmio_mmc_kunmap_atomic(sg, &flags, sg_vaddr);
diff --git a/drivers/mmc/host/tmio_mmc_pio.c b/drivers/mmc/host/tmio_mmc_pio.c
index 6b789a7..d6fdbf6 100644
--- a/drivers/mmc/host/tmio_mmc_pio.c
+++ b/drivers/mmc/host/tmio_mmc_pio.c
@@ -479,6 +479,18 @@ static void tmio_mmc_pio_irq(struct tmio_mmc_host *host)
}
 
sg_virt = tmio_mmc_kmap_atomic(host->sg_ptr, &flags);
+   if (IS_ERR(sg_virt)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return;
+   }
+
buf = (unsigned short *)(sg_virt + host->sg_off);
 
count = host->sg_ptr->length - host->sg_off;
@@ -506,6 +518,18 @@ static void tmio_mmc_check_bounce_buffer(struct 
tmio_mmc_host *host)
if (host->sg_ptr == &host->bounce_sg) {
unsigned long flags;
void *sg_vaddr = tmio_mmc_kmap_atomic(host->sg_orig, &flags);
+   if (IS_ERR(sg_vaddr)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return;
+   }
+
memcpy(sg_vaddr, host->bounce_buf, host->bounce_sg.length);
tmio_mmc_kunmap_atomic(host->sg_orig, &flags, sg_vaddr);
}
-- 
2.1.4



[PATCH 16/22] xen-blkfront: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Straightforward conversion to the new helper, except that, due to
the lack of an error path, we have to warn if unmappable memory
is ever present in the sgl.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/block/xen-blkfront.c | 33 +++--
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c
index 5067a0a..7dcf41d 100644
--- a/drivers/block/xen-blkfront.c
+++ b/drivers/block/xen-blkfront.c
@@ -807,8 +807,19 @@ static int blkif_queue_rw_req(struct request *req, struct 
blkfront_ring_info *ri
BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
if (setup.need_copy) {
-   setup.bvec_off = sg->offset;
-   setup.bvec_data = kmap_atomic(sg_page(sg));
+   setup.bvec_off = 0;
+   setup.bvec_data = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(setup.bvec_data)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there is a
+* questionable error path out of here,
+* we WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return 1;
+   }
}
 
gnttab_foreach_grant_in_range(sg_page(sg),
@@ -818,7 +829,7 @@ static int blkif_queue_rw_req(struct request *req, struct 
blkfront_ring_info *ri
  &setup);
 
if (setup.need_copy)
-   kunmap_atomic(setup.bvec_data);
+   sg_unmap(sg, setup.bvec_data, SG_KMAP_ATOMIC);
}
if (setup.segments)
kunmap_atomic(setup.segments);
@@ -1468,8 +1479,18 @@ static bool blkif_completion(unsigned long *id,
for_each_sg(s->sg, sg, num_sg, i) {
BUG_ON(sg->offset + sg->length > PAGE_SIZE);
 
-   data.bvec_offset = sg->offset;
-   data.bvec_data = kmap_atomic(sg_page(sg));
+   data.bvec_offset = 0;
+   data.bvec_data = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(data.bvec_data)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there is no
+* clear error path, we WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return 1;
+   }
 
gnttab_foreach_grant_in_range(sg_page(sg),
  sg->offset,
@@ -1477,7 +1498,7 @@ static bool blkif_completion(unsigned long *id,
  blkif_copy_from_grant,
  &data);
 
-   kunmap_atomic(data.bvec_data);
+   sg_unmap(sg, data.bvec_data, SG_KMAP_ATOMIC);
}
}
/* Add the persistent grant into the list of free grants */
-- 
2.1.4



[PATCH 05/22] drm/i915: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
This is a single straightforward conversion from kmap to sg_map.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 67b1fc5..1b1b91a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2188,6 +2188,15 @@ static void __i915_gem_object_reset_page_iter(struct 
drm_i915_gem_object *obj)
radix_tree_delete(&obj->mm.get_page.radix, iter.index);
 }
 
+static void i915_gem_object_unmap(const struct drm_i915_gem_object *obj,
+ void *ptr)
+{
+   if (is_vmalloc_addr(ptr))
+   vunmap(ptr);
+   else
+   sg_unmap(obj->mm.pages->sgl, ptr, SG_KMAP);
+}
+
 void __i915_gem_object_put_pages(struct drm_i915_gem_object *obj,
 enum i915_mm_subclass subclass)
 {
@@ -2215,10 +2224,7 @@ void __i915_gem_object_put_pages(struct 
drm_i915_gem_object *obj,
void *ptr;
 
ptr = ptr_mask_bits(obj->mm.mapping);
-   if (is_vmalloc_addr(ptr))
-   vunmap(ptr);
-   else
-   kunmap(kmap_to_page(ptr));
+   i915_gem_object_unmap(obj, ptr);
 
obj->mm.mapping = NULL;
}
@@ -2475,8 +2481,11 @@ static void *i915_gem_object_map(const struct 
drm_i915_gem_object *obj,
void *addr;
 
/* A single page can always be kmapped */
-   if (n_pages == 1 && type == I915_MAP_WB)
-   return kmap(sg_page(sgt->sgl));
+   if (n_pages == 1 && type == I915_MAP_WB) {
+   addr = sg_map(sgt->sgl, SG_KMAP);
+   if (IS_ERR(addr))
+   return NULL;
+   }
 
if (n_pages > ARRAY_SIZE(stack_pages)) {
/* Too big for stack -- allocate temporary array instead */
@@ -2543,11 +2552,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object 
*obj,
goto err_unpin;
}
 
-   if (is_vmalloc_addr(ptr))
-   vunmap(ptr);
-   else
-   kunmap(kmap_to_page(ptr));
-
+   i915_gem_object_unmap(obj, ptr);
ptr = obj->mm.mapping = NULL;
}
 
-- 
2.1.4



[PATCH 06/22] crypto: hifn_795x: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Conversion of a couple of kmap_atomic instances to the sg_map helper
function.

However, it looks like there was a bug in the original code: the source
scatterlist's offset (t->offset) was passed to ablkcipher_get, which
added it to the destination address. This doesn't make a lot of
sense, but t->offset is likely always zero anyway. So, this patch cleans
that brokenness up.

Also, a change to the error path: if ablkcipher_get failed, everything
seemed to proceed as if it hadn't. Setting 'error' should hopefully
clear that up.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/crypto/hifn_795x.c | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/crypto/hifn_795x.c b/drivers/crypto/hifn_795x.c
index e09d405..8e2c6a9 100644
--- a/drivers/crypto/hifn_795x.c
+++ b/drivers/crypto/hifn_795x.c
@@ -1619,7 +1619,7 @@ static int hifn_start_device(struct hifn_device *dev)
return 0;
 }
 
-static int ablkcipher_get(void *saddr, unsigned int *srestp, unsigned int 
offset,
+static int ablkcipher_get(void *saddr, unsigned int *srestp,
struct scatterlist *dst, unsigned int size, unsigned int 
*nbytesp)
 {
unsigned int srest = *srestp, nbytes = *nbytesp, copy;
@@ -1632,15 +1632,17 @@ static int ablkcipher_get(void *saddr, unsigned int 
*srestp, unsigned int offset
while (size) {
copy = min3(srest, dst->length, size);
 
-   daddr = kmap_atomic(sg_page(dst));
-   memcpy(daddr + dst->offset + offset, saddr, copy);
-   kunmap_atomic(daddr);
+   daddr = sg_map(dst, SG_KMAP_ATOMIC);
+   if (IS_ERR(daddr))
+   return PTR_ERR(daddr);
+
+   memcpy(daddr, saddr, copy);
+   sg_unmap(dst, daddr, SG_KMAP_ATOMIC);
 
nbytes -= copy;
size -= copy;
srest -= copy;
saddr += copy;
-   offset = 0;
 
pr_debug("%s: copy: %u, size: %u, srest: %u, nbytes: %u.\n",
 __func__, copy, size, srest, nbytes);
@@ -1671,11 +1673,12 @@ static inline void hifn_complete_sa(struct hifn_device 
*dev, int i)
 
 static void hifn_process_ready(struct ablkcipher_request *req, int error)
 {
+   int err;
struct hifn_request_context *rctx = ablkcipher_request_ctx(req);
 
if (rctx->walk.flags & ASYNC_FLAGS_MISALIGNED) {
unsigned int nbytes = req->nbytes;
-   int idx = 0, err;
+   int idx = 0;
struct scatterlist *dst, *t;
void *saddr;
 
@@ -1695,17 +1698,24 @@ static void hifn_process_ready(struct 
ablkcipher_request *req, int error)
continue;
}
 
-   saddr = kmap_atomic(sg_page(t));
+   saddr = sg_map(t, SG_KMAP_ATOMIC);
+   if (IS_ERR(saddr)) {
+   if (!error)
+   error = PTR_ERR(saddr);
+   break;
+   }
+
+   err = ablkcipher_get(saddr, &t->length,
+dst, nbytes, &nbytes);
+   sg_unmap(t, saddr, SG_KMAP_ATOMIC);
 
-   err = ablkcipher_get(saddr, &t->length, t->offset,
-   dst, nbytes, &nbytes);
if (err < 0) {
-   kunmap_atomic(saddr);
+   if (!error)
+   error = err;
break;
}
 
idx += err;
-   kunmap_atomic(saddr);
}
 
hifn_cipher_walk_exit(>walk);
-- 
2.1.4



[PATCH 00/22] Introduce common scatterlist map function

2017-04-13 Thread Logan Gunthorpe
Hi Everyone,

As part of my effort to enable P2P DMA transactions with PCI cards,
we've identified the need to be able to safely put IO memory into
scatterlists (and eventually other spots). This probably involves a
conversion from struct page to pfn_t but that migration is a ways off
and those decisions are yet to be made.

As an initial step in that direction, I've started cleaning up some of the
scatterlist code by trying to carve out a better defined layer between it
and its users. The longer term goal would be to remove sg_page or replace
it with something that can potentially fail.

This patchset is the first step in that effort. I've introduced
a common function to map scatterlist memory and converted all the common
kmap(sg_page()) cases. This removes about 66 sg_page calls (of ~331).
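
For reference, the typical conversion looks roughly like the following
(illustrative only; the exact error handling varies from driver to
driver):

	/* before: */
	buf = kmap_atomic(sg_page(sg)) + sg->offset;
	/* ... access buf ... */
	kunmap_atomic(buf - sg->offset);

	/* after: */
	buf = sg_map(sg, SG_KMAP_ATOMIC);
	if (IS_ERR(buf))
		return PTR_ERR(buf);
	/* ... access buf ... */
	sg_unmap(sg, buf, SG_KMAP_ATOMIC);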

Seeing this is a fairly large cleanup set that touches a wide swath of
the kernel I have limited the people I've sent this to. I'd suggest we look
toward merging the first patch and then I can send the individual subsystem
patches on to their respective maintainers and get them merged
independantly. (This is to avoid the conflicts I created with my last
cleanup set... Sorry) Though, I'm certainly open to other suggestions to get
it merged.

The patchset is based on v4.11-rc6 and can be found in the sg_map
branch from this git tree:

https://github.com/sbates130272/linux-p2pmem.git

Thanks,

Logan


Logan Gunthorpe (22):
  scatterlist: Introduce sg_map helper functions
  nvmet: Make use of the new sg_map helper function
  libiscsi: Make use of new the sg_map helper function
  target: Make use of the new sg_map function at 16 call sites
  drm/i915: Make use of the new sg_map helper function
  crypto: hifn_795x: Make use of the new sg_map helper function
  crypto: shash, caam: Make use of the new sg_map helper function
  crypto: chcr: Make use of the new sg_map helper function
  dm-crypt: Make use of the new sg_map helper in 4 call sites
  staging: unisys: visorbus: Make use of the new sg_map helper function
  RDS: Make use of the new sg_map helper function
  scsi: ipr, pmcraid, isci: Make use of the new sg_map helper in 4 call
sites
  scsi: hisi_sas, mvsas, gdth: Make use of the new sg_map helper
function
  scsi: arcmsr, ips, megaraid: Make use of the new sg_map helper
function
  scsi: libfc, csiostor: Change to sg_copy_buffer in two drivers
  xen-blkfront: Make use of the new sg_map helper function
  mmc: sdhci: Make use of the new sg_map helper function
  mmc: spi: Make use of the new sg_map helper function
  mmc: tmio: Make use of the new sg_map helper function
  mmc: sdricoh_cs: Make use of the new sg_map helper function
  mmc: tifm_sd: Make use of the new sg_map helper function
  memstick: Make use of the new sg_map helper function

 crypto/shash.c  |   9 +-
 drivers/block/xen-blkfront.c|  33 +--
 drivers/crypto/caam/caamalg.c   |   8 +-
 drivers/crypto/chelsio/chcr_algo.c  |  28 +++---
 drivers/crypto/hifn_795x.c  |  32 ---
 drivers/dma-buf/dma-buf.c   |   3 +
 drivers/gpu/drm/i915/i915_gem.c |  27 +++---
 drivers/md/dm-crypt.c   |  38 +---
 drivers/memstick/host/jmb38x_ms.c   |  23 -
 drivers/memstick/host/tifm_ms.c |  22 -
 drivers/mmc/host/mmc_spi.c  |  26 +++--
 drivers/mmc/host/sdhci.c|  35 ++-
 drivers/mmc/host/sdricoh_cs.c   |  14 ++-
 drivers/mmc/host/tifm_sd.c  |  88 +
 drivers/mmc/host/tmio_mmc.h |  12 ++-
 drivers/mmc/host/tmio_mmc_dma.c |   5 +
 drivers/mmc/host/tmio_mmc_pio.c |  24 +
 drivers/nvme/target/fabrics-cmd.c   |  16 +++-
 drivers/scsi/arcmsr/arcmsr_hba.c|  16 +++-
 drivers/scsi/csiostor/csio_scsi.c   |  54 +--
 drivers/scsi/cxgbi/libcxgbi.c   |   5 +
 drivers/scsi/gdth.c |   9 +-
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c  |  14 ++-
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c  |  13 ++-
 drivers/scsi/ipr.c  |  27 +++---
 drivers/scsi/ips.c  |   8 +-
 drivers/scsi/isci/request.c |  42 
 drivers/scsi/libfc/fc_libfc.c   |  49 ++
 drivers/scsi/libiscsi_tcp.c |  32 ---
 drivers/scsi/megaraid.c |   9 +-
 drivers/scsi/mvsas/mv_sas.c |  10 +-
 drivers/scsi/pmcraid.c  |  19 ++--
 drivers/staging/unisys/visorhba/visorhba_main.c |  12 ++-
 drivers/target/iscsi/iscsi_target.c |  27 --
 drivers/target/target_core_rd.c |   3 +-
 drivers/target/target_core_sbc.c| 122

[PATCH 08/22] crypto: chcr: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
The get_page in this area looks *highly* suspect due to there being no
corresponding put_page. However, I've left that as is to avoid breaking
things.

I've also removed the KMAP_ATOMIC_ARGS check as it appears to be dead
code that dates back to when it was first committed...

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/crypto/chelsio/chcr_algo.c | 28 +++-
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index 41bc7f4..a993d1d 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -1489,22 +1489,21 @@ static struct sk_buff *create_authenc_wr(struct 
aead_request *req,
return ERR_PTR(-EINVAL);
 }
 
-static void aes_gcm_empty_pld_pad(struct scatterlist *sg,
- unsigned short offset)
+static int aes_gcm_empty_pld_pad(struct scatterlist *sg,
+unsigned short offset)
 {
-   struct page *spage;
unsigned char *addr;
 
-   spage = sg_page(sg);
-   get_page(spage); /* so that it is not freed by NIC */
-#ifdef KMAP_ATOMIC_ARGS
-   addr = kmap_atomic(spage, KM_SOFTIRQ0);
-#else
-   addr = kmap_atomic(spage);
-#endif
-   memset(addr + sg->offset, 0, offset + 1);
+   get_page(sg_page(sg)); /* so that it is not freed by NIC */
+
+   addr = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(addr))
+   return PTR_ERR(addr);
+
+   memset(addr, 0, offset + 1);
+   sg_unmap(sg, addr, SG_KMAP_ATOMIC);
 
-   kunmap_atomic(addr);
+   return 0;
 }
 
 static int set_msg_len(u8 *block, unsigned int msglen, int csize)
@@ -1940,7 +1939,10 @@ static struct sk_buff *create_gcm_wr(struct aead_request 
*req,
if (req->cryptlen) {
write_sg_to_skb(skb, , src, req->cryptlen);
} else {
-   aes_gcm_empty_pld_pad(req->dst, authsize - 1);
+   err = aes_gcm_empty_pld_pad(req->dst, authsize - 1);
+   if (err)
+   goto dstmap_fail;
+
write_sg_to_skb(skb, , reqctx->dst, crypt_len);
 
}
-- 
2.1.4



[PATCH 01/22] scatterlist: Introduce sg_map helper functions

2017-04-13 Thread Logan Gunthorpe
This patch introduces functions which kmap the pages inside an sgl. Two
variants are provided: one if an offset is required and one if the
offset is zero. These functions replace a common pattern of
kmap(sg_page(sg)) that is used in about 50 places within the kernel.

The motivation for this work is to eventually safely support sgls that
contain io memory. In order for that to work, any access to the contents
of an iomem SGL will need to be done with iomemcpy or hit some warning.
(The exact details of how this will work have yet to be worked out.)
Having all the kmaps in one place is just a first step in that
direction. Additionally, since this helps cut down the users of sg_page,
it should make any effort to go to struct-page-less DMAs a little
easier (should that idea ever swing back into favour again).

A flags option is added to select between a regular or atomic mapping so
these functions can replace kmap(sg_page(...)) or kmap_atomic(sg_page(...)).
Future work may expand this to have flags for using page_address or
vmap. Much further in the future, there may be a flag to allocate memory
and copy the data from/to iomem.

We also add the semantic that sg_map can fail to create a mapping,
despite the fact that the current code this is replacing is assumed to
never fail and the current version of these functions cannot fail. This
is to support iomem, which will either have to fail to create the mapping or
allocate memory as a bounce buffer, which itself can fail.

Also, in terms of cleanup, a few of the existing kmap(sg_page) users
play things a bit loose in terms of whether they apply sg->offset
so using these helper functions should help avoid such issues.
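
As a minimal usage sketch (copy_from_sg is just an illustrative caller,
not part of the patch; len is assumed not to exceed the entry's length):

static int copy_from_sg(struct scatterlist *sg, void *dst, size_t len)
{
	void *src;

	src = sg_map(sg, SG_KMAP);	/* use SG_KMAP_ATOMIC in atomic context */
	if (IS_ERR(src))
		return PTR_ERR(src);	/* e.g. unmappable memory in the sgl */

	memcpy(dst, src, len);
	sg_unmap(sg, src, SG_KMAP);

	return 0;
}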

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/dma-buf/dma-buf.c   |  3 ++
 include/linux/scatterlist.h | 97 +
 2 files changed, 100 insertions(+)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 0007b79..b95934b 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -37,6 +37,9 @@
 
 #include 
 
+/* Prevent the highmem.h macro from aliasing ops->kunmap_atomic */
+#undef kunmap_atomic
+
 static inline int is_dma_buf_file(struct file *);
 
 struct dma_buf_list {
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index cb3c8fe..acd4d73 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include <linux/highmem.h>
 #include 
 
 struct scatterlist {
@@ -126,6 +127,102 @@ static inline struct page *sg_page(struct scatterlist *sg)
return (struct page *)((sg)->page_link & ~0x3);
 }
 
+#define SG_KMAP (1 << 0)   /* create a mapping with kmap */
+#define SG_KMAP_ATOMIC (1 << 1)   /* create a mapping with kmap_atomic */
+
+/**
+ * sg_map_offset - kmap a page inside an sgl
+ * @sg:SG entry
+ * @offset:Offset into entry
+ * @flags: Flags for creating the mapping
+ *
+ * Description:
+ *   Use this function to map a page in the scatterlist at the specified
+ *   offset. sg->offset is already added for you. Note: the semantics of
+ *   this function are that it may fail. Thus, its output should be checked
+ *   with IS_ERR and PTR_ERR. Otherwise, a pointer to the specified offset
+ *   in the mapped page is returned.
+ *
+ *   Flags can be any of:
+ * * SG_KMAP - Use kmap to create the mapping
+ * * SG_KMAP_ATOMIC - Use kmap_atomic to map the page atomically.
+ *Thus, the rules of that function apply: the cpu
+ *may not sleep until it is unmapped.
+ *
+ *   Also, consider carefully whether this function is appropriate. It is
+ *   largely not recommended for new code and if the sgl came from another
+ *   subsystem and you don't know what kind of memory might be in the list
+ *   then you definitely should not call it. Non-mappable memory may be in
+ *   the sgl and thus this function may fail unexpectedly.
+ **/
+static inline void *sg_map_offset(struct scatterlist *sg, size_t offset,
+  int flags)
+{
+   struct page *pg;
+   unsigned int pg_off;
+
+   offset += sg->offset;
+   pg = nth_page(sg_page(sg), offset >> PAGE_SHIFT);
+   pg_off = offset_in_page(offset);
+
+   if (flags & SG_KMAP_ATOMIC)
+   return kmap_atomic(pg) + pg_off;
+   else
+   return kmap(pg) + pg_off;
+}
+
+/**
+ * sg_unmap_offset - unmap a page that was mapped with sg_map_offset
+ * @sg:SG entry
+ * @addr:  address returned by sg_map_offset
+ * @offset:Offset into entry (same as specified for sg_map_offset)
+ * @flags: Flags, which are the same specified for sg_map_offset
+ *
+ * Description:
+ *   Unmap the page that was mapped with sg_map_offset
+ *
+ **/
+static inline void sg_unmap_offset(struct scatterlist *sg, void *addr,
+

[PATCH 17/22] mmc: sdhci: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Straightforward conversion, except that, due to the lack of an error
path, we have to WARN if the memory in the SGL is not mappable.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/mmc/host/sdhci.c | 35 ++-
 1 file changed, 30 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 63bc33a..af0c107 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -497,15 +497,34 @@ static int sdhci_pre_dma_transfer(struct sdhci_host *host,
return sg_count;
 }
 
+/*
+ * Note this function may return PTR_ERR and must be checked.
+ */
 static char *sdhci_kmap_atomic(struct scatterlist *sg, unsigned long *flags)
 {
+   void *ret;
+
local_irq_save(*flags);
-   return kmap_atomic(sg_page(sg)) + sg->offset;
+
+   ret = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(ret)) {
+   /*
+* This should really never happen unless the code is changed
+* to use memory that is not mappable in the sg. Seeing there
+* doesn't seem to be any error path out of here, we can only
+* WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   local_irq_restore(*flags);
+   }
+
+   return ret;
 }
 
-static void sdhci_kunmap_atomic(void *buffer, unsigned long *flags)
+static void sdhci_kunmap_atomic(struct scatterlist *sg, void *buffer,
+   unsigned long *flags)
 {
-   kunmap_atomic(buffer);
+   sg_unmap(sg, buffer, SG_KMAP_ATOMIC);
local_irq_restore(*flags);
 }
 
@@ -568,8 +587,11 @@ static void sdhci_adma_table_pre(struct sdhci_host *host,
if (offset) {
if (data->flags & MMC_DATA_WRITE) {
buffer = sdhci_kmap_atomic(sg, &flags);
+   if (IS_ERR(buffer))
+   return;
+
memcpy(align, buffer, offset);
-   sdhci_kunmap_atomic(buffer, &flags);
+   sdhci_kunmap_atomic(sg, buffer, &flags);
}
 
/* tran, valid */
@@ -646,8 +668,11 @@ static void sdhci_adma_table_post(struct sdhci_host *host,
   (sg_dma_address(sg) & 
SDHCI_ADMA2_MASK);
 
buffer = sdhci_kmap_atomic(sg, &flags);
+   if (IS_ERR(buffer))
+   return;
+
memcpy(buffer, align, size);
-   sdhci_kunmap_atomic(buffer, &flags);
+   sdhci_kunmap_atomic(sg, buffer, &flags);
 
align += SDHCI_ADMA2_ALIGN;
}
-- 
2.1.4



[PATCH 20/22] mmc: sdricoh_cs: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
This is a straightforward conversion to the new function.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/mmc/host/sdricoh_cs.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdricoh_cs.c b/drivers/mmc/host/sdricoh_cs.c
index 5ff26ab..7eeed23 100644
--- a/drivers/mmc/host/sdricoh_cs.c
+++ b/drivers/mmc/host/sdricoh_cs.c
@@ -319,16 +319,20 @@ static void sdricoh_request(struct mmc_host *mmc, struct 
mmc_request *mrq)
for (i = 0; i < data->blocks; i++) {
size_t len = data->blksz;
u8 *buf;
-   struct page *page;
int result;
-   page = sg_page(data->sg);
 
-   buf = kmap(page) + data->sg->offset + (len * i);
+   buf = sg_map_offset(data->sg, (len * i), SG_KMAP);
+   if (IS_ERR(buf)) {
+   cmd->error = PTR_ERR(buf);
+   break;
+   }
+
result =
sdricoh_blockio(host,
data->flags & MMC_DATA_READ, buf, len);
-   kunmap(page);
-   flush_dcache_page(page);
+   sg_unmap_offset(data->sg, buf, (len * i), SG_KMAP);
+
+   flush_dcache_page(sg_page(data->sg));
if (result) {
dev_err(dev, "sdricoh_request: cmd %i "
"block transfer failed\n", cmd->opcode);
-- 
2.1.4



[PATCH 21/22] mmc: tifm_sd: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
This conversion is a bit complicated. We modify the read_fifo,
write_fifo and copy_page functions to take a scatterlist instead of a
page. Thus we can use sg_map instead of kmap_atomic. There's a bit of
accounting that needed to be done for the offset for this to work.
(sg_map takes care of the offset, but it's already added and
used earlier in the code; see the sketch below.)
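
In other words, because 'off' already has sg->offset folded in and
sg_map_offset() adds sg->offset again itself, the fifo helpers subtract
it back out, roughly:

	/* 'off' already includes sg->offset */
	buf = sg_map_offset(sg, off - sg->offset, SG_KMAP_ATOMIC);
	/* ... read or write the fifo through buf ... */
	sg_unmap_offset(sg, buf, off - sg->offset, SG_KMAP_ATOMIC);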

There's also no error path, so if unmappable memory finds its way into
the sgl we can only WARN.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/mmc/host/tifm_sd.c | 88 +++---
 1 file changed, 67 insertions(+), 21 deletions(-)

diff --git a/drivers/mmc/host/tifm_sd.c b/drivers/mmc/host/tifm_sd.c
index 93c4b40..75b0d74 100644
--- a/drivers/mmc/host/tifm_sd.c
+++ b/drivers/mmc/host/tifm_sd.c
@@ -111,14 +111,26 @@ struct tifm_sd {
 };
 
 /* for some reason, host won't respond correctly to readw/writew */
-static void tifm_sd_read_fifo(struct tifm_sd *host, struct page *pg,
+static void tifm_sd_read_fifo(struct tifm_sd *host, struct scatterlist *sg,
  unsigned int off, unsigned int cnt)
 {
struct tifm_dev *sock = host->dev;
unsigned char *buf;
unsigned int pos = 0, val;
 
-   buf = kmap_atomic(pg) + off;
+   buf = sg_map_offset(sg, off - sg->offset, SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return;
+   }
+
if (host->cmd_flags & DATA_CARRY) {
buf[pos++] = host->bounce_buf_data[0];
host->cmd_flags &= ~DATA_CARRY;
@@ -134,17 +146,29 @@ static void tifm_sd_read_fifo(struct tifm_sd *host, 
struct page *pg,
}
buf[pos++] = (val >> 8) & 0xff;
}
-   kunmap_atomic(buf - off);
+   sg_unmap_offset(sg, buf, off - sg->offset, SG_KMAP_ATOMIC);
 }
 
-static void tifm_sd_write_fifo(struct tifm_sd *host, struct page *pg,
+static void tifm_sd_write_fifo(struct tifm_sd *host, struct scatterlist *sg,
   unsigned int off, unsigned int cnt)
 {
struct tifm_dev *sock = host->dev;
unsigned char *buf;
unsigned int pos = 0, val;
 
-   buf = kmap_atomic(pg) + off;
+   buf = sg_map_offset(sg, off - sg->offset, SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return;
+   }
+
if (host->cmd_flags & DATA_CARRY) {
val = host->bounce_buf_data[0] | ((buf[pos++] << 8) & 0xff00);
writel(val, sock->addr + SOCK_MMCSD_DATA);
@@ -161,7 +185,7 @@ static void tifm_sd_write_fifo(struct tifm_sd *host, struct 
page *pg,
val |= (buf[pos++] << 8) & 0xff00;
writel(val, sock->addr + SOCK_MMCSD_DATA);
}
-   kunmap_atomic(buf - off);
+   sg_unmap_offset(sg, buf, off - sg->offset, SG_KMAP_ATOMIC);
 }
 
 static void tifm_sd_transfer_data(struct tifm_sd *host)
@@ -170,7 +194,6 @@ static void tifm_sd_transfer_data(struct tifm_sd *host)
struct scatterlist *sg = r_data->sg;
unsigned int off, cnt, t_size = TIFM_MMCSD_FIFO_SIZE * 2;
unsigned int p_off, p_cnt;
-   struct page *pg;
 
if (host->sg_pos == host->sg_len)
return;
@@ -192,33 +215,57 @@ static void tifm_sd_transfer_data(struct tifm_sd *host)
}
off = sg[host->sg_pos].offset + host->block_pos;
 
-   pg = nth_page(sg_page(&sg[host->sg_pos]), off >> PAGE_SHIFT);
p_off = offset_in_page(off);
p_cnt = PAGE_SIZE - p_off;
p_cnt = min(p_cnt, cnt);
p_cnt = min(p_cnt, t_size);
 
if (r_data->flags & MMC_DATA_READ)
-   tifm_sd_read_fifo(host, pg, p_off, p_cnt);
+   tifm_sd_read_fifo(host, &sg[host->sg_pos], p_off,
+ p_cnt);
else if (r_data->flags & MMC_DATA_WRITE)
-   tifm_sd_write_fifo(host, pg, p_off, p_cnt);
+   tifm_sd_write_fifo(host, &sg[host->sg_pos], p_off,
+ 

[PATCH 09/22] dm-crypt: Make use of the new sg_map helper in 4 call sites

2017-04-13 Thread Logan Gunthorpe
Very straightforward conversion to the new function in all four spots.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/md/dm-crypt.c | 38 +-
 1 file changed, 25 insertions(+), 13 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 389a363..6bd0ffc 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -589,9 +589,12 @@ static int crypt_iv_lmk_gen(struct crypt_config *cc, u8 
*iv,
int r = 0;
 
if (bio_data_dir(dmreq->ctx->bio_in) == WRITE) {
-   src = kmap_atomic(sg_page(&dmreq->sg_in));
-   r = crypt_iv_lmk_one(cc, iv, dmreq, src + dmreq->sg_in.offset);
-   kunmap_atomic(src);
+   src = sg_map(&dmreq->sg_in, SG_KMAP_ATOMIC);
+   if (IS_ERR(src))
+   return PTR_ERR(src);
+
+   r = crypt_iv_lmk_one(cc, iv, dmreq, src);
+   sg_unmap(&dmreq->sg_in, src, SG_KMAP_ATOMIC);
} else
memset(iv, 0, cc->iv_size);
 
@@ -607,14 +610,17 @@ static int crypt_iv_lmk_post(struct crypt_config *cc, u8 
*iv,
if (bio_data_dir(dmreq->ctx->bio_in) == WRITE)
return 0;
 
-   dst = kmap_atomic(sg_page(&dmreq->sg_out));
-   r = crypt_iv_lmk_one(cc, iv, dmreq, dst + dmreq->sg_out.offset);
+   dst = sg_map(&dmreq->sg_out, SG_KMAP_ATOMIC);
+   if (IS_ERR(dst))
+   return PTR_ERR(dst);
+
+   r = crypt_iv_lmk_one(cc, iv, dmreq, dst);
 
/* Tweak the first block of plaintext sector */
if (!r)
-   crypto_xor(dst + dmreq->sg_out.offset, iv, cc->iv_size);
+   crypto_xor(dst, iv, cc->iv_size);
 
-   kunmap_atomic(dst);
+   sg_unmap(&dmreq->sg_out, dst, SG_KMAP_ATOMIC);
return r;
 }
 
@@ -731,9 +737,12 @@ static int crypt_iv_tcw_gen(struct crypt_config *cc, u8 
*iv,
 
/* Remove whitening from ciphertext */
if (bio_data_dir(dmreq->ctx->bio_in) != WRITE) {
-   src = kmap_atomic(sg_page(&dmreq->sg_in));
-   r = crypt_iv_tcw_whitening(cc, dmreq, src + 
dmreq->sg_in.offset);
-   kunmap_atomic(src);
+   src = sg_map(&dmreq->sg_in, SG_KMAP_ATOMIC);
+   if (IS_ERR(src))
+   return PTR_ERR(src);
+
+   r = crypt_iv_tcw_whitening(cc, dmreq, src);
+   sg_unmap(&dmreq->sg_in, src, SG_KMAP_ATOMIC);
}
 
/* Calculate IV */
@@ -755,9 +764,12 @@ static int crypt_iv_tcw_post(struct crypt_config *cc, u8 
*iv,
return 0;
 
/* Apply whitening on ciphertext */
-   dst = kmap_atomic(sg_page(&dmreq->sg_out));
-   r = crypt_iv_tcw_whitening(cc, dmreq, dst + dmreq->sg_out.offset);
-   kunmap_atomic(dst);
+   dst = sg_map(&dmreq->sg_out, SG_KMAP_ATOMIC);
+   if (IS_ERR(dst))
+   return PTR_ERR(dst);
+
+   r = crypt_iv_tcw_whitening(cc, dmreq, dst);
+   sg_unmap(&dmreq->sg_out, dst, SG_KMAP_ATOMIC);
 
return r;
 }
-- 
2.1.4



[PATCH 02/22] nvmet: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
This is a straightforward conversion in two places. Should kmap fail,
the code will return an INVALID_DATA error in the completion.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/nvme/target/fabrics-cmd.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/nvme/target/fabrics-cmd.c 
b/drivers/nvme/target/fabrics-cmd.c
index 8bd022af..f62a634 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -122,7 +122,11 @@ static void nvmet_execute_admin_connect(struct nvmet_req 
*req)
struct nvmet_ctrl *ctrl = NULL;
u16 status = 0;
 
-   d = kmap(sg_page(req->sg)) + req->sg->offset;
+   d = sg_map(req->sg, SG_KMAP);
+   if (IS_ERR(d)) {
+   status = NVME_SC_SGL_INVALID_DATA;
+   goto out;
+   }
 
/* zero out initial completion result, assign values as needed */
req->rsp->result.u32 = 0;
@@ -158,7 +162,7 @@ static void nvmet_execute_admin_connect(struct nvmet_req 
*req)
req->rsp->result.u16 = cpu_to_le16(ctrl->cntlid);
 
 out:
-   kunmap(sg_page(req->sg));
+   sg_unmap(req->sg, d, SG_KMAP);
nvmet_req_complete(req, status);
 }
 
@@ -170,7 +174,11 @@ static void nvmet_execute_io_connect(struct nvmet_req *req)
u16 qid = le16_to_cpu(c->qid);
u16 status = 0;
 
-   d = kmap(sg_page(req->sg)) + req->sg->offset;
+   d = sg_map(req->sg, SG_KMAP);
+   if (IS_ERR(d)) {
+   status = NVME_SC_SGL_INVALID_DATA;
+   goto out;
+   }
 
/* zero out initial completion result, assign values as needed */
req->rsp->result.u32 = 0;
@@ -205,7 +213,7 @@ static void nvmet_execute_io_connect(struct nvmet_req *req)
pr_info("adding queue %d to ctrl %d.\n", qid, ctrl->cntlid);
 
 out:
-   kunmap(sg_page(req->sg));
+   sg_unmap(req->sg, d, SG_KMAP);
nvmet_req_complete(req, status);
return;
 
-- 
2.1.4



[PATCH 13/22] scsi: hisi_sas, mvsas, gdth: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Very straightforward conversion of three scsi drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/scsi/gdth.c|  9 +++--
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c | 14 +-
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c | 13 +
 drivers/scsi/mvsas/mv_sas.c| 10 +-
 4 files changed, 30 insertions(+), 16 deletions(-)

diff --git a/drivers/scsi/gdth.c b/drivers/scsi/gdth.c
index d020a13..82c9fba 100644
--- a/drivers/scsi/gdth.c
+++ b/drivers/scsi/gdth.c
@@ -2301,10 +2301,15 @@ static void gdth_copy_internal_data(gdth_ha_str *ha, 
Scsi_Cmnd *scp,
 return;
 }
 local_irq_save(flags);
-address = kmap_atomic(sg_page(sl)) + sl->offset;
+address = sg_map(sl, SG_KMAP_ATOMIC);
+if (IS_ERR(address)) {
+scp->result = DID_ERROR << 16;
+return;
+   }
+
 memcpy(address, buffer, cpnow);
 flush_dcache_page(sg_page(sl));
-kunmap_atomic(address);
+sg_unmap(sl, address, SG_KMAP_ATOMIC);
 local_irq_restore(flags);
 if (cpsum == cpcount)
 break;
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
index 854fbea..30408f8 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v1_hw.c
@@ -1377,18 +1377,22 @@ static int slot_complete_v1_hw(struct hisi_hba 
*hisi_hba,
void *to;
struct scatterlist *sg_resp = &task->smp_task.smp_resp;
 
-   ts->stat = SAM_STAT_GOOD;
-   to = kmap_atomic(sg_page(sg_resp));
+   to = sg_map(sg_resp, SG_KMAP_ATOMIC);
+   if (IS_ERR(to)) {
+   dev_err(dev, "slot complete: error mapping memory");
+   ts->stat = SAS_SG_ERR;
+   break;
+   }
 
+   ts->stat = SAM_STAT_GOOD;
dma_unmap_sg(dev, &task->smp_task.smp_resp, 1,
 DMA_FROM_DEVICE);
dma_unmap_sg(dev, &task->smp_task.smp_req, 1,
 DMA_TO_DEVICE);
-   memcpy(to + sg_resp->offset,
-  slot->status_buffer +
+   memcpy(to, slot->status_buffer +
   sizeof(struct hisi_sas_err_record),
   sg_dma_len(sg_resp));
-   kunmap_atomic(to);
+   sg_unmap(sg_resp, to, SG_KMAP_ATOMIC);
break;
}
case SAS_PROTOCOL_SATA:
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c 
b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
index 1b21445..0907947 100644
--- a/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
+++ b/drivers/scsi/hisi_sas/hisi_sas_v2_hw.c
@@ -1796,18 +1796,23 @@ slot_complete_v2_hw(struct hisi_hba *hisi_hba, struct 
hisi_sas_slot *slot,
struct scatterlist *sg_resp = &task->smp_task.smp_resp;
void *to;
 
+   to = sg_map(sg_resp, SG_KMAP_ATOMIC);
+   if (IS_ERR(to)) {
+   dev_err(dev, "slot complete: error mapping memory");
+   ts->stat = SAS_SG_ERR;
+   break;
+   }
+
ts->stat = SAM_STAT_GOOD;
-   to = kmap_atomic(sg_page(sg_resp));
 
dma_unmap_sg(dev, &task->smp_task.smp_resp, 1,
 DMA_FROM_DEVICE);
dma_unmap_sg(dev, &task->smp_task.smp_req, 1,
 DMA_TO_DEVICE);
-   memcpy(to + sg_resp->offset,
-  slot->status_buffer +
+   memcpy(to, slot->status_buffer +
   sizeof(struct hisi_sas_err_record),
   sg_dma_len(sg_resp));
-   kunmap_atomic(to);
+   sg_unmap(sg_resp, to, SG_KMAP_ATOMIC);
break;
}
case SAS_PROTOCOL_SATA:
diff --git a/drivers/scsi/mvsas/mv_sas.c b/drivers/scsi/mvsas/mv_sas.c
index c7cc803..374d0e0 100644
--- a/drivers/scsi/mvsas/mv_sas.c
+++ b/drivers/scsi/mvsas/mv_sas.c
@@ -1798,11 +1798,11 @@ int mvs_slot_complete(struct mvs_info *mvi, u32 
rx_desc, u32 flags)
case SAS_PROTOCOL_SMP: {
struct scatterlist *sg_resp = &task->smp_task.smp_resp;
tstat->stat = SAM_STAT_GOOD;
-   to = kmap_atomic(sg_page(sg_resp));
-   memcpy(to + sg_resp->offset,
-   slot->response + sizeof(struct mvs_err_info),
-   sg_dma_len(sg_resp));
-   kunmap_atomic(to);
+   to = sg_map(sg_resp, SG_KMAP_ATOMIC);
+   memcpy(to,
+  slot->response + sizeof(struct mvs_err_info),
+

[PATCH 14/22] scsi: arcmsr, ips, megaraid: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Very straightforward conversion of three scsi drivers.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 drivers/scsi/arcmsr/arcmsr_hba.c | 16 
 drivers/scsi/ips.c   |  8 
 drivers/scsi/megaraid.c  |  9 +++--
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/drivers/scsi/arcmsr/arcmsr_hba.c b/drivers/scsi/arcmsr/arcmsr_hba.c
index af032c4..3cd485c 100644
--- a/drivers/scsi/arcmsr/arcmsr_hba.c
+++ b/drivers/scsi/arcmsr/arcmsr_hba.c
@@ -2306,7 +2306,10 @@ static int arcmsr_iop_message_xfer(struct 
AdapterControlBlock *acb,
 
use_sg = scsi_sg_count(cmd);
sg = scsi_sglist(cmd);
-   buffer = kmap_atomic(sg_page(sg)) + sg->offset;
+   buffer = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(buffer))
+   return ARCMSR_MESSAGE_FAIL;
+
if (use_sg > 1) {
retvalue = ARCMSR_MESSAGE_FAIL;
goto message_out;
@@ -2539,7 +2542,7 @@ static int arcmsr_iop_message_xfer(struct 
AdapterControlBlock *acb,
 message_out:
if (use_sg) {
struct scatterlist *sg = scsi_sglist(cmd);
-   kunmap_atomic(buffer - sg->offset);
+   sg_unmap(sg, buffer, SG_KMAP_ATOMIC);
}
return retvalue;
 }
@@ -2590,11 +2593,16 @@ static void arcmsr_handle_virtual_command(struct 
AdapterControlBlock *acb,
strncpy([32], "R001", 4); /* Product Revision */
 
sg = scsi_sglist(cmd);
-   buffer = kmap_atomic(sg_page(sg)) + sg->offset;
+   buffer = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(buffer)) {
+   cmd->result = (DID_ERROR << 16);
+   cmd->scsi_done(cmd);
+   return;
+   }
 
memcpy(buffer, inqdata, sizeof(inqdata));
sg = scsi_sglist(cmd);
-   kunmap_atomic(buffer - sg->offset);
+   sg_unmap(sg, buffer, SG_KMAP_ATOMIC);
 
cmd->scsi_done(cmd);
}
diff --git a/drivers/scsi/ips.c b/drivers/scsi/ips.c
index 3419e1b..a44291d 100644
--- a/drivers/scsi/ips.c
+++ b/drivers/scsi/ips.c
@@ -1506,14 +1506,14 @@ static int ips_is_passthru(struct scsi_cmnd *SC)
 /* kmap_atomic() ensures addressability of the user buffer.*/
 /* local_irq_save() protects the KM_IRQ0 address slot. */
 local_irq_save(flags);
-buffer = kmap_atomic(sg_page(sg)) + sg->offset;
-if (buffer && buffer[0] == 'C' && buffer[1] == 'O' &&
+buffer = sg_map(sg, SG_KMAP_ATOMIC);
+if (!IS_ERR(buffer) && buffer[0] == 'C' && buffer[1] == 'O' &&
 buffer[2] == 'P' && buffer[3] == 'P') {
-kunmap_atomic(buffer - sg->offset);
+sg_unmap(sg, buffer, SG_KMAP_ATOMIC);
 local_irq_restore(flags);
 return 1;
 }
-kunmap_atomic(buffer - sg->offset);
+sg_unmap(sg, buffer, SG_KMAP_ATOMIC);
 local_irq_restore(flags);
}
return 0;
diff --git a/drivers/scsi/megaraid.c b/drivers/scsi/megaraid.c
index 3c63c29..0b66e50 100644
--- a/drivers/scsi/megaraid.c
+++ b/drivers/scsi/megaraid.c
@@ -663,10 +663,15 @@ mega_build_cmd(adapter_t *adapter, Scsi_Cmnd *cmd, int 
*busy)
struct scatterlist *sg;
 
sg = scsi_sglist(cmd);
-   buf = kmap_atomic(sg_page(sg)) + sg->offset;
+   buf = sg_map(sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(buf)) {
+cmd->result = (DID_ERROR << 16);
+   cmd->scsi_done(cmd);
+   return NULL;
+   }
 
memset(buf, 0, cmd->cmnd[4]);
-   kunmap_atomic(buf - sg->offset);
+   sg_unmap(sg, buf, SG_KMAP_ATOMIC);
 
cmd->result = (DID_OK << 16);
cmd->scsi_done(cmd);
-- 
2.1.4



[PATCH 11/22] RDS: Make use of the new sg_map helper function

2017-04-13 Thread Logan Gunthorpe
Straightforward conversion except there's no error path, so we WARN if
the sg_map fails.

Signed-off-by: Logan Gunthorpe <log...@deltatee.com>
---
 net/rds/ib_recv.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index e10624a..7f8fa99 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -801,9 +801,20 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
to_copy = min(RDS_FRAG_SIZE - frag_off, PAGE_SIZE - map_off);
BUG_ON(to_copy & 7); /* Must be 64bit aligned. */
 
-   addr = kmap_atomic(sg_page(&frag->f_sg));
+   addr = sg_map(&frag->f_sg, SG_KMAP_ATOMIC);
+   if (IS_ERR(addr)) {
+   /*
+* This should really never happen unless
+* the code is changed to use memory that is
+* not mappable in the sg. Seeing there doesn't
+* seem to be any error path out of here,
+* we can only WARN.
+*/
+   WARN(1, "Non-mappable memory used in sg!");
+   return;
+   }
 
-   src = addr + frag->f_sg.offset + frag_off;
+   src = addr + frag_off;
dst = (void *)map->m_page_addrs[map_page] + map_off;
for (k = 0; k < to_copy; k += 8) {
/* Record ports that became uncongested, ie
@@ -811,7 +822,7 @@ static void rds_ib_cong_recv(struct rds_connection *conn,
uncongested |= ~(*src) & *dst;
*dst++ = *src++;
}
-   kunmap_atomic(addr);
+   sg_unmap(&frag->f_sg, addr, SG_KMAP_ATOMIC);
 
copied += to_copy;
 
-- 
2.1.4



Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory

2017-04-13 Thread Logan Gunthorpe


On 12/04/17 03:55 PM, Benjamin Herrenschmidt wrote:
> Look at pcibios_resource_to_bus() and pcibios_bus_to_resource(). They
> will perform the conversion between the struct resource content (CPU
> physical address) and the actual PCI bus side address.

Ah, thanks for the tip! On my system, this translation returns the same
address so it was not necessary. And, yes, that means this would have to
find its way into the dma mapping routine somehow. This means we'll
eventually need a way to look up the p2pmem device from the struct page,
which means we will likely need a new flag bit in the struct page or
something. The big difficulty I see is testing. Do you know on what
architectures or in what circumstances these translations are used?
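
For what it's worth, I'd expect the translation itself to end up looking
something like this in the mapping path (a rough sketch; 'pdev', 'bar'
and 'phys_addr' are stand-ins for whatever the p2pmem lookup eventually
provides):

	struct resource *res = &pdev->resource[bar];
	struct pci_bus_region region;
	pci_bus_addr_t bus_addr;

	/* convert the BAR's CPU physical window into a PCI bus address window */
	pcibios_resource_to_bus(pdev->bus, &region, res);

	/* bus address corresponding to a CPU physical address inside that BAR */
	bus_addr = region.start + (phys_addr - res->start);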

> When behind the same switch you need to use PCI addresses. If one tries
> later to do P2P between host bridges (via the CPU fabric) things get
> more complex and one will have to use either CPU addresses or something
> else alltogether (probably would have to teach the arch DMA mapping
> routines to work with those struct pages you create and return the
> right thing).

Probably for starters we'd want to explicitly deny cases between host
bridges and add that later if someone wants to do the testing.

Thanks,

Logan

