[PATCH] IB/srp: Avoid using uninitialized variable

2015-06-25 Thread Sagi Grimberg
We might return res which is not initialized. Also reduce code duplication by exporting srp_parse_tmo so srp_tmo_set can reuse it. Detected by Coverity. Signed-off-by: Sagi Grimberg Signed-off-by: Jenny Falkovich --- drivers/infiniband/ulp/srp/ib_srp.c | 11 --- drivers/scsi

Re: [PATCH RFC 2/2] RDMA/isert: Support iWARP transport

2015-06-25 Thread Sagi Grimberg
On 6/25/2015 6:39 PM, Steve Wise wrote: Memory regions that are the target of an iWARP RDMA READ RESPONSE need REMOTE_WRITE access rights. So enable REMOTE_WRITE for iWARP devices. iWARP RDMA READ target sge depth is 1. So save the max_read_sge in the target device structure and use that when

Re: [PATCH RFC 1/2] RDMA/iser: limit sg tablesize on device fastreg max depth

2015-06-25 Thread Sagi Grimberg
O size and there I take into account the device capabilities (minimum between device capability and user preference). I was supposed to submit it once the indirect registration support lands in but given that will take some time, I'll go ahead and send it out as well. I have no problem rebasing

Re: [PATCH RFC 0/2] iSER support for iWARP

2015-06-25 Thread Sagi Grimberg
On 6/25/2015 6:39 PM, Steve Wise wrote: The following series implements support for iWARP transports in the iSER initiator and target. This is based on Doug's k.o/for-4.2 branch. Hi Steve, Thanks for this set, Can you please rebase for target-pending/master? or at least submit on top of: is

Re: [PATCH RFC 2/2] RDMA/isert: Support iWARP transport

2015-06-27 Thread Sagi Grimberg
Resending - somehow this didn't make it to the lists... On 6/25/2015 7:51 PM, Sagi Grimberg wrote: On 6/25/2015 6:39 PM, Steve Wise wrote: Memory regions that are the target of an iWARP RDMA READ RESPONSE need REMOTE_WRITE access rights. So enable REMOTE_WRITE for iWARP devices. iWARP

Re: [PATCH RFC 2/2] RDMA/isert: Support iWARP transport

2015-06-27 Thread Sagi Grimberg
On 6/25/2015 8:06 PM, Steve Wise wrote: -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Sagi Grimberg Sent: Thursday, June 25, 2015 11:51 AM To: Steve Wise; linux-rdma@vger.kernel.org Cc: Or Gerlitz; Roi Dayan; target

Re: [PATCH RFC 2/2] RDMA/isert: Support iWARP transport

2015-06-27 Thread Sagi Grimberg
On 6/25/2015 10:29 PM, Jason Gunthorpe wrote: On Thu, Jun 25, 2015 at 02:25:49PM -0500, Steve Wise wrote: To stage the changes we could introduce a new function that returns the needed ib_access_flags value given the desired opcodes. Then have a series that changes all the existing ULPs to mak

[PATCH] mlx4, mlx5, mthca: Expose max_sge_rd correctly

2015-06-29 Thread Sagi Grimberg
Applications must not assume that max_sge and max_sge_rd are the same. Hence expose max_sge_rd correctly as well. Reported-by: Steve Wise Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/mlx4/main.c| 1 + drivers/infiniband/hw/mlx5/main.c| 1 + drivers/infiniband

Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Sagi Grimberg
On 6/30/2015 12:36 AM, Steve Wise wrote: The semantics for MR access flags are not consistent across RDMA protocols. So rather than have applications try and glean what they need, have them pass in the intended roles and attributes for the MR to be allocated and let the RDMA core select the appr

Re: [PATCH V2 4/5] RDMA/iser: support iWARP devices

2015-06-30 Thread Sagi Grimberg
On 6/30/2015 12:36 AM, Steve Wise wrote: Limit the sg tablesize based on the device fast reg depth. Use rdma_get_dma_mr() to allocate the DMA MR. Use rdma_fast_reg_access_flags() to set the access_flags for fast register work requests. Steve, I wonder if it would make more sense to get the i

Re: [PATCH V2 0/5] iSER support for iWARP

2015-06-30 Thread Sagi Grimberg
On 6/30/2015 12:36 AM, Steve Wise wrote: The following series implements support for iWARP transports in the iSER initiator and target. This is based on Doug's k.o/for-4.2 branch. I've tested this on cxgb4 and mlx4 hardware. Changes since V1: Introduce and use transport-independent RDMA core

Re: [PATCH 1/5] IB/core: Introduce Fast Indirect Memory Registration verbs API

2015-06-30 Thread Sagi Grimberg
On 6/8/2015 11:49 PM, Hefty, Sean wrote: Sean, IMO, we need to introduce vendor specific header files and interfaces. It is unmaintainable to drive an API from the bottom up and expose the 'bare metal' implementation of a bunch of disjoint pieces of hardware. (Yeah, because we need yet anot

Re: [PATCH 1/5] IB/core: Introduce Fast Indirect Memory Registration verbs API

2015-06-30 Thread Sagi Grimberg
On 6/30/2015 3:10 PM, Christoph Hellwig wrote: On Tue, Jun 30, 2015 at 02:47:00PM +0300, Sagi Grimberg wrote: Kernel 4.1 introduced the new pmem driver for byte addressable storage (https://lwn.net/Articles/640115/). It won't be long before we see HA models where secondary persistent m

Re: [PATCH V3 3/4] RDMA/iser: limit sg tablesize to device fastreg max depth

2015-07-01 Thread Sagi Grimberg
if (iscsi_host_add(shost, ib_conn->device->ib_device->dma_device)) { mutex_unlock(&iser_conn->state_mutex); You forgot to add my Reviewed-by on this. So again (for patchworks), Reviewed-by: Sagi Grimberg -- To unsubscr

Re: [PATCH V3 4/4] RDMA/isert: Support iWARP transport

2015-07-01 Thread Sagi Grimberg
int i; + + for (i = rdma_start_port(device->ib_device); +i <= rdma_end_port(device->ib_device); i++) + if (rdma_protocol_iwarp(device->ib_device, i)) + return 1; + return 0; +} + Lets get rid of that as soon as possible...
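The quoted fragment above scans every port of a device and reports whether any of them runs iWARP. A minimal standalone sketch of that loop follows. The `sketch_*` types and names are hypothetical stand-ins invented for illustration; they are not the kernel's `struct ib_device`, `rdma_start_port()`, `rdma_end_port()` or `rdma_protocol_iwarp()`.

```c
#include <stdbool.h>

/* Hypothetical per-port protocol tag (illustration only). */
enum sketch_protocol { SKETCH_PROTO_IB, SKETCH_PROTO_IWARP };

/* Hypothetical device description: ports are numbered from start_port
 * to end_port inclusive, and port_proto[] is indexed by
 * (port - start_port). */
struct sketch_device {
    unsigned int start_port;
    unsigned int end_port;
    const enum sketch_protocol *port_proto;
};

/* Return true if at least one port of the device uses iWARP. */
static bool sketch_any_port_is_iwarp(const struct sketch_device *dev)
{
    unsigned int i;

    for (i = dev->start_port; i <= dev->end_port; i++)
        if (dev->port_proto[i - dev->start_port] == SKETCH_PROTO_IWARP)
            return true;
    return false;
}
```

A device-wide answer like this is coarse, since a multi-port device could mix protocols.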

Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-01 Thread Sagi Grimberg
On 6/30/2015 8:10 PM, Hefty, Sean wrote: I suggest to start consolidating to ib_create_mr() that receives an extensible ib_mr_init_attr and additional attributes can be mr_roles and mr_attrs. I think this makes sense, but does it really help? If the end result is that the app and providers basi

Re: [PATCH V3 4/4] RDMA/isert: Support iWARP transport

2015-07-01 Thread Sagi Grimberg
On 7/2/2015 12:03 AM, Or Gerlitz wrote: On Wed, Jul 1, 2015 at 11:53 PM, Steve Wise wrote: From: Or Gerlitz [mailto:gerlitz...@gmail.com] Yes, the MR is a local MR, but it is used for REMOTE access for iWARP, but not IB. I think the reason is that in iWARP there is no distinction between

Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-02 Thread Sagi Grimberg
On 7/2/2015 4:17 PM, Steve Wise wrote: On 7/2/2015 1:22 AM, Sagi Grimberg wrote: On 6/30/2015 8:10 PM, Hefty, Sean wrote: I suggest to start consolidating to ib_create_mr() that receives an extensible ib_mr_init_attr and additional attributes can be mr_roles and mr_attrs. I think this makes

Re: [PATCH V4 3/5] RDMA/iser: Limit sg tablesize and max_sectors to device fastreg max depth

2015-07-05 Thread Sagi Grimberg
sg tablesize. Signed-off-by: Steve Wise Reviewed-by: Sagi Grimberg --- drivers/infiniband/ulp/iser/iscsi_iser.c |9 + 1 files changed, 9 insertions(+), 0 deletions(-) diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c index 6a594aa

Re: [PATCH V3 4/4] RDMA/isert: Support iWARP transport

2015-07-05 Thread Sagi Grimberg
On 7/2/2015 7:39 PM, Jason Gunthorpe wrote: On Thu, Jul 02, 2015 at 09:28:46AM +0300, Sagi Grimberg wrote: Or has a good point. The DMA mkey in target mode is discrete and not sent to any peer. That doesn't mean the peer cannot guess it. Using the right permission is clearly a str

Re: [PATCH V5 4/5] RDMA/isert: Set REMOTE_WRITE on DMA MRs to support iWARP devices

2015-07-06 Thread Sagi Grimberg
On 7/5/2015 8:45 PM, Steve Wise wrote: iWARP devices require REMOTE_WRITE for MRs used as the destination of an RDMA READ. So if the device protocol is iWARP, then set REMOTE_WRITE when allocating the DMA MR. Signed-off-by: Steve Wise Reviewed-by: Sagi Grimberg
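The rule described in this patch can be sketched as a small flag-selection helper. This is only an illustration under stated assumptions: the `SKETCH_ACCESS_*` values and the helper name are hypothetical stand-ins, and real kernel code would use the `IB_ACCESS_*` flags from `<rdma/ib_verbs.h>` and whatever protocol check the patch itself uses.

```c
#include <stdbool.h>

/* Stand-in flag bits for this sketch only (assumption, not the
 * kernel's enum ib_access_flags). */
enum sketch_access_flags {
    SKETCH_ACCESS_LOCAL_WRITE  = 1 << 0,
    SKETCH_ACCESS_REMOTE_WRITE = 1 << 1,
};

/* Hypothetical helper: pick access flags for a DMA MR that will be
 * the destination of an RDMA READ. Local write is always needed;
 * iWARP additionally needs remote write on that MR. */
static int sketch_dma_mr_access(bool device_is_iwarp)
{
    int flags = SKETCH_ACCESS_LOCAL_WRITE;

    if (device_is_iwarp)
        flags |= SKETCH_ACCESS_REMOTE_WRITE;
    return flags;
}
```

Keeping the extra permission conditional means non-iWARP devices do not get a remotely writable DMA MR they have no use for.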

Re: [PATCH V5 3/5] RDMA/iser: Limit sg tablesize and max_sectors to device fastreg max depth

2015-07-06 Thread Sagi Grimberg
On 7/5/2015 8:44 PM, Steve Wise wrote: Currently the sg tablesize, which dictates fast register page list depth to use, does not take into account the limits of the rdma device. So adjust it once we discover the device fastreg max depth limit. Also adjust the max_sectors based on the resulting s
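The adjustment described here amounts to clamping the requested sg tablesize to the device limit and then deriving a sector limit from the result. The sketch below assumes 512-byte sectors and one page per sg entry; the struct, function and parameter names are hypothetical, and the exact formula in the actual patch may differ.

```c
/* 512-byte sectors: bytes >> 9 gives sectors (assumption of this sketch). */
#define SKETCH_SECTOR_SHIFT 9

/* Hypothetical result pair for illustration. */
struct sketch_limits {
    unsigned int sg_tablesize;
    unsigned int max_sectors;
};

/* Clamp the requested sg tablesize to the device fast registration
 * depth, then size max_sectors so that one I/O never needs more
 * pages than the clamped tablesize can describe. */
static struct sketch_limits
sketch_clamp_limits(unsigned int requested_sg_tablesize,
                    unsigned int device_max_fastreg_depth,
                    unsigned int page_size)
{
    struct sketch_limits l;

    l.sg_tablesize = requested_sg_tablesize < device_max_fastreg_depth ?
                     requested_sg_tablesize : device_max_fastreg_depth;
    /* Widen before multiplying to avoid unsigned int overflow. */
    l.max_sectors = (unsigned int)(((unsigned long long)l.sg_tablesize *
                                    page_size) >> SKETCH_SECTOR_SHIFT);
    return l;
}
```

For example, a request for 512 entries on a device with a depth of 128 and 4096-byte pages would be clamped to 128 entries and 1024 sectors.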

Re: [PATCH V5 5/5] RDMA/isert: Limit read depth based on the device max_sge_rd capability

2015-07-06 Thread Sagi Grimberg
On 7/5/2015 8:45 PM, Steve Wise wrote: Use the device's max_sge_rd capability to compute the target's read sge depth. Save both the read and write max_sge values in the isert_conn struct, and use these when creating RDMA_WRITE/READ work requests. Signed-off-by: Steve Wise Reviewe

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-06 Thread Sagi Grimberg
On 7/6/2015 2:22 AM, Steve Wise wrote: The semantics for MR access flags are not consistent across RDMA protocols. So rather than have applications try and glean what they need, have them pass in the intended roles and attributes for the MR to be allocated and let the RDMA core select the approp

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-06 Thread Sagi Grimberg
On 7/6/2015 2:22 AM, Steve Wise wrote: The semantics for MR access flags are not consistent across RDMA protocols. So rather than have applications try and glean what they need, have them pass in the intended roles and attributes for the MR to be allocated and let the RDMA core select the approp

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-06 Thread Sagi Grimberg
On 7/6/2015 5:37 PM, Steve Wise wrote: -Original Message- From: Sagi Grimberg [mailto:sa...@dev.mellanox.co.il] Sent: Monday, July 06, 2015 2:54 AM To: Steve Wise; dledf...@redhat.com Cc: sa...@mellanox.com; ogerl...@mellanox.com; r...@mellanox.com; linux-rdma@vger.kernel.org; e

Re: [PATCH V5 3/5] RDMA/iser: Limit sg tablesize and max_sectors to device fastreg max depth

2015-07-06 Thread Sagi Grimberg
On 7/6/2015 5:35 PM, Steve Wise wrote: -Original Message- From: Sagi Grimberg [mailto:sa...@dev.mellanox.co.il] Sent: Monday, July 06, 2015 2:51 AM To: Steve Wise; dledf...@redhat.com Cc: infinip...@intel.com; sa...@mellanox.com; ogerl...@mellanox.com; r...@mellanox.com; linux-rdma

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-07 Thread Sagi Grimberg
On 7/7/2015 12:00 PM, Christoph Hellwig wrote: On Mon, Jul 06, 2015 at 07:17:38PM +0300, Sagi Grimberg wrote: Ok. I'll remove all uses of ib_get_dma_mr()... I meant that rdma_get_dma_mr can go away. I'd prefer to get the needed access_flags and just call existing verb. I strongl

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-07 Thread Sagi Grimberg
On 7/7/2015 7:17 PM, Jason Gunthorpe wrote: On Tue, Jul 07, 2015 at 09:05:15AM -0500, Steve Wise wrote: I took the feedback from Christoph and Jason to mean I should remove ib_get_dma_mr() entirely and pull its guts into rdma_get_dma_mr(), and change all the users of ib_get_dma_mr() to use rdma

Re: [PATCH V5 3/5] RDMA/iser: Limit sg tablesize and max_sectors to device fastreg max depth

2015-07-07 Thread Sagi Grimberg
On 7/7/2015 6:41 PM, Steve Wise wrote: -Original Message- From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Or Gerlitz Sent: Tuesday, July 07, 2015 9:32 AM To: Steve Wise; 'Sagi Grimberg' Cc: dledf...@redhat.com; infinip...@in

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Sagi Grimberg
On 7/8/2015 12:36 AM, Jason Gunthorpe wrote: On Tue, Jul 07, 2015 at 07:27:47PM +0300, Sagi Grimberg wrote: Doesn't it look odd to you? Sure, but the oddness is that rdma_device_access_flags exists at all, not the wrapper. The wrapper is what we want the API to look like, I

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Sagi Grimberg
On 7/8/2015 11:13 AM, 'Christoph Hellwig' wrote: On Wed, Jul 08, 2015 at 10:29:56AM +0300, Sagi Grimberg wrote: I don't necessarily agree. The API we'd want is a single API at all the call sites to all types of MRs. We have different QP types, and still we don't have a

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread Sagi Grimberg
On 7/8/2015 1:20 PM, 'Christoph Hellwig' wrote: On Wed, Jul 08, 2015 at 01:05:28PM +0300, Sagi Grimberg wrote: If we agree to consolidate on a single MR allocation API, I don't see how this wrapper is moving us forward. But if you guys prefer to have it then I don't h

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-09 Thread Sagi Grimberg
On 7/8/2015 8:14 PM, Hefty, Sean wrote: I am still not clear if all of us agree that we need it. Sean and Steve had some disclaimers... A single entry point doesn't help a whole lot if the app must deal with different behavior based on how the API is used. It is true that different MRs will

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-09 Thread Sagi Grimberg
On 7/8/2015 11:32 PM, 'Christoph Hellwig' wrote: On Wed, Jul 08, 2015 at 01:08:42PM -0600, Jason Gunthorpe wrote: Then, what is left is all remote MRs and maybe it will be clearer what to do about them then... From looking at that for a while the APIs needed seem pretty simple to me from a co

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-09 Thread Sagi Grimberg
On 7/9/2015 3:03 AM, Jason Gunthorpe wrote: On Wed, Jul 08, 2015 at 01:32:05PM -0700, 'Christoph Hellwig' wrote: On Wed, Jul 08, 2015 at 01:08:42PM -0600, Jason Gunthorpe wrote: Then, what is left is all remote MRs and maybe it will be clearer what to do about them then... From looking at th

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-09 Thread Sagi Grimberg
On 7/9/2015 2:36 AM, Jason Gunthorpe wrote: I'm arguing upper layer protocols should never even see local memory registration, that it is totally irrelevant to them. So yes, you can call that a common approach to memory registration if you like.. Basically it appears there is nothing that NFS

kernel memory registration (was: RDMA/core: Transport-independent access flags)

2015-07-10 Thread Sagi Grimberg
On 7/9/2015 8:01 PM, Jason Gunthorpe wrote: On Thu, Jul 09, 2015 at 02:02:03PM +0300, Sagi Grimberg wrote: We have protocol that involves remote memory keys transfer in their standards so I don't see how we can remove it altogether from ULPs. This is why I've been talking about

Kernel fast memory registration API proposal [RFC]

2015-07-10 Thread Sagi Grimberg
Hi, Given the last discussions on our in-kernel memory registration API I thought I'd propose another approach to address this. As I said before, I think the stack needs to consolidate on a single memory registration scheme. That scheme is the standard FRWR. As you know, MRs have a consumers re

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-12 Thread Sagi Grimberg
On 7/10/2015 10:34 PM, Christoph Hellwig wrote: On Thu, Jul 09, 2015 at 09:52:59AM -0400, Chuck Lever wrote: There is one remaining kernel user of ib_reg_phys_mr() in 4.2: Lustre. It's in the staging tree, which proper in-tree code doesn't have to cater for. So as soon as sunrpc is done using

Re: Kernel fast memory registration API proposal [RFC]

2015-07-12 Thread Sagi Grimberg
On 7/11/2015 1:39 PM, Christoph Hellwig wrote: On Fri, Jul 10, 2015 at 12:09:37PM +0300, Sagi Grimberg wrote: And then provide helpers to populate the MR with generic kernel structures such as struct scatterlist (for scsi and other ULPs), struct page (for NFS) or struct bio_vec (for block ULPs

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-12 Thread Sagi Grimberg
On 7/11/2015 7:37 PM, Steve Wise wrote: On Jul 10, 2015, at 9:11 AM, Jason Gunthorpe wrote: On Fri, Jul 10, 2015 at 09:22:24AM -0400, Tom Talpey wrote: and it is enabled only when the RDMA Read is active. ??? How is that done? ib_get_dma_mr is defined to return a remote usable rkey that i

Re: [PATCH v1 03/12] xprtrdma: Increase default credit limit

2015-07-12 Thread Sagi Grimberg
advertised credit limit. Signed-off-by: Chuck Lever Looks good, Reviewed-By: Sagi Grimberg

Re: [PATCH v1 04/12] xprtrdma: Remove last ib_reg_phys_mr() call site

2015-07-12 Thread Sagi Grimberg
ys_mr() code path in rpcrdma_register_internal(), so it can be removed. The remaining logic in rpcrdma_{de}register_internal() is folded into rpcrdma_{alloc,free}_regbuf(). Signed-off-by: Chuck Lever Like, Reviewed-By: Sagi Grimberg

Re: [PATCH v1 02/12] xprtrdma: Raise maximum payload size to one megabyte

2015-07-12 Thread Sagi Grimberg
On 7/9/2015 11:41 PM, Chuck Lever wrote: The point of larger rsize and wsize is to reduce the per-byte cost of memory registration and deregistration. Modern HCAs can typically handle a megabyte or more with a single registration operation. Reviewed-By: Sagi Grimberg

Re: [PATCH v1 05/12] xprtrdma: Account for RPC/RDMA header size when deciding to inline

2015-07-12 Thread Sagi Grimberg
On 7/9/2015 11:42 PM, Chuck Lever wrote: When marshaling RPC/RDMA requests, ensure the combined size of RPC/RDMA header and RPC header do not exceed the inline threshold. Endpoints typically reject RPC/RDMA messages that exceed the size of their receive buffers. Did this solve a bug? because is

Re: [PATCH v1 06/12] xprtrdma: Always provide a write list when sending NFS READ

2015-07-12 Thread Sagi Grimberg
erred via RDMA. Using the write list, the data payload is moved by the device and no extra data copying is necessary. Signed-off-by: Chuck Lever Reviewed-By: Sagi Grimberg

Re: [PATCH v1 07/12] xprtrdma: Don't provide a reply chunk when expecting a short reply

2015-07-12 Thread Sagi Grimberg
On 7/9/2015 11:42 PM, Chuck Lever wrote: Currently Linux always offers a reply chunk, even for small replies (unless a read or write list is needed for the RPC operation). A comment in rpcrdma_marshal_req() reads: Currently we try to not actually use read inline. Reply chunks have the desirabl

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread Sagi Grimberg
On 7/13/2015 7:50 PM, Jason Gunthorpe wrote: On Sun, Jul 12, 2015 at 10:49:08AM +0300, Sagi Grimberg wrote: On 7/10/2015 10:34 PM, Christoph Hellwig wrote: On Thu, Jul 09, 2015 at 09:52:59AM -0400, Chuck Lever wrote: There is one remaining kernel user of ib_reg_phys_mr() in 4.2: Lustre

Re: kernel memory registration

2015-07-14 Thread Sagi Grimberg
Having a few schemes available in the core code that the driver can choose from seems like a much more sensible option. I think that makes sense, but several of the schemes we are working with are effectively single-vendor schemes. Indirect MR and DIX are good examples of things that only one

Re: Kernel fast memory registration API proposal [RFC]

2015-07-14 Thread Sagi Grimberg
On 7/13/2015 7:30 PM, Jason Gunthorpe wrote: On Fri, Jul 10, 2015 at 12:09:37PM +0300, Sagi Grimberg wrote: Given the last discussions on our in-kernel memory registration API I thought I'd propose another approach to address this. I assume you can put your new indirect registrations

Re: Kernel fast memory registration API proposal [RFC]

2015-07-14 Thread Sagi Grimberg
On 7/13/2015 5:16 PM, Chuck Lever wrote: NFS really should be using something more similar to a scatterlist, as it maps pretty well to the sk_frags in the network layer as well. Struct scatterlist is important because it's the way the DMA mapping functions take a multi-page argument, so anyone w

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread Sagi Grimberg
On 7/14/2015 10:25 AM, 'Christoph Hellwig' wrote: On Mon, Jul 13, 2015 at 10:57:48AM -0600, Jason Gunthorpe wrote: Currently various drivers are using ib_get_dma_mr with remote flags unfortunately, e.g. the SRP initiator driver uses it to optimize away memory registrations for single SGL entry re

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread Sagi Grimberg
On 7/13/2015 11:15 PM, Jason Gunthorpe wrote: On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote: On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote: I think what we need to support for now is FRMR as the primary target, and FMR as a secondar[y]. FMR is a *very* bad choice, for several r

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread Sagi Grimberg
On 7/14/2015 10:37 AM, 'Christoph Hellwig' wrote: On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote: On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote: I think what we need to support for now is FRMR as the primary target, and FMR as a secondar[y]. FMR is a *very* bad choice, for sever

Re: [PATCH v1 07/12] xprtrdma: Don't provide a reply chunk when expecting a short reply

2015-07-14 Thread Sagi Grimberg
On 7/12/2015 9:38 PM, Chuck Lever wrote: Hi Sagi- On Jul 12, 2015, at 10:58 AM, Sagi Grimberg wrote: On 7/9/2015 11:42 PM, Chuck Lever wrote: Currently Linux always offers a reply chunk, even for small replies (unless a read or write list is needed for the RPC operation). A comment in

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread Sagi Grimberg
On 7/14/2015 3:24 PM, Tom Talpey wrote: On 7/14/2015 4:06 AM, Sagi Grimberg wrote: All protocols care about transferring data and sending messages, so it's not a good enough reason for a poor registration method choice. This just emphasizes why we need to converge to a single method.

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread Sagi Grimberg
On 7/14/2015 3:12 PM, Tom Talpey wrote: On 7/14/2015 5:22 AM, Sagi Grimberg wrote: On 7/14/2015 10:37 AM, 'Christoph Hellwig' wrote: On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote: On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote: I think what we need to suppor

Re: Kernel fast memory registration API proposal [RFC]

2015-07-14 Thread Sagi Grimberg
On 7/14/2015 6:33 PM, Christoph Hellwig wrote: On Tue, Jul 14, 2015 at 11:39:24AM +0300, Sagi Grimberg wrote: This is exactly what I don't want to do. I don't think that implicit posting is a good idea for reasons that I mentioned earlier: "This is where I have a problem. Provid

Re: Kernel fast memory registration API proposal [RFC]

2015-07-14 Thread Sagi Grimberg
I'm really disappointed by the negative emails on this subject.. Jason, I'm really not trying to be negative. I'm hearing you out, and I agree with a lot of what you have to say. I just don't agree with all of it. You are right, ULPs do the same thing, the same wrong thing of maintaining a fal

Re: Kernel fast memory registration API proposal [RFC]

2015-07-14 Thread Sagi Grimberg
On 7/14/2015 7:35 PM, Jason Gunthorpe wrote: On Tue, Jul 14, 2015 at 07:12:01PM +0300, Sagi Grimberg wrote: The ULP doesn't care if it needs to reserve the slot, and it generally doesn't care about the notification either unless it needs to handle an error. That's generally c

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-15 Thread Sagi Grimberg
On 7/14/2015 8:26 PM, Jason Gunthorpe wrote: On Tue, Jul 14, 2015 at 12:05:53PM +0300, Sagi Grimberg wrote: iser has it too. I have a similar patch with a flag for iser (its behind a bulk of patches that are still pending though). Do we all agree and understand that stuff like this in

Re: Kernel fast memory registration API proposal [RFC]

2015-07-15 Thread Sagi Grimberg
On 7/14/2015 8:09 PM, Jason Gunthorpe wrote: On Tue, Jul 14, 2015 at 07:55:39PM +0300, Sagi Grimberg wrote: But, if people think that it's better to have an API that does implicit posting always without notification, and then silently consume error or flush completions. I can try and lo

Re: Kernel fast memory registration API proposal [RFC]

2015-07-15 Thread Sagi Grimberg
On 7/15/2015 10:32 AM, Christoph Hellwig wrote: Hi Sagi, I went over your proposal based on reviewing the ongoing MR threads and my implementation of a similar in-driver abstraction, so here are some proposed updates. struct provider_mr { u64 *page_list; // or what ever the

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-15 Thread Sagi Grimberg
On 7/14/2015 11:29 PM, Jason Gunthorpe wrote: On Tue, Jul 14, 2015 at 12:55:11PM -0700, 'Christoph Hellwig' wrote: On Tue, Jul 14, 2015 at 02:32:31PM -0500, Steve Wise wrote: You mean "should not", yea? Ok. I'll check for iWARP. But don't tell me to remove the transport-specific hacks in th

Re: Kernel fast memory registration API proposal [RFC]

2015-07-15 Thread Sagi Grimberg
On 7/15/2015 6:05 AM, Doug Ledford wrote: On 07/14/2015 01:08 PM, Jason Gunthorpe wrote: On Tue, Jul 14, 2015 at 07:46:50PM +0300, Sagi Grimberg wrote: Which drivers don't support FRWR that we need to do other things? ipath - deprecated We have permission to move this to staging and

Re: [Ksummit-discuss] [TECH TOPIC] IRQ affinity

2015-07-15 Thread Sagi Grimberg
On 7/15/2015 8:25 PM, Jens Axboe wrote: On 07/15/2015 11:19 AM, Keith Busch wrote: On Wed, 15 Jul 2015, Bart Van Assche wrote: * With blk-mq and scsi-mq optimal performance can only be achieved if the relationship between MSI-X vector and NUMA node does not change over time. This is necessary

Re: Kernel fast memory registration API proposal [RFC]

2015-07-15 Thread Sagi Grimberg
On 7/15/2015 5:32 PM, Chuck Lever wrote: On Jul 15, 2015, at 4:01 AM, Sagi Grimberg wrote: On 7/14/2015 8:09 PM, Jason Gunthorpe wrote: On Tue, Jul 14, 2015 at 07:55:39PM +0300, Sagi Grimberg wrote: But, if people think that it's better to have an API that does implicit posting a

Re: Kernel fast memory registration API proposal [RFC]

2015-07-16 Thread Sagi Grimberg
On 7/16/2015 11:07 AM, Christoph Hellwig wrote: On Thu, Jul 16, 2015 at 09:52:44AM +0300, Sagi Grimberg wrote: I suggest to start with what I proposed. And in a later stage (if we still think its needed) we can have a higher level API that hides the post, something like: rdma_reg_sg(struct

Re: Kernel fast memory registration API proposal [RFC]

2015-07-16 Thread Sagi Grimberg
On 7/15/2015 8:07 PM, Jason Gunthorpe wrote: On Wed, Jul 15, 2015 at 12:32:33AM -0700, Christoph Hellwig wrote: int rdma_create_mr(struct ib_pd *pd, enum rdma_mr_type mr, u32 max_pages, int flags); * array from a SG list * @mr: memory region * @sg: sg lis

Re: Kernel fast memory registration API proposal [RFC]

2015-07-16 Thread Sagi Grimberg
I can drop it, unless anyone can think of a use-case where a ULP would want to register a region with a different offset from sg[0]->offset and/or ends before the sum(sg->length). What if the sg list has to be chunked up due to the device's FRWR pbl depth limits? Or is that handled underneat

Re: Kernel fast memory registration API proposal [RFC]

2015-07-18 Thread Sagi Grimberg
On 7/16/2015 9:08 PM, Jason Gunthorpe wrote: On Thu, Jul 16, 2015 at 03:21:04PM +0300, Sagi Grimberg wrote: I gotta say, these suggestions of bool/write or supported_ops with a convert helper seem (to me at least) to make things more complicated. Why not just set the access_flags as they

Re: Kernel fast memory registration API proposal [RFC]

2015-07-18 Thread Sagi Grimberg
/** * ib_mr_set_sg() - populate memory region buffers * array from a SG list * @mr: memory region * @sg: sg list * @sg_nents:number of elements in the sg * * Can fail if the HW is not able to register this * sg list. In case of failure - caller i

Re: RFC: Immediate data support for SRP

2015-07-19 Thread Sagi Grimberg
On 7/16/2015 6:25 PM, Bart Van Assche wrote: Hello, Hi Bart, I agree it would definitely help as the lack of immediate data emphasizes the additional latency of doing rdma reads. As you probably know for write requests "immediate data" means sending the data in the same packet as the write

Re: RFC: Immediate data support for SRP

2015-07-20 Thread Sagi Grimberg
On 7/20/2015 12:43 AM, Or Gerlitz wrote: On Sun, Jul 19, 2015 at 7:07 PM, Sagi Grimberg wrote: On 7/16/2015 6:25 PM, Bart Van Assche wrote: I agree it would definitely help as the lack of immediate data emphasizes the additional latency of doing rdma reads. Sagi, do we have black box

Re: Kernel fast memory registration API proposal [RFC]

2015-07-20 Thread Sagi Grimberg
I'm thinking now that this should have an input argument of block_size. Maybe in the future ULPs would want to register huge pages, it will be a shame to map it into PAGE_SIZE chunks... Why wouldn't it just transparently support huge pages? sg seems to have enough information. I'm not sure I k

Re: Kernel fast memory registration API proposal [RFC]

2015-07-20 Thread Sagi Grimberg
On 7/20/2015 7:23 PM, Jason Gunthorpe wrote: On Sun, Jul 19, 2015 at 08:33:24AM +0300, Sagi Grimberg wrote: I was thinking that the user won't explicitly say which key it registers and it will be decided from the registration itself. Meaning, the registration code will do: Please don

[PATCH] mlx5: Fix missing device local_dma_lkey

2015-07-20 Thread Sagi Grimberg
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY but does not set the device local_dma_lkey. This breaks rpcrdma drivers. Query and set this lkey when creating the device resources. Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/mlx5/main.c| 9

[PATCH RFC] svcrdma: Fix possible over population fast_reg_page_list

2015-07-20 Thread Sagi Grimberg
When accounting the needed_pages, we need to look into the page_list->max_page_list_len and not the global context xprt->sc_frmr_pg_list_len. Signed-off-by: Sagi Grimberg --- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git

Re: Kernel fast memory registration API proposal [RFC]

2015-07-20 Thread Sagi Grimberg
On 7/20/2015 8:00 PM, Jason Gunthorpe wrote: On Mon, Jul 20, 2015 at 07:27:52PM +0300, Sagi Grimberg wrote: I'm thinking now that this should have an input argument of block_size. Maybe in the future ULPs would want to register huge pages, it will be a shame to map it into PAGE_SIZE c

Re: [PATCH] mlx5: Fix missing device local_dma_lkey

2015-07-20 Thread Sagi Grimberg
On 7/20/2015 8:08 PM, Chuck Lever wrote: On Jul 20, 2015, at 12:54 PM, Sagi Grimberg wrote: The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY but does not set the device local_dma_lkey. This breaks rpcrdma drivers. Query and set this lkey when creating the device

Re: [PATCH RFC] svcrdma: Fix possible over population fast_reg_page_list

2015-07-20 Thread Sagi Grimberg
On 7/20/2015 8:13 PM, Chuck Lever wrote: On Jul 20, 2015, at 1:00 PM, Sagi Grimberg wrote: When accounting the needed_pages, we need to look into the page_list->max_page_list_len and not the global context xprt->sc_frmr_pg_list_len. Signed-off-by: Sagi Grimberg --- net/sunrpc/xp

Re: RFC: Immediate data support for SRP

2015-07-21 Thread Sagi Grimberg
On 7/21/2015 3:03 AM, Bart Van Assche wrote: On 07/19/2015 09:07 AM, Sagi Grimberg wrote: On 7/16/2015 6:25 PM, Bart Van Assche wrote: As you probably know for write requests "immediate data" means sending the data in the same packet as the write command instead of sending it as

Re: RFC: Immediate data support for SRP

2015-07-21 Thread Sagi Grimberg
So you have 140% better IOPS with immediate-data vs. non immediate data?! numberz? No, the improvement was to avoid memory copy from the pre-posted receive buffer (with immediate-data) to an allocated buffer. Instead the receive buffer is handed to the backend to do IO. This shows up to 40%

Re: Kernel fast memory registration API proposal [RFC]

2015-07-21 Thread Sagi Grimberg
Bleh... seems like a great effort just to find that out. Isn't it better to just ask for a page_size arg? So who computes page_size and how? Don't just punt things to a caller without really explaining how the caller is supposed to use it correctly. I'd imagine that the ULP knows when it regi

[PATCH] mlx5: Expose correct page_size_cap in device attributes

2015-07-21 Thread Sagi Grimberg
Should be all the page sizes that are supported by the device. Reported-by: Jason Gunthorpe Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/mlx5/main.c |3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw

[PATCH WIP 00/43] New fast registration API

2015-07-21 Thread Sagi Grimberg
's own function which will allow them to lose their page list duplication. I haven't done that yet. Comments and review are welcomed (and needed!). Sorry for the long series, but it's kinda transverse... The code/patches can be found in: https://github.com/sagigrimberg/linux/tree

[PATCH WIP 01/43] IB: Modify ib_create_mr API

2015-07-21 Thread Sagi Grimberg
Use ib_alloc_mr with specific parameters. Change the existing callers. Signed-off-by: Sagi Grimberg --- drivers/infiniband/core/verbs.c | 20 -- drivers/infiniband/hw/mlx5/main.c| 2 +- drivers/infiniband/hw/mlx5/mlx5_ib.h | 6 -- drivers/infiniband/hw

[PATCH WIP 07/43] qib: Support ib_alloc_mr verb

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/qib/qib_mr.c| 23 +++ drivers/infiniband/hw/qib/qib_verbs.c | 1 + drivers/infiniband/hw/qib/qib_verbs.h | 5 + 3 files changed, 29 insertions(+) diff --git a/drivers/infiniband/hw/qib/qib_mr.c b/drivers

[PATCH WIP 06/43] nes: Support ib_alloc_mr verb

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/nes/nes_verbs.c | 73 +++ 1 file changed, 73 insertions(+) diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c index fbc43e5..ac63763 100644 --- a/drivers/infiniband/hw

[PATCH WIP 03/43] ocrdma: Support ib_alloc_mr verb

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 + drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 47 + drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 4 +++ 3 files changed, 52 insertions(+) diff --git a/drivers/infiniband/hw/ocrdma

[PATCH WIP 05/43] cxgb3: Support ib_alloc_mr verb

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/cxgb3/iwch_provider.c | 53 + 1 file changed, 53 insertions(+) diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c index b1b7323..d0e9e2d 100644 --- a/drivers

[PATCH WIP 08/43] IB/iser: Convert to ib_alloc_mr

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/ulp/iser/iser_verbs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c index 6be4d4a..ecc3265 100644 --- a/drivers/infiniband/ulp/iser

[PATCH WIP 04/43] iw_cxgb4: Support ib_alloc_mr verb

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 4 +++ drivers/infiniband/hw/cxgb4/mem.c | 57 ++ drivers/infiniband/hw/cxgb4/provider.c | 1 + 3 files changed, 62 insertions(+) diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h

[PATCH WIP 02/43] IB/mlx4: Support ib_alloc_mr verb

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/mlx4/main.c| 1 + drivers/infiniband/hw/mlx4/mlx4_ib.h | 4 drivers/infiniband/hw/mlx4/mr.c | 38 3 files changed, 43 insertions(+) diff --git a/drivers/infiniband/hw/mlx4/main.c b

[PATCH WIP 15/43] ocrdma: Drop ocrdma_alloc_frmr

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/ocrdma/ocrdma_main.c | 1 - drivers/infiniband/hw/ocrdma/ocrdma_verbs.c | 41 - drivers/infiniband/hw/ocrdma/ocrdma_verbs.h | 1 - 3 files changed, 43 deletions(-) diff --git a/drivers/infiniband/hw/ocrdma

[PATCH WIP 10/43] IB/srp: Convert to ib_alloc_mr

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/ulp/srp/ib_srp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 1218738..7747587 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b

[PATCH WIP 14/43] mlx4: Drop mlx4_ib_alloc_fast_reg_mr

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/mlx4/main.c| 1 - drivers/infiniband/hw/mlx4/mlx4_ib.h | 2 -- drivers/infiniband/hw/mlx4/mr.c | 33 - 3 files changed, 36 deletions(-) diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers

[PATCH WIP 13/43] mlx5: Drop mlx5_ib_alloc_fast_reg_mr

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/hw/mlx5/main.c| 1 - drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 -- drivers/infiniband/hw/mlx5/mr.c | 44 3 files changed, 47 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers

[PATCH WIP 09/43] iser-target: Convert to ib_alloc_mr

2015-07-21 Thread Sagi Grimberg
Signed-off-by: Sagi Grimberg --- drivers/infiniband/ulp/isert/ib_isert.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index f0b7c9b..94395ce 100644 --- a/drivers/infiniband/ulp/isert
