Supported uverbs opcodes?

2015-08-19 Thread Christoph Hellwig
What opcodes are supposed to be submitted by users? Currently we do not define opcodes in the UAPI and kinda rely on userspace using the same ones as the kernel. For those defined by libibverbs (RDMA_WRITE, RDMA_WRITE_WITH_IMM, SEND, SEND_WITH_IMM, RDMA_READ, ATOMIC_CMP_AND_SWP and ATOMIC_FETCH_A

Re: [PATCH WIP 28/43] IB/core: Introduce new fast registration API

2015-08-19 Thread Christoph Hellwig
On Wed, Aug 19, 2015 at 02:56:24PM +0300, Sagi Grimberg wrote: > So I had a go with moving the DMA mapping into ib_map_mr_sg() and > it turns out mapping somewhat poorly if the ULP _may_ register memory > or just send sg_lists (like storage targets over IB/iWARP). So the ULP > will sometimes use th

Re: [RFC] split struct ib_send_wr

2015-08-17 Thread Christoph Hellwig
On Thu, Aug 13, 2015 at 09:04:39AM -0700, Christoph Hellwig wrote: > > > I'm happy to do that if you're fine with the patch in general. amso1100 > > > should be trivial anyway, while ipath is a mess, just like the new intel > > > driver with the third copy o

Re: [RFC] split struct ib_send_wr

2015-08-13 Thread Christoph Hellwig
On Thu, Aug 13, 2015 at 11:22:34AM -0600, Jason Gunthorpe wrote: > The uverbs change needs to drop/move the original kmalloc: > > next = kmalloc(ALIGN(sizeof *next, sizeof (struct ib_sge)) + > user_wr->num_sge * sizeof (struct ib_sge), >

Re: [RFC] split struct ib_send_wr

2015-08-13 Thread Christoph Hellwig
On Thu, Aug 13, 2015 at 09:07:14AM -0400, Doug Ledford wrote: > > Doug: was your mail a request to fix up the two de-staged drivers? > > I'm happy to do that if you're fine with the patch in general. amso1100 > > should be trivial anyway, while ipath is a mess, just like the new intel > > driver

Re: [RFC] split struct ib_send_wr

2015-08-13 Thread Christoph Hellwig
On Wed, Aug 12, 2015 at 08:24:49PM +0300, Sagi Grimberg wrote: > Just a nit that I've noticed, in mlx4 set_fmr_seg params are not > aligned to the parenthesis (maybe in other locations too but I haven't > noticed such...) This is just using a normal two tab indent for continued function parameters

Re: [RFC] split struct ib_send_wr

2015-08-12 Thread Christoph Hellwig
On Wed, Aug 12, 2015 at 07:24:44PM -0700, Chuck Lever wrote: > That makes sense, but you already Acked the change that breaks Lustre, > and it's going in through the NFS tree. Are you changing that to a NAK? It seems like Doug was mostly concerned about to-be-removed drivers. I definitively refuse t

Re: [PATCH for-4.3 11/15] iw_cxgb4: Support ib_alloc_mr verb

2015-08-07 Thread 'Christoph Hellwig'
On Fri, Aug 07, 2015 at 11:29:12AM -0500, Steve Wise wrote: > I misspoke. I had the order reversed. The order is such that we can add my > new NFS patch after: > > e20684a xprtrdma, svcrdma: Convert to ib_alloc_mr > > and before these: > > af78181 cxgb3: Support ib_alloc_mr verb > b7e06cd iw_c

Re: [PATCH for-4.3 11/15] iw_cxgb4: Support ib_alloc_mr verb

2015-08-07 Thread 'Christoph Hellwig'
On Fri, Aug 07, 2015 at 11:19:59AM -0500, Steve Wise wrote: > I guess I'll post two patches, the NFS fix that precedes af78181/b7e06cd, > and a reworked patch to replace e20684a. > > Is that the way to go in your opinion? To me this sounds good. We have a couple patches from Jason's series tha

Re: [PATCH for-4.3 11/15] iw_cxgb4: Support ib_alloc_mr verb

2015-08-07 Thread Christoph Hellwig
On Fri, Aug 07, 2015 at 10:06:26AM -0500, Steve Wise wrote: > If it is too much of a pain to alter this patch, then I'll just > submit the NFSRDMA fix and live with the bisect issue... Doug's tree is still to be rebased. So please submit your NFS fix now and ask Doug to merge it before Sagi's seri

Re: [RFC] split struct ib_send_wr

2015-08-07 Thread Christoph Hellwig
On Fri, Aug 07, 2015 at 10:17:18AM -0400, Chuck Lever wrote: > If bot barking doesn't bother anyone, then I'll keep the removal patch. > For some such a complaint might be grounds for rejecting the patch. If it's (a) in tree proper and (b) not one of the rare false positives I would consider it a

Re: [RFC] split struct ib_send_wr

2015-08-07 Thread Christoph Hellwig
On Thu, Aug 06, 2015 at 07:46:44PM +0300, Sagi Grimberg wrote: > I agree that this is a shame to keep in here for everyone to carry... > The only driver I've seen supporting XRC is mlx5 with no consumers. > > If people are reluctant to remove it, you can put it in ib_xrc_send_wr > or something...

Re: [RFC] split struct ib_send_wr

2015-08-06 Thread Christoph Hellwig
On Thu, Aug 06, 2015 at 01:58:45PM -0400, Chuck Lever wrote: > Wondering if this means we'll have to drop ib_reg_phys_mr() > removal until Lustre gets around to removing their call sites > from the staging tree. Why? Just because the buildbot catches it? -- To unsubscribe from this list: send th

Re: [RFC] split struct ib_send_wr

2015-08-06 Thread 'Christoph Hellwig'
On Thu, Aug 06, 2015 at 12:44:42PM -0500, Steve Wise wrote: > > Driver/staging isn't considered in tree for global API change > > perspective, so I didn't bother with all these staging drivers. > > The kbuild test bot will probably catch this. It already did catch it for my tree, which is expect

Re: [RFC] split struct ib_send_wr

2015-08-06 Thread Christoph Hellwig
On Thu, Aug 06, 2015 at 11:08:45PM +0530, Parav Pandit wrote: > Do you see value in dividing ib_ud_wr into ib_ud_wr and ib_ud_gsi_wr > to save 4 bytes? For now I just wanted to split along the lines of the existing unions. From looking at the various drivers splitting the GSI path might not be a

Re: [RFC] split struct ib_send_wr

2015-08-06 Thread Christoph Hellwig
On Thu, Aug 06, 2015 at 12:04:32PM -0500, Steve Wise wrote: > You missed amso1100 (and probably ipath) that have been moved to > drivers/staging... Driver/staging isn't considered in tree for global API change perspective, so I didn't bother with all these staging drivers.

Re: [RFC] split struct ib_send_wr

2015-08-06 Thread Christoph Hellwig
I've pushed out a new version. Updates: - the ib_recv_wr change Bart noticed has been fixed. - iser and isert have been converted - the handling of the embedded WR in the qib software queue entry has been fixed. Which means we're basically done now and the patch could use broader testing.

Re: [RFC] split struct ib_send_wr

2015-08-06 Thread Christoph Hellwig
On Wed, Aug 05, 2015 at 10:40:08PM -0600, Jason Gunthorpe wrote: > Any numbers on the struct size reduction? sizeof(struct ib_send_wr) (old): 96 sizeof(struct ib_send_wr): 48 sizeof(struct ib_rdma_wr): 64 sizeof(struct ib_atomic_wr): 96 sizeof(struct ib_ud_wr): 88 sizeof(struct ib_fast_reg_wr): 8

Re: [PATCH, RFC] rdma: split struct ib_send_wr

2015-08-04 Thread Christoph Hellwig
On Tue, Aug 04, 2015 at 08:44:26PM +0300, Sagi Grimberg wrote: > I do agree that the size on the stack is less of an issue now. What > still can matter is handling each wr one by one vs. doing a collective > post. But if structured correctly you can still do that with on-stack WRs. > I can unders

Re: [PATCH, RFC] rdma: split struct ib_send_wr

2015-08-04 Thread Christoph Hellwig
On Tue, Aug 04, 2015 at 08:06:16PM +0300, Sagi Grimberg wrote: > Question though, a ULP may want to keep a couple of WRs around instead > of having each allocated in the stack and handled one by one. We need > to provide it with a hint of what is the size it needs. Note that with the drastic shrin

Re: [RFC] split struct ib_send_wr

2015-08-04 Thread Christoph Hellwig
On Tue, Aug 04, 2015 at 09:36:49AM -0700, Bart Van Assche wrote: > >diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h > > [ ... ] > > struct ib_recv_wr { > >+struct ib_send_wr wr; > > struct ib_recv_wr *next; > > u64 wr_id; > > struct ib_

Re: [RFC] split struct ib_send_wr

2015-08-04 Thread Christoph Hellwig
On Tue, Aug 04, 2015 at 04:07:42PM +, Hefty, Sean wrote: > This looks like a reasonable start. It may help with feedback if you > could just post the changes to ib_verbs.h. Not sure it's all that useful, but here we go: diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 09

[RFC] split struct ib_send_wr

2015-08-04 Thread Christoph Hellwig
Hi all, please take a look at my RFC patch here: http://git.infradead.org/users/hch/scsi.git/commitdiff/751774250b71da83a26ba8584cff70f5e7bb7b1e the commit contains my explanation, but apparently the patch is too large for the list limit and didn't make it through.

Re: [PATCH v2 09/12] IB/srp: Do not create an all physical insecure rkey by default

2015-08-03 Thread Christoph Hellwig
In addition to the comments on the cover letter I think your changes to srp_add_one could use this incremental cleanup: diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index a546256..5e2cb53 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infi

Re: [PATCH v4 00/50] Add OPA gen1 driver

2015-08-03 Thread Christoph Hellwig
On Sat, Aug 01, 2015 at 04:34:31PM -0400, Doug Ledford wrote: > Or, I haven't looked at the soft-roce driver (ever). Is it going to > need this library as well? ROCE implements the IB protocol, so a software ROCE driver will need an IB protocol implementation sitting on top of Ethernet frames (v1)

Re: [PATCH v4 17/50] IB/hfi1: add PSM driver control/data path

2015-08-03 Thread Christoph Hellwig
On Sat, Aug 01, 2015 at 04:18:31PM -0400, Doug Ledford wrote: > If you have a legitimate technical reason to NACK this feature, make > your case. I've publicly stated, in response to Al no less, that I > don't see justification for making a team re-engineer a working, private > interface because a

Re: [PATCH v2 00/12] IB: Replace safe uses for ib_get_dma_mr with pd->local_dma_lkey

2015-08-03 Thread Christoph Hellwig
On Fri, Jul 31, 2015 at 03:20:40PM -0700, Bart Van Assche wrote: > On 07/30/2015 04:22 PM, Jason Gunthorpe wrote: > >All patches are compile tested. I've done basic testing up to and including > >the IPoIB patch, the rest required specialized setups I don't have access to, > >but are fairly straigh

Re: [PATCH v2 00/12] IB: Replace safe uses for ib_get_dma_mr with pd->local_dma_lkey

2015-07-31 Thread Christoph Hellwig
Hi Jason, this series looks fine to me, although I don't feel comfortable enough to give a Reviewed-by: tag for the RDMA subsystem yet.

Re: [PATCH v4 00/50] Add OPA gen1 driver

2015-07-31 Thread Christoph Hellwig
On Fri, Jul 31, 2015 at 02:05:06AM +0300, Or Gerlitz wrote: > So... enough is enough, please put it in a kernel module residing in > the IB core and use it in this driver, to begin with. The fact that > ipath is going to go, makes the code duplication "only" 2X vs the 3X, > but it's still 2X Agree

Re: [PATCH v4 17/50] IB/hfi1: add PSM driver control/data path

2015-07-31 Thread Christoph Hellwig
On Thu, Jul 30, 2015 at 05:42:16PM -0400, Doug Ledford wrote: > I have no problem with this code. That Al finds the user space ABI for > this driver to be bizarre is neither here nor there to me. Sure, this > file does not exhibit normal file API behavior. Who cares? Everyone who cares about fi

Re: [PATCH WIP 28/43] IB/core: Introduce new fast registration API

2015-07-30 Thread Christoph Hellwig
On Thu, Jul 30, 2015 at 10:36:31AM -0600, Jason Gunthorpe wrote: > > Also, is it interesting to support swiotlb even if we don't have > > any devices that require it (and should we expect one to ever exist)? > > swiotlb is an obvious example, and totally uninteresting to support, > but we must cor

Re: [PATCH for-4.3 00/15] Modify MR allocation API

2015-07-30 Thread Christoph Hellwig
On Thu, Jul 30, 2015 at 10:32:33AM +0300, Sagi Grimberg wrote: > This patch set is detached from my WIP for modifying our > fast registration kernel API. I incorporated some comments > from Jason and Christoph. The current set is a drop-in replacement > of ib_alloc_fast_reg_mr to ib_alloc_mr which

Re: [PATCH v1] xprtrdma: take vendor driver refcount at client

2015-07-29 Thread Christoph Hellwig
Hi Devesh, I don't understand your use of "vendor driver" here. It seems you're talking about the HCA driver.

Re: removal of ib_reg_phys_mr()

2015-07-28 Thread Christoph Hellwig
Hi Rupert, we already knew about this. But staging code is not considered to be in-tree for API change purposes so it doesn't matter.

Re: [PATCH WIP 38/43] iser-target: Port to new memory registration API

2015-07-28 Thread Christoph Hellwig
On Tue, Jul 28, 2015 at 04:06:23PM -0400, Chuck Lever wrote: > My opinion is FMR should be separate from the new API. Some have > expressed an interest in combining all kernel registration > mechanisms under a single API, but they seem too different from > each other to do that successfully. Hi Ch

Re: [PATCH v3 06/15] xprtrdma: Clean up rpcrdma_ia_open()

2015-07-26 Thread Christoph Hellwig
On Sun, Jul 26, 2015 at 02:21:23PM -0400, Chuck Lever wrote: > No, this patch is not strictly needed in 4.3, but my read of > Jason's series is that he does not touch xprtrdma. I don't > believe there will be a merge conflict. > > The goal of this patch is to move xprtrdma forward so it will > be

Re: [PATCH v3 06/15] xprtrdma: Clean up rpcrdma_ia_open()

2015-07-26 Thread Christoph Hellwig
Jason has patches that provide a local_dma_lkey in the PD that is always available. Do you need this cleanup for the next merge window? If not, it might be worth postponing it to avoid merge conflicts, especially as I assume the NFS changes will go in through Trond. On Mon, Jul 20, 2015 at 03:0

Re: [PATCH v3 04/15] xprtrdma: Don't fall back to PHYSICAL memory registration

2015-07-26 Thread Christoph Hellwig
NFS/RDMA mounts. Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH v3 01/15] xprtrdma: Make xprt_setup_rdma() agnostic to family of server address

2015-07-26 Thread Christoph Hellwig
On Mon, Jul 20, 2015 at 03:02:33PM -0400, Chuck Lever wrote: > In particular, recognize when an IPv6 connection is bound. > > Signed-off-by: Chuck Lever > Tested-by: Devesh Sharma Looks good, Reviewed-by: Christoph Hellwig

Re: [PATCH V6 6/9] isert: Rename IO functions to more descriptive names

2015-07-26 Thread Christoph Hellwig
On Sun, Jul 26, 2015 at 02:00:51PM +0300, Sagi Grimberg wrote: > On the wire iser sends a single rkey, but the target is allowed to > transfer the data however it wants to. So you're trying to get above the limit of a single RDMA READ, not above the limit for memory registration in the initiator?

Re: [PATCH V6 6/9] isert: Rename IO functions to more descriptive names

2015-07-26 Thread Christoph Hellwig
On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote: > I've given this some thought and I think we should avoid splitting > logic from PI and iWARP. The reason (other than code duplication) is > that currently the iser target support only up to 1MB IOs. I have some > code (not done yet) t

Re: [PATCH V6 4/9] svcrdma: Use max_sge_rd for destination read depths

2015-07-26 Thread Christoph Hellwig
On Sun, Jul 26, 2015 at 12:58:59PM +0300, Sagi Grimberg wrote: > >With the above patch change, we have no more users of the recently created > >rdma_cap_read_multi_sge(). Should I add a patch to remove it? > > Yes please. And in the long run this is another argument for killing the system-wide

Re: [PATCH 00/10] IB: Replace safe uses for ib_get_dma_mr with pd->local_dma_lkey

2015-07-24 Thread Christoph Hellwig
Hi Bart, On Thu, Jul 23, 2015 at 06:47:26AM -0700, Bart Van Assche wrote: > This statement might need some clarification. Are you aware that this memory > region is only used if the kernel module parameter register_always is zero ? The way I read the driver it's also used if the driver doesn't su

Re: [PATCH WIP 01/43] IB: Modify ib_create_mr API

2015-07-23 Thread Christoph Hellwig
On Thu, Jul 23, 2015 at 12:57:34AM +, Hefty, Sean wrote: > > +enum ib_mr_type { > > + IB_MR_TYPE_FAST_REG, > > + IB_MR_TYPE_SIGNATURE, > > If we're going to go through the trouble of changing everything, I vote > for dropping the word 'fast'. It's a marketing term. It's goofy. And > the

Re: [PATCH WIP 00/43] New fast registration API

2015-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2015 at 08:42:32PM +0300, Sagi Grimberg wrote: > We can do that, but I'd prefer not to pollute the API just for this > single use case. What we can do, is add a pool API that would take care > of that. But even then we might end up with different strategies as not > all ULPs can use

Re: [PATCH WIP 00/43] New fast registration API

2015-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2015 at 11:27:02AM -0600, Jason Gunthorpe wrote: > What is SRP trying to accomplish with that? > > The only reason that springs to mind is to emulate IB_MR_MAP_ARB_SG ? It's not emulating IB_MR_MAP_ARB_SG, it simply allows multiple memory registrations per I/O request. Be that to

Re: [PATCH WIP 40/43] mlx5: Allocate private context for arbitrary scatterlist registration

2015-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2015 at 11:30:48AM -0600, Jason Gunthorpe wrote: > On Wed, Jul 22, 2015 at 09:55:40AM +0300, Sagi Grimberg wrote: > > + size += max_t(int, MLX5_UMR_ALIGN - ARCH_KMALLOC_MINALIGN, 0); > > + mr->klms = kzalloc(size, GFP_KERNEL); > > + if (!mr->klms) > > + return -ENOME

Re: [PATCH WIP 38/43] iser-target: Port to new memory registration API

2015-07-23 Thread Christoph Hellwig
> If you want to micro optimize then just zero the few items that are > defined to be accessed for fastreg, no need to zero the whole > structure. In fact, you may have already done that, so just drop the > memset entirely. Oh, indeed. > If you want to optimize this path, then Sean is right, move

Re: [PATCH WIP 28/43] IB/core: Introduce new fast registration API

2015-07-23 Thread Christoph Hellwig
On Wed, Jul 22, 2015 at 11:44:01AM -0600, Jason Gunthorpe wrote: > I was hoping we'd move the DMA flush and translate into here and make > it mandatory. Is there any reason not to do that? That would be a reason for passing in a direction, but it would also up the question on what form we pass tha

Re: [PATCH WIP 00/43] New fast registration API

2015-07-22 Thread Christoph Hellwig
Thanks Sagi, this looks pretty good in general, various nitpicks notwithstanding. The one thing I'm curious about is how we can support SRP with its multiple MR support without too much boilerplate code. One option would be to pass an array of MRs to the map routines, and while most callers w

Re: [PATCH WIP 39/43] IB/core: Add arbitrary sg_list support

2015-07-22 Thread Christoph Hellwig
> + IB_DEVICE_MAP_ARB_SG= (1ULL<<32), > +enum ib_mr_flags { > + IB_MR_MAP_ARB_SG = 1, > +}; > + s/ARB_SG/SG_GAPS/? Also please try to document new flags. I know the IB code currently doesn't do it, but starting a trend there would be very useful.

Re: [PATCH WIP 38/43] iser-target: Port to new memory registration API

2015-07-22 Thread Christoph Hellwig
> @@ -2585,11 +2517,9 @@ isert_fast_reg_mr(struct isert_conn *isert_conn, > struct isert_device *device = isert_conn->device; > struct ib_device *ib_dev = device->ib_device; > struct ib_mr *mr; > struct ib_send_wr fr_wr, inv_wr; > struct ib_send_wr *bad_wr, *wr = NULL;

Re: [PATCH WIP 37/43] xprtrdma: Port to new memory registration API

2015-07-22 Thread Christoph Hellwig
On Wed, Jul 22, 2015 at 11:03:49AM -0400, Chuck Lever wrote: > I like this (and the matching ib_dma_unmap_sg). But why wouldn't > this function be called ib_dma_map_sg() ? The name ib_map_mr_sg() > had me thinking for a moment that this API actually posted the > FASTREG WR, but I see that it doesn'

Re: [PATCH WIP 28/43] IB/core: Introduce new fast registration API

2015-07-22 Thread Christoph Hellwig
> +/** > + * ib_map_mr_sg() - Populates MR with a dma mapped SG list > + * @mr:memory region > + * @sg:dma mapped scatterlist > + * @sg_nents: number of entries in sg > + * @access:access permissions I know moving the access flags here was my idea originally, b

Re: [PATCH WIP 21/43] mlx5: Allocate a private page list in ib_alloc_mr

2015-07-22 Thread Christoph Hellwig
Just curious: what's the tradeoff between allocating the page list in the core vs duplicating it in all the drivers? Does the driver variant give us any benefits?

Re: [PATCH WIP 01/43] IB: Modify ib_create_mr API

2015-07-22 Thread Christoph Hellwig
On Wed, Jul 22, 2015 at 10:34:05AM -0600, Jason Gunthorpe wrote: > > +/** > > + * ib_alloc_mr() - Allocates a memory region > > + * @pd:protection domain associated with the region > > + * @mr_type: memory region type > > + * @max_entries: maximum registration entries available

Re: [PULL REQUEST] Please pull rdma.git

2015-07-16 Thread Christoph Hellwig
On Thu, Jul 16, 2015 at 01:59:37PM +, Suri Shelvapille wrote: > Thanks Hal, we indeed do. And we would like to continue to do so, be it by > setting a flag or otherwise. I hope removing that functionality from the > kernel is not an option, as it would be a major inconvenience for us. Hi Sur

Re: [PULL REQUEST] Please pull rdma.git

2015-07-16 Thread Christoph Hellwig
Hi Doug, the point is that there is no driver that even sets the is_switch flag in-tree and I can't find a publicly available one elsewhere either. So it's plain and simple dead code. I'll send a patch to kill this untestable code off once I find a little spare time.

Re: Kernel fast memory registration API proposal [RFC]

2015-07-16 Thread Christoph Hellwig
On Thu, Jul 16, 2015 at 09:52:44AM +0300, Sagi Grimberg wrote: > >>I suggest to start with what I proposed. And in a later stage (if we > >>still think its needed) we can have a higher level API that hides the > >>post, something like: > > > >>rdma_reg_sg(struct ib_qp *qp, > >>struct ib

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-16 Thread 'Christoph Hellwig'
On Wed, Jul 15, 2015 at 01:12:57PM -0600, Jason Gunthorpe wrote: > > This looks perfect to me. After this we can get rid of the > > ib_get_dma_mr calls outside of ib_alloc_pd, and eventually move > > setting up ->local_dma_lkey into the HW driver and kill off > > ib_get_dma_mr, IB_DEVICE_LOCAL_DMA_L

Re: Kernel fast memory registration API proposal [RFC]

2015-07-16 Thread Christoph Hellwig
On Wed, Jul 15, 2015 at 12:31:29PM -0600, Jason Gunthorpe wrote: > This sounds very workable? Christoph? This is close to what I had initially envisioned, but with all the discussions here I'd rather start out with something simpler. E.g. Sagi's proposal with a few refinements. Once we have all th

Re: [PULL REQUEST] Please pull rdma.git

2015-07-16 Thread Christoph Hellwig
On Wed, Jul 15, 2015 at 05:07:46PM -0700, Linus Torvalds wrote: > Hmm. I've pulled this, but quite frankly, I don't think this was > appropriate for post-merge-window. It's not at all obvious that that > "rdma_cap_ib_switch helper" thing is a bugfix, and that goes for a few > of the other commits t

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-15 Thread 'Christoph Hellwig'
On Wed, Jul 15, 2015 at 11:47:52AM +0300, Sagi Grimberg wrote: > > struct ib_pd *ib_alloc_pd(struct ib_device *device) > > { > > struct ib_pd *pd; > >+struct ib_device_attr devattr; > >+int rc; > >+ > >+rc = ib_query_device(device, &devattr); > >+if (rc) > >+return

[TECH TOPIC] IRQ affinity

2015-07-15 Thread Christoph Hellwig
Many years ago we decided to move setting of IRQ to core affinities to userspace with the irqbalance daemon. These days we have systems with lots of MSI-X vectors, and we have hardware and subsystem support for per-CPU I/O queues in the block layer, the RDMA subsystem and probably the network stack

Re: Kernel fast memory registration API proposal [RFC]

2015-07-15 Thread Christoph Hellwig
On Wed, Jul 15, 2015 at 11:33:39AM +0300, Sagi Grimberg wrote: > Umm, I think this can become weird given all other primitives have > ib_ prefix. I'd prefer to keep that prefix to stay consistent, and have > an incremental change to do it for all the primitives (structs & verbs). Fine with me, we'

Re: Kernel fast memory registration API proposal [RFC]

2015-07-15 Thread Christoph Hellwig
Hi Sagi, I went over your proposal based on reviewing the ongoing MR threads and my implementation of a similar in-driver abstraction, so here are some proposed updates. > struct provider_mr { > u64 *page_list; // or what ever the HW uses > ... ... > struct ib_mr

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
ocal_dma_lkey to do anything with > a QP, so lets us ensure one exists for every PD created. > > If the driver can supply a global local_dma_lkey then use that, otherwise > ask the driver to create a local use all physical memory MR associated > with the new PD. > > Signed-o

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Tue, Jul 14, 2015 at 02:32:31PM -0500, Steve Wise wrote: > You mean "should not", yea? > > Ok. I'll check for iWARP. But don't tell me to remove the > transport-specific hacks in this series when I post it! ;) Just curious if there are any holes in this little scheme to deal with the lkey

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Tue, Jul 14, 2015 at 02:25:50PM -0500, Steve Wise wrote: > if (device_supports_fastreg && device_supports_signature) > use FRMR > else > use DMAMR > > Shouldn't we just recode it this way? > > if (device_supports_fastreg) > use FRMR > else > use DMAMR How does

Re: [PATCH V5 5/5] RDMA/isert: Limit read depth based on the device max_sge_rd capability

2015-07-14 Thread 'Christoph Hellwig'
On Tue, Jul 14, 2015 at 09:41:00AM -0500, Steve Wise wrote: > > Btw, any chance to make the NFS client use these values as well instead > > of the current rdma_read_max_sge() hack? > > Chuck, can you add this to your cleanup list? It would be useful to add this to your series so we can get rid of

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Tue, Jul 14, 2015 at 12:22:14PM +0300, Sagi Grimberg wrote: > It's better if you want it fast. I can't stress it enough, but IMO, the > fallback should *not* be in the API, but rather in the ULP. > Ideally, at some point it won't need to fall back, and we can remove > the API. But if all driver

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Tue, Jul 14, 2015 at 12:10:36PM +0300, Sagi Grimberg wrote: > Having an API that does FRMR/FMR/PHYS_MR is even worse from the ULP > PoV. If you expose an API that might schedule (PHYS_MR) it limits the > context that the caller is allowed to call in. > > I'm 100% against a registration API tha

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Tue, Jul 14, 2015 at 12:05:53PM +0300, Sagi Grimberg wrote: > iser has it too. I have a similar patch with a flag for iser (it's > behind a bulk of patches that are still pending though). So instead of this flag can we revisit the need for it? Given how inherently insecure it is maybe a "alloc_

Re: Kernel fast memory registration API proposal [RFC]

2015-07-14 Thread Christoph Hellwig
On Tue, Jul 14, 2015 at 11:39:24AM +0300, Sagi Grimberg wrote: > This is exactly what I don't want to do. I don't think that implicit > posting is a good idea for reasons that I mentioned earlier: > > "This is where I have a problem. Providing an API that may or may not > post a work request on my

Re: [PATCH V5 5/5] RDMA/isert: Limit read depth based on the device max_sge_rd capability

2015-07-14 Thread Christoph Hellwig
On Sun, Jul 05, 2015 at 12:45:06PM -0500, Steve Wise wrote: > Use the device's max_sge_rd capability to compute the target's read sge > depth. Save both the read and write max_sge values in the isert_conn > struct, and use these when creating RDMA_WRITE/READ work requests. Btw, any chance to make

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Mon, Jul 13, 2015 at 03:36:44PM -0400, Tom Talpey wrote: > On 7/11/2015 6:25 AM, 'Christoph Hellwig' wrote: > >I think what we need to support for now is FRMR as the primary target, > >and FMR as a secondar[y]. > > FMR is a *very* bad choice, for several reasons

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-14 Thread 'Christoph Hellwig'
On Mon, Jul 13, 2015 at 10:57:48AM -0600, Jason Gunthorpe wrote: > > Currently various drivers are using ib_get_dma_mr with remote flags > > unfortunately, e.g. the SRP initiator driver uses it to optimize away > > memory registrations for single SGL entry requests. > > Unconditionally? Ugh. Maybe

Re: Kernel fast memory registration API proposal [RFC]

2015-07-12 Thread Christoph Hellwig
On Sun, Jul 12, 2015 at 02:15:56PM -0400, Chuck Lever wrote: > > Chuck, Would a scatterlist API make life easier for you? > > No benefit for me. > > The NFS upper layer already slices and dices I/O until it is a > stream of contiguous single I/O requests for the server. > > It passes down a vect

Re: [PATCH v1 04/12] xprtrdma: Remove last ib_reg_phys_mr() call site

2015-07-12 Thread Christoph Hellwig
On Sat, Jul 11, 2015 at 02:50:58PM -0400, Chuck Lever wrote: > I would prefer to run this by Doug Oucharek first, as a courtesy. > Unless I'm mistaken, the Lustre client is supposed to be on its > way into the kernel, not on its way out. The only way a mess like lustre got into the kernel tree a

Re: Kernel fast memory registration API proposal [RFC]

2015-07-11 Thread Christoph Hellwig
On Fri, Jul 10, 2015 at 12:09:37PM +0300, Sagi Grimberg wrote: > And then provide helpers to populate the MR with generic kernel > structures such as struct scatterlist (for scsi and other ULPs), > struct page (for NFS) or struct bio_vec (for block ULPs later on). Please stick to struct scatterlis

Re: [PATCH v1 04/12] xprtrdma: Remove last ib_reg_phys_mr() call site

2015-07-11 Thread Christoph Hellwig
On Thu, Jul 09, 2015 at 04:42:18PM -0400, Chuck Lever wrote: > All HCA providers have an ib_get_dma_mr() verb. Thus > rpcrdma_ia_open() will either grab the device's local_dma_key if one > is available, or it will call ib_get_dma_mr() which is a 100% > guaranteed fallback. There is never any need t

Re: kernel memory registration (was: RDMA/core: Transport-independent access flags)

2015-07-11 Thread 'Christoph Hellwig'
On Fri, Jul 10, 2015 at 11:55:29AM +0300, Sagi Grimberg wrote: > If there is one thing worse than a complicated API, it is a restrictive > one. I'd much rather ULPs just having a simple API for registering > memory. Quite to the contrary. The complex API almost asks for weird abuses and twists.

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-11 Thread 'Christoph Hellwig'
On Thu, Jul 09, 2015 at 11:01:42AM -0600, Jason Gunthorpe wrote: > To your point in another message, I'd say, as long as the new API > supports FRMR at full speed with no performance penalty we are > good. If the other variants out there take a performance hit, then I > think that is OK. As you say

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-11 Thread 'Christoph Hellwig'
On Fri, Jul 10, 2015 at 01:54:20PM -0600, Jason Gunthorpe wrote: > diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c > index bac3fb406a74..6ed7e0f6c162 100644 > --- a/drivers/infiniband/core/verbs.c > +++ b/drivers/infiniband/core/verbs.c > @@ -1126,6 +1126,12 @@ struct

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-10 Thread Christoph Hellwig
On Thu, Jul 09, 2015 at 09:52:59AM -0400, Chuck Lever wrote: > There is one remaining kernel user of ib_reg_phys_mr() in 4.2: Lustre. It's in the staging tree, which proper in-tree code doesn't have to cater for. So as soon as sunrpc is done using the interface we can and should kill it off.

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-09 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 06:03:37PM -0600, Jason Gunthorpe wrote: > The major trouble with that is that the new MR types work by posting > work to the send queue, that work creates the MR. > > I don't know all the details of how those schemes work, but it doesn't > look like it fits into this model

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 01:32:05PM -0700, 'Christoph Hellwig' wrote: > /* updates *sg if the SG couldn't be fully registered due to offsets */ > int rdma_register_sg(struct rdma_mr *mr, struct scatterlist **sg, > u32 *pkey, u32 *offset, u32 *len); plus an

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 01:08:42PM -0600, Jason Gunthorpe wrote: > Then, what is left is all remote MRs and maybe it will be clearer what > to do about them then... From looking at that for a while the APIs needed seem pretty simple to me from a consumer perspective: struct rdma_mr *rmda_alloc_m

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
On Wed, Jul 08, 2015 at 04:15:00PM -0400, Doug Ledford wrote: > On 07/08/2015 04:02 PM, Christoph Hellwig wrote: > > So how about someone tells OFED to stop trying to enforce this BS? > > Unfortunately, simply "not enforcing" a bylaw of a multi-company > organization

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
So how about someone tells OFED to stop trying to enforce this BS? This just confirms my bias that Open-Fabrics Alliance are a bunch of idiots making life hard, similar to all their horrible OFED driver distributions that created a total mess for everyone involved.

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
On Wed, Jul 08, 2015 at 03:33:03PM -0400, Doug Ledford wrote: > I am not a lawyer, but this has been explained to me on numerous > occasions, so I relay the layman's interpretation here: > > No, you don't always need everyone's approval. There are contributions > that are not legally copyright wo

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 01:05:28PM +0300, Sagi Grimberg wrote: > If we agree to consolidate on a single MR allocation API, I don't see > how this wrapper is moving us forward. But if you guys prefer to have it > then I don't have a hard objection. Well, when are we going to get that MR allocation

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Wed, Jul 08, 2015 at 10:29:56AM +0300, Sagi Grimberg wrote: > I don't necessarily agree. The API we'd want is a single API at all > the call sites to all types of MRs. We have different QP types, and > still we don't have an allocation API for each and every one. > I honestly don't see why we ha

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-08 Thread 'Christoph Hellwig'
On Tue, Jul 07, 2015 at 09:05:15AM -0500, Steve Wise wrote: > I took the feedback from Christoph and Jason to mean I should remove > ib_get_dma_mr() entirely and pull its guts into > rdma_get_dma_mr(), and change all the users of ib_get_dma_mr() to use > rdma_get_dma_mr(). So the net result isn'

Re: [PATCH 0/2] update ocrdma to dual license

2015-07-08 Thread Christoph Hellwig
On Wed, Jul 08, 2015 at 12:26:56PM +0530, Devesh Sharma wrote: > We (Emulex/Avago) were lobbied by the Open-Fabrics Alliance (OFA) to > change the licensing from just GPLv2 to a dual GPLv2/BSD license. > They would prefer the elements in the OFED stack all be dual licensed. > We're trying to move

Re: [PATCH V3 0/5] Transport-independent MRs

2015-07-07 Thread 'Christoph Hellwig'
On Mon, Jul 06, 2015 at 09:24:54AM -0500, Steve Wise wrote: > I can. These are the only "transport independent" ULPs at this point. And the only reason for that is that the current APIs are such a nightmare and require extra effort to be transport independent. By removing APIs that encourage this

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-07 Thread Christoph Hellwig
On Mon, Jul 06, 2015 at 07:17:38PM +0300, Sagi Grimberg wrote: > >Ok. I'll remove all uses of ib_get_dma_mr()... > > > > > > I meant that rdma_get_dma_mr can go away. I'd prefer to get the > needed access_flags and just call existing verb. I strongly disagree. As this series has shown the existi

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-07 Thread 'Christoph Hellwig'
On Mon, Jul 06, 2015 at 09:23:42AM -0500, Steve Wise wrote: > > Please add an assert for the values that are allowed for attrs. > > > > It also would be highly useful to add a kerneldoc comment describing > > the function and the parameters. Also __bitwise sparse tricks > > to ensure the right fl

Re: [PATCH V3 0/5] Transport-independent MRs

2015-07-05 Thread Christoph Hellwig
On Sun, Jul 05, 2015 at 06:21:53PM -0500, Steve Wise wrote: > This series introduces transport-independent RDMA core services for > allocating DMA MRs and computing fast register access flags. Included are > changes to the iSER and NFSRDMA ULPs to make use of the new services. Can you convert all

Re: [PATCH V3 1/5] RDMA/core: Transport-independent access flags

2015-07-05 Thread Christoph Hellwig
On Sun, Jul 05, 2015 at 06:22:00PM -0500, Steve Wise wrote: > The semantics for MR access flags are not consistent across RDMA > protocols. So rather than have applications try and glean what they > need, have them pass in the intended roles and attributes for the MR to > be allocated and let the
