RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-02 Thread Steve Wise


> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org 
> [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Sagi Grimberg
> Sent: Thursday, July 02, 2015 8:23 AM
> To: Steve Wise; Hefty, Sean; dledf...@redhat.com
> Cc: r...@mellanox.com; sa...@mellanox.com; linux-rdma@vger.kernel.org; 
> jguntho...@obsidianresearch.com; infinipath;
> e...@mellanox.com; ogerl...@mellanox.com
> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> 
> On 7/2/2015 4:17 PM, Steve Wise wrote:
> > On 7/2/2015 1:22 AM, Sagi Grimberg wrote:
> >> On 6/30/2015 8:10 PM, Hefty, Sean wrote:
> >>>> I suggest to start consolidating to ib_create_mr() that receives an
> >>>> extensible ib_mr_init_attr and additional attributes can be mr_roles
> >>>> and mr_attrs.
> >>>
> >>> I think this makes sense, but does it really help?
> >>> If the end result is that the app and providers basically
> >>> end up switching on mr_attr::type, we end up reducing the
> >>> number of APIs, but the code complexity remains the same.
> >>>
> >>
> >> I think it will reduce code complexity. First, the callers
> >> will have a single API to allocate a memory key. I'm not sure why you
> >> say that app needs to switch on mr::type, it knows which type it wants.
> >>
> >
> > I dont see how doing this is less complex:
> >
> > attr = FASTREG
> > mr = create_mr(attr)
> >
> > vs:
> >
> > mr = lloc_fast_reg_mr()
> 
> The simplification is that you can facilitate changes like your
> mr_roles and any other things we may want to add to mr allocation
> easily instead of suggesting wrappers over wrappers over wrappers.
> 
> This is why I suggested it in the first place.
> 

Regardless of have one create_mr() or having create_sig_mr(), 
create_fast_reg_mr(), get_dma_mr(), we'll still need to define the 
roles/attributes to make them transport independent, and we'll still need to 
have an entry point that will compute access_flags given roles and attributes 
to be used when posting frmrs...  I don't see wrappers over wrappers over 
wrappers.

I'm not really opposed to creating a single entry point, but rather I'm not 
really seeing vast utility of doing so.  

Anybody else have an opinion?

Steve.




--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-02 Thread Sagi Grimberg

On 7/2/2015 4:17 PM, Steve Wise wrote:

On 7/2/2015 1:22 AM, Sagi Grimberg wrote:

On 6/30/2015 8:10 PM, Hefty, Sean wrote:

I suggest to start consolidating to ib_create_mr() that receives an
extensible ib_mr_init_attr and additional attributes can be mr_roles
and mr_attrs.


I think this makes sense, but does it really help?
If the end result is that the app and providers basically
end up switching on mr_attr::type, we end up reducing the
number of APIs, but the code complexity remains the same.



I think it will reduce code complexity. First, the callers
will have a single API to allocate a memory key. I'm not sure why you
say that app needs to switch on mr::type, it knows which type it wants.



I dont see how doing this is less complex:

attr = FASTREG
mr = create_mr(attr)

vs:

mr = lloc_fast_reg_mr()


The simplification is that you can facilitate changes like your
mr_roles and any other things we may want to add to mr allocation
easily instead of suggesting wrappers over wrappers over wrappers.

This is why I suggested it in the first place.

Sagi.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-02 Thread Steve Wise

On 7/2/2015 1:22 AM, Sagi Grimberg wrote:

On 6/30/2015 8:10 PM, Hefty, Sean wrote:

I suggest to start consolidating to ib_create_mr() that receives an
extensible ib_mr_init_attr and additional attributes can be mr_roles
and mr_attrs.


I think this makes sense, but does it really help?
If the end result is that the app and providers basically
end up switching on mr_attr::type, we end up reducing the
number of APIs, but the code complexity remains the same.



I think it will reduce code complexity. First, the callers
will have a single API to allocate a memory key. I'm not sure why you
say that app needs to switch on mr::type, it knows which type it wants.



I dont see how doing this is less complex:

attr = FASTREG
mr = create_mr(attr)

vs:

mr = lloc_fast_reg_mr()


As for drivers, From what I've seen, usually there is a single memory
key allocation interface to the HW and for each API and drivers call it
with a slightly different attributes and maybe do some extra driver
specific logic.

Yes, probably drivers will need to switch on mr_attr::type, but maybe
just just to determine the correct HW interface and hopefully not to
a completely different logic. I think that drivers can benefit in
reduced code duplication in total.



There isn't much code duplication now because as you mentioned above, 
most drivers have common services that get called from the various entry 
points: alloc_fast_reg_mr(), get_dma_mr(), reg_user_mr(), etc...



As a first step, we can do a naive switch on mr_attr::type in the
drivers but I'd expect driver developers to change this logic in their
driver.

Thoughts?


I don't see any reduction in complexity of the application nor 
driver/core code.  Perhaps a reduction in API complexity by having a 
single entry point...but that entry point would be the place I would fan 
out and call the driver functions vs changing the core<->driver interface...


My 2 centimes...


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-01 Thread Sagi Grimberg

On 6/30/2015 8:10 PM, Hefty, Sean wrote:

I suggest to start consolidating to ib_create_mr() that receives an
extensible ib_mr_init_attr and additional attributes can be mr_roles
and mr_attrs.


I think this makes sense, but does it really help?
If the end result is that the app and providers basically
end up switching on mr_attr::type, we end up reducing the
number of APIs, but the code complexity remains the same.



I think it will reduce code complexity. First, the callers
will have a single API to allocate a memory key. I'm not sure why you
say that app needs to switch on mr::type, it knows which type it wants.

As for drivers, From what I've seen, usually there is a single memory
key allocation interface to the HW and for each API and drivers call it
with a slightly different attributes and maybe do some extra driver
specific logic.

Yes, probably drivers will need to switch on mr_attr::type, but maybe
just just to determine the correct HW interface and hopefully not to
a completely different logic. I think that drivers can benefit in
reduced code duplication in total.

As a first step, we can do a naive switch on mr_attr::type in the
drivers but I'd expect driver developers to change this logic in their
driver.

Thoughts?
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-07-01 Thread Or Gerlitz

On 6/30/2015 7:42 PM, Jason Gunthorpe wrote:

NFSRDMA currently checks the transport type to decide how to set the
>access flags for memory registration.  With the new services
>exported in this series, we can change/simplify NFSRDMA to not have
>to know the transport type.

It would be excellent if this series actually went through and got rid
of all the now deprecated users. This would confirm we have the right
API here and prune off the old stuff. This is fairly trivial to do, I think?

The goal is to make things simpler, maintaining two kernel APIs is not
simpler:)


Agree.

Steve, you should 1st go and port NFSoRDMA to your proposal (patch) for 
IB core changes, so we can see the change in action for a code which is 
working today on multiple transports, iser should be 2nd.


Or.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Steve Wise

> -Original Message-
> From: Jason Gunthorpe [mailto:jguntho...@obsidianresearch.com]
> Sent: Tuesday, June 30, 2015 11:21 AM
> To: Steve Wise
> Cc: dledf...@redhat.com; r...@mellanox.com; sa...@mellanox.com; 
> linux-rdma@vger.kernel.org; infinip...@intel.com;
> e...@mellanox.com; ogerl...@mellanox.com; sean.he...@intel.com
> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> 
> On Mon, Jun 29, 2015 at 04:36:18PM -0500, Steve Wise wrote:
> > +int rdma_device_access_flags(struct ib_pd *pd, int roles, int attrs)
> > +{
> > +   int access_flags = attrs;
> 
> No RDMA_MRR_SEND ?
>

Send need no explicit access flags.  There is no LOCAL_READ access flag.
 
> > +   if (roles & RDMA_MRR_RECV)
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE;
> > +
> > +   if (roles & RDMA_MRR_WRITE_DEST)
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE;
> 
> Is IB_ACCESS_LOCAL_WRITE needed?
> 

Yes.  You must have LOCAL_WRITE if you set REMOTE_WRITE.

> > +   if (roles & RDMA_MRR_READ_DEST) {
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE;
> > +   if (rdma_protocol_iwarp(pd->device,
> > +   rdma_start_port(pd->device)))
> > +   access_flags |= IB_ACCESS_REMOTE_WRITE;
> > +   }
> 
> So on iWarp if I want to issue a RDMA_READ then I have to allow the
> far side uncontrolled write access to the same memory? Is there
> something else protecting it?
>

Only the fact that the rkey of your MR setup for MRR_READ_DEST doesn't have to 
be advertised to the peer application.  So the peer
app doesn't know what the rkey is.   But the fact is that an MR with 
REMOTE_WRITE access flags can be used as the destination of
RDMA reads and RDMA writes for iWARP. 

Steve.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Hefty, Sean
> I suggest to start consolidating to ib_create_mr() that receives an
> extensible ib_mr_init_attr and additional attributes can be mr_roles
> and mr_attrs.

I think this makes sense, but does it really help?  If the end result is that 
the app and providers basically end up switching on mr_attr::type, we end up 
reducing the number of APIs, but the code complexity remains the same.


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Jason Gunthorpe
> NFSRDMA currently checks the transport type to decide how to set the
> access flags for memory registration.  With the new services
> exported in this series, we can change/simplify NFSRDMA to not have
> to know the transport type.

It would be excellent if this series actually went through and got rid
of all the now deprecated users. This would confirm we have the right
API here and prune off the old stuff. This is fairly trivial to do, I
think?

The goal is to make things simpler, maintaining two kernel APIs is not
simpler :)

Regarding Sagi's comments .. We don't have to do everything at once,
a series that focuses only on the access flags seems sane to
me.

Sagi's idea makes a lot of sense, but maybe it should be explored
along the direction Christoph has been talking about?

I suggested the wrapper because it makes it very easy to force the old
API out (just remove the old function call), but maybe we could move
the old style access flags into a private header or something ?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Jason Gunthorpe
On Mon, Jun 29, 2015 at 04:36:18PM -0500, Steve Wise wrote:
> +int rdma_device_access_flags(struct ib_pd *pd, int roles, int attrs)
> +{
> + int access_flags = attrs;

No RDMA_MRR_SEND ?

> + if (roles & RDMA_MRR_RECV)
> + access_flags |= IB_ACCESS_LOCAL_WRITE;
> +
> + if (roles & RDMA_MRR_WRITE_DEST)
> + access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE;

Is IB_ACCESS_LOCAL_WRITE needed?

> + if (roles & RDMA_MRR_READ_DEST) {
> + access_flags |= IB_ACCESS_LOCAL_WRITE;
> + if (rdma_protocol_iwarp(pd->device,
> + rdma_start_port(pd->device)))
> + access_flags |= IB_ACCESS_REMOTE_WRITE;
> + }

So on iWarp if I want to issue a RDMA_READ then I have to allow the
far side uncontrolled write access to the same memory? Is there
something else protecting it?

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Steve Wise


> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org 
> [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Chuck Lever
> Sent: Tuesday, June 30, 2015 9:42 AM
> To: Steve Wise
> Cc: Or Gerlitz; dledf...@redhat.com; r...@mellanox.com; sa...@mellanox.com; 
> linux-rdma@vger.kernel.org;
> jguntho...@obsidianresearch.com; infinip...@intel.com; e...@mellanox.com; 
> sean.he...@intel.com
> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> 
> 
> On Jun 30, 2015, at 10:29 AM, Steve Wise  wrote:
> 
> >
> >
> >> -Original Message-
> >> From: Or Gerlitz [mailto:ogerl...@mellanox.com]
> >> Sent: Tuesday, June 30, 2015 4:04 AM
> >> To: Steve Wise; Chuck Lever
> >> Cc: dledf...@redhat.com; r...@mellanox.com; sa...@mellanox.com; 
> >> linux-rdma@vger.kernel.org; jguntho...@obsidianresearch.com;
> >> infinip...@intel.com; e...@mellanox.com; sean.he...@intel.com
> >> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> >>
> >> On 6/30/2015 12:36 AM, Steve Wise wrote:
> >>> The semantics for MR access flags are not consistent across RDMA
> >>> protocols.  So rather than have applications try and glean what they
> >>> need, have them pass in the intended roles and attributes for the MR to
> >>> be allocated and let the RDMA core select the appropriate access flags
> >>> given the roles, attributes, and device capabilities.
> >>
> >> wait, we have NFSoRDMA (net/sunrpc/xprtrdma/*) in the kernel for years
> >> and it works on top of both IB/RoCE and iWARP.I know they have there 5-6
> >> memory registration methods.. but if we stick to their mode that uses
> >> fast registration ala your upstream commit 0f7ec3 "RDMA/core: Add memory
> >> management extensions support"  -- what's missing or how come it work
> >> w/o the enhancement suggested here?Added Chuck.
> >
> > NFSRDMA currently checks the transport type to decide how to set the access 
> > flags for memory registration.  With the new
services
> exported in this series, we can change/simplify NFSRDMA to not have to know 
> the transport type.
> 
> I was planning to look at this more closely soon, but if you
> have patches I'd happily consider them, or you can just point
> me to what needs to be changed and I can put it together for 4.3.
>

Thanks.  I suggest you hold off on NFSRDMA changes until we get closure on the 
new core services.  Once it stabilizes, I'll ping ya.
(and I'll CC you on subsequent versions of this).

Stevo


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Chuck Lever

On Jun 30, 2015, at 10:29 AM, Steve Wise  wrote:

> 
> 
>> -Original Message-
>> From: Or Gerlitz [mailto:ogerl...@mellanox.com]
>> Sent: Tuesday, June 30, 2015 4:04 AM
>> To: Steve Wise; Chuck Lever
>> Cc: dledf...@redhat.com; r...@mellanox.com; sa...@mellanox.com; 
>> linux-rdma@vger.kernel.org; jguntho...@obsidianresearch.com;
>> infinip...@intel.com; e...@mellanox.com; sean.he...@intel.com
>> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
>> 
>> On 6/30/2015 12:36 AM, Steve Wise wrote:
>>> The semantics for MR access flags are not consistent across RDMA
>>> protocols.  So rather than have applications try and glean what they
>>> need, have them pass in the intended roles and attributes for the MR to
>>> be allocated and let the RDMA core select the appropriate access flags
>>> given the roles, attributes, and device capabilities.
>> 
>> wait, we have NFSoRDMA (net/sunrpc/xprtrdma/*) in the kernel for years
>> and it works on top of both IB/RoCE and iWARP.I know they have there 5-6
>> memory registration methods.. but if we stick to their mode that uses
>> fast registration ala your upstream commit 0f7ec3 "RDMA/core: Add memory
>> management extensions support"  -- what's missing or how come it work
>> w/o the enhancement suggested here?Added Chuck.
> 
> NFSRDMA currently checks the transport type to decide how to set the access 
> flags for memory registration.  With the new services exported in this 
> series, we can change/simplify NFSRDMA to not have to know the transport 
> type.  

I was planning to look at this more closely soon, but if you
have patches I’d happily consider them, or you can just point
me to what needs to be changed and I can put it together for 4.3.

--
Chuck Lever



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Steve Wise


> -Original Message-
> From: Sagi Grimberg [mailto:sa...@dev.mellanox.co.il]
> Sent: Tuesday, June 30, 2015 4:22 AM
> To: Steve Wise; dledf...@redhat.com
> Cc: r...@mellanox.com; sa...@mellanox.com; linux-rdma@vger.kernel.org; 
> jguntho...@obsidianresearch.com; infinip...@intel.com;
> e...@mellanox.com; ogerl...@mellanox.com; sean.he...@intel.com
> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> 
> On 6/30/2015 12:36 AM, Steve Wise wrote:
> > The semantics for MR access flags are not consistent across RDMA
> > protocols.  So rather than have applications try and glean what they
> > need, have them pass in the intended roles and attributes for the MR to
> > be allocated and let the RDMA core select the appropriate access flags
> > given the roles, attributes, and device capabilities.
> >
> > We introduce rdma_mr_roles and rdma_mr_attributes that enumerate the
> > possible roles and attributes for a MR.  These are documented in the
> > enums themselves.
> >
> > New services exported:
> >
> > rdma_device_access_flags() - given the intended MR roles and attributes
> > passed in, return the ib_access_flags bits for the device.
> >
> > rdma_get_dma_mr() - allocate a dma mr using the applications intended
> > MR roles and MR attributes.  This uses rdma_device_access_flags().
> >
> > rdma_fast_reg_access_flags() - return the ib_access_flags bits needed
> > for a fast register WR given the applications intended MR roles and
> > MR attributes.  This uses rdma_device_access_flags().
> >
> > Signed-off-by: Steve Wise 
> > ---
> >   drivers/infiniband/core/verbs.c |   30 
> >   include/rdma/ib_verbs.h |  101 
> > +++
> >   2 files changed, 131 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/infiniband/core/verbs.c 
> > b/drivers/infiniband/core/verbs.c
> > index bac3fb4..2aa7c92 100644
> > --- a/drivers/infiniband/core/verbs.c
> > +++ b/drivers/infiniband/core/verbs.c
> > @@ -1144,6 +1144,36 @@ struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int 
> > mr_access_flags)
> >   }
> >   EXPORT_SYMBOL(ib_get_dma_mr);
> >
> > +int rdma_device_access_flags(struct ib_pd *pd, int roles, int attrs)
> > +{
> > +   int access_flags = attrs;
> > +
> > +   if (roles & RDMA_MRR_RECV)
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE;
> > +
> > +   if (roles & RDMA_MRR_WRITE_DEST)
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE;
> > +
> > +   if (roles & RDMA_MRR_READ_DEST) {
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE;
> > +   if (rdma_protocol_iwarp(pd->device,
> > +   rdma_start_port(pd->device)))
> > +   access_flags |= IB_ACCESS_REMOTE_WRITE;
> > +   }
> > +
> > +   if (roles & RDMA_MRR_READ_SOURCE)
> > +   access_flags |= IB_ACCESS_REMOTE_READ;
> > +
> > +   if (roles & RDMA_MRR_ATOMIC)
> > +   access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_ATOMIC;
> > +
> > +   if (roles & RDMA_MRR_MW_BIND)
> > +   access_flags |= IB_ACCESS_MW_BIND;
> > +
> > +   return access_flags;
> > +}
> > +EXPORT_SYMBOL(rdma_device_access_flags);
> > +
> >   struct ib_mr *ib_reg_phys_mr(struct ib_pd *pd,
> >  struct ib_phys_buf *phys_buf_array,
> >  int num_phys_buf,
> > diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
> > index 986fddb..135592d 100644
> > --- a/include/rdma/ib_verbs.h
> > +++ b/include/rdma/ib_verbs.h
> > @@ -2494,6 +2494,107 @@ static inline int ib_req_ncomp_notif(struct ib_cq 
> > *cq, int wc_cnt)
> >   struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
> >
> >   /**
> > + * rdma_mr_roles - possible roles an RDMA MR will be used for
> > + *
> > + * This allows a transport independent RDMA application to
> > + * create MRs that are usable for all the desired roles w/o
> > + * having to understand which access rights are needed.
> > + */
> > +enum {
> > +
> > +   /* lkey used in a ib_recv_wr sge */
> > +   RDMA_MRR_RECV   = 1,
> > +
> > +   /* lkey used for a IB_WR_SEND in the ib_send_wr sge */
> > +   RDMA_MRR_SEND   = (1<<1),
> > +
> > +   /* rkey used for a IB_WR_RDMA_READ in ib_send_wr wr.rdma.rkey */
> > +   RDMA_MRR_READ_SOURCE

RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Steve Wise


> -Original Message-
> From: Or Gerlitz [mailto:ogerl...@mellanox.com]
> Sent: Tuesday, June 30, 2015 4:04 AM
> To: Steve Wise; Chuck Lever
> Cc: dledf...@redhat.com; r...@mellanox.com; sa...@mellanox.com; 
> linux-rdma@vger.kernel.org; jguntho...@obsidianresearch.com;
> infinip...@intel.com; e...@mellanox.com; sean.he...@intel.com
> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> 
> On 6/30/2015 12:36 AM, Steve Wise wrote:
> > The semantics for MR access flags are not consistent across RDMA
> > protocols.  So rather than have applications try and glean what they
> > need, have them pass in the intended roles and attributes for the MR to
> > be allocated and let the RDMA core select the appropriate access flags
> > given the roles, attributes, and device capabilities.
> 
> wait, we have NFSoRDMA (net/sunrpc/xprtrdma/*) in the kernel for years
> and it works on top of both IB/RoCE and iWARP.I know they have there 5-6
> memory registration methods.. but if we stick to their mode that uses
> fast registration ala your upstream commit 0f7ec3 "RDMA/core: Add memory
> management extensions support"  -- what's missing or how come it work
> w/o the enhancement suggested here?Added Chuck.

NFSRDMA currently checks the transport type to decide how to set the access 
flags for memory registration.  With the new services exported in this series, 
we can change/simplify NFSRDMA to not have to know the transport type.  

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Steve Wise


> -Original Message-
> From: Haggai Eran [mailto:hagg...@mellanox.com]
> Sent: Tuesday, June 30, 2015 2:25 AM
> To: Steve Wise; dledf...@redhat.com
> Cc: r...@mellanox.com; sa...@mellanox.com; linux-rdma@vger.kernel.org; 
> jguntho...@obsidianresearch.com; infinip...@intel.com;
> e...@mellanox.com; ogerl...@mellanox.com; sean.he...@intel.com
> Subject: Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags
> 
> On 30/06/2015 00:36, Steve Wise wrote:
> >  /**
> > + * rdma_mr_roles - possible roles an RDMA MR will be used for
> > + *
> > + * This allows a transport independent RDMA application to
> > + * create MRs that are usable for all the desired roles w/o
> > + * having to understand which access rights are needed.
> > + */
> > +enum {
> > +
> > +   /* lkey used in a ib_recv_wr sge */
> > +   RDMA_MRR_RECV   = 1,
> > +
> > +   /* lkey used for a IB_WR_SEND in the ib_send_wr sge */
> > +   RDMA_MRR_SEND   = (1<<1),
> Perhaps you should mention that this covers all the IB_WR_SEND* opcodes
> (SEND_WITH_IMM and SEND_WITH_INV). READ, WRITE and ATOMICs also have
> several variants.
> 

Agreed.

> > +
> > +   /* rkey used for a IB_WR_RDMA_READ in ib_send_wr wr.rdma.rkey */
> > +   RDMA_MRR_READ_SOURCE= (1<<2),
> > +
> > +   /* lkey used for a IB_WR_RDMA_READ in the ib_send_wr sge */
> > +   RDMA_MRR_READ_DEST  = (1<<3),
> > +
> > +   /* lkey used for a IB_WR_RDMA_WRITE in the ib_send_wr sge */
> > +   RDMA_MRR_WRITE_SOURCE   = (1<<4),
> > +
> > +   /* rkey used for a IB_WR_RDMA_WRITE in ib_send_wr wr.rdma.rkey */
> > +   RDMA_MRR_WRITE_DEST = (1<<5),
> > +
> > +   /*
> > +* rkey used for a IB_WR_ATOMIC/MASKED_ATOMIC in ib_send_wr
> > +* wr.atomic.rkey
> > +*/
> > +   RDMA_MRR_ATOMIC = (1<<6),
> What about using as an lkey in an IB_WR_ATOMIC/MASKED_ATOMIC in the
> ib_send_wr sge? Do you want that to be covered by RDMA_MRR_SEND?
>

Ah yes.  Perhaps we need ATOMIC_SOURCE and ATOMIC_DEST.  I wouldn't include it 
in the SEND.
 
> > +
> > +   /* MR used for a IB_WR_MW_BIND in ib_send_wr wr.bind_mw.bind_info.mr */
> > +   RDMA_MRR_MW_BIND= (1<<7),
> > +};



--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Sagi Grimberg

On 6/30/2015 12:36 AM, Steve Wise wrote:

The semantics for MR access flags are not consistent across RDMA
protocols.  So rather than have applications try and glean what they
need, have them pass in the intended roles and attributes for the MR to
be allocated and let the RDMA core select the appropriate access flags
given the roles, attributes, and device capabilities.

We introduce rdma_mr_roles and rdma_mr_attributes that enumerate the
possible roles and attributes for a MR.  These are documented in the
enums themselves.

New services exported:

rdma_device_access_flags() - given the intended MR roles and attributes
passed in, return the ib_access_flags bits for the device.

rdma_get_dma_mr() - allocate a dma mr using the applications intended
MR roles and MR attributes.  This uses rdma_device_access_flags().

rdma_fast_reg_access_flags() - return the ib_access_flags bits needed
for a fast register WR given the applications intended MR roles and
MR attributes.  This uses rdma_device_access_flags().

Signed-off-by: Steve Wise 
---
  drivers/infiniband/core/verbs.c |   30 
  include/rdma/ib_verbs.h |  101 +++
  2 files changed, 131 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index bac3fb4..2aa7c92 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1144,6 +1144,36 @@ struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int 
mr_access_flags)
  }
  EXPORT_SYMBOL(ib_get_dma_mr);

+int rdma_device_access_flags(struct ib_pd *pd, int roles, int attrs)
+{
+   int access_flags = attrs;
+
+   if (roles & RDMA_MRR_RECV)
+   access_flags |= IB_ACCESS_LOCAL_WRITE;
+
+   if (roles & RDMA_MRR_WRITE_DEST)
+   access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE;
+
+   if (roles & RDMA_MRR_READ_DEST) {
+   access_flags |= IB_ACCESS_LOCAL_WRITE;
+   if (rdma_protocol_iwarp(pd->device,
+   rdma_start_port(pd->device)))
+   access_flags |= IB_ACCESS_REMOTE_WRITE;
+   }
+
+   if (roles & RDMA_MRR_READ_SOURCE)
+   access_flags |= IB_ACCESS_REMOTE_READ;
+
+   if (roles & RDMA_MRR_ATOMIC)
+   access_flags |= IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_ATOMIC;
+
+   if (roles & RDMA_MRR_MW_BIND)
+   access_flags |= IB_ACCESS_MW_BIND;
+
+   return access_flags;
+}
+EXPORT_SYMBOL(rdma_device_access_flags);
+
  struct ib_mr *ib_reg_phys_mr(struct ib_pd *pd,
 struct ib_phys_buf *phys_buf_array,
 int num_phys_buf,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 986fddb..135592d 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2494,6 +2494,107 @@ static inline int ib_req_ncomp_notif(struct ib_cq *cq, 
int wc_cnt)
  struct ib_mr *ib_get_dma_mr(struct ib_pd *pd, int mr_access_flags);

  /**
+ * rdma_mr_roles - possible roles an RDMA MR will be used for
+ *
+ * This allows a transport independent RDMA application to
+ * create MRs that are usable for all the desired roles w/o
+ * having to understand which access rights are needed.
+ */
+enum {
+
+   /* lkey used in a ib_recv_wr sge */
+   RDMA_MRR_RECV   = 1,
+
+   /* lkey used for a IB_WR_SEND in the ib_send_wr sge */
+   RDMA_MRR_SEND   = (1<<1),
+
+   /* rkey used for a IB_WR_RDMA_READ in ib_send_wr wr.rdma.rkey */
+   RDMA_MRR_READ_SOURCE= (1<<2),
+
+   /* lkey used for a IB_WR_RDMA_READ in the ib_send_wr sge */
+   RDMA_MRR_READ_DEST  = (1<<3),
+
+   /* lkey used for a IB_WR_RDMA_WRITE in the ib_send_wr sge */
+   RDMA_MRR_WRITE_SOURCE   = (1<<4),
+
+   /* rkey used for a IB_WR_RDMA_WRITE in ib_send_wr wr.rdma.rkey */
+   RDMA_MRR_WRITE_DEST = (1<<5),
+
+   /*
+* rkey used for a IB_WR_ATOMIC/MASKED_ATOMIC in ib_send_wr
+* wr.atomic.rkey
+*/
+   RDMA_MRR_ATOMIC = (1<<6),
+
+   /* MR used for a IB_WR_MW_BIND in ib_send_wr wr.bind_mw.bind_info.mr */
+   RDMA_MRR_MW_BIND= (1<<7),
+};
+
+/**
+ * rdma_mr_attributes - attributes for rdma memory regions
+ */
+enum {
+   RDMA_MRA_ZERO_BASED = IB_ZERO_BASED,
+   RDMA_MRA_ACCESS_ON_DEMAND   = IB_ACCESS_ON_DEMAND,
+};
+
+/**
+ * rdma_device_access_flags - Returns the device-specific MR access flags.
+ * @pd: The protection domain associated with the memory region.
+ * @roles: The intended roles of the MR
+ * @attrs: The desired attributes of the MR
+ *
+ * Use the intended roles from @roles along with @attrs and the device
+ * capabilities to generate the needed access rights.
+ *
+ * Return: the ib_access_flags value needed to allocate the MR.
+ */
+int rdma_device_access_flags(struct ib_pd *pd, int ro

Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Or Gerlitz

On 6/30/2015 12:36 AM, Steve Wise wrote:

The semantics for MR access flags are not consistent across RDMA
protocols.  So rather than have applications try and glean what they
need, have them pass in the intended roles and attributes for the MR to
be allocated and let the RDMA core select the appropriate access flags
given the roles, attributes, and device capabilities.


wait, we have NFSoRDMA (net/sunrpc/xprtrdma/*) in the kernel for years 
and it works on top of both IB/RoCE and iWARP.I know they have there 5-6 
memory registration methods.. but if we stick to their mode that uses 
fast registration ala your upstream commit 0f7ec3 "RDMA/core: Add memory 
management extensions support"  -- what's missing or how come it work 
w/o the enhancement suggested here?Added Chuck.

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH V2 3/5] RDMA/core: transport-independent access flags

2015-06-30 Thread Haggai Eran
On 30/06/2015 00:36, Steve Wise wrote:
>  /**
> + * rdma_mr_roles - possible roles an RDMA MR will be used for
> + *
> + * This allows a transport independent RDMA application to
> + * create MRs that are usable for all the desired roles w/o
> + * having to understand which access rights are needed.
> + */
> +enum {
> +
> + /* lkey used in a ib_recv_wr sge */
> + RDMA_MRR_RECV   = 1,
> +
> + /* lkey used for a IB_WR_SEND in the ib_send_wr sge */
> + RDMA_MRR_SEND   = (1<<1),
Perhaps you should mention that this covers all the IB_WR_SEND* opcodes
(SEND_WITH_IMM and SEND_WITH_INV). READ, WRITE and ATOMICs also have
several variants.

> +
> + /* rkey used for a IB_WR_RDMA_READ in ib_send_wr wr.rdma.rkey */
> + RDMA_MRR_READ_SOURCE= (1<<2),
> +
> + /* lkey used for a IB_WR_RDMA_READ in the ib_send_wr sge */
> + RDMA_MRR_READ_DEST  = (1<<3),
> +
> + /* lkey used for a IB_WR_RDMA_WRITE in the ib_send_wr sge */
> + RDMA_MRR_WRITE_SOURCE   = (1<<4),
> +
> + /* rkey used for a IB_WR_RDMA_WRITE in ib_send_wr wr.rdma.rkey */
> + RDMA_MRR_WRITE_DEST = (1<<5),
> +
> + /*
> +  * rkey used for a IB_WR_ATOMIC/MASKED_ATOMIC in ib_send_wr
> +  * wr.atomic.rkey
> +  */
> + RDMA_MRR_ATOMIC = (1<<6),
What about using as an lkey in an IB_WR_ATOMIC/MASKED_ATOMIC in the
ib_send_wr sge? Do you want that to be covered by RDMA_MRR_SEND?

> +
> + /* MR used for a IB_WR_MW_BIND in ib_send_wr wr.bind_mw.bind_info.mr */
> + RDMA_MRR_MW_BIND= (1<<7),
> +};

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html