Hi Moni,

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf
> Of Moni Shoua
> Sent: Tuesday, December 02, 2014 12:17 PM
> To: Somnath Kotur
> Cc: [email protected]; [email protected]
> Subject: Re: [PATCH 0/2] Adding support for RoCE V2 specification
> 
> On Tue, Dec 2, 2014 at 7:35 AM, Somnath Kotur
> <[email protected]> wrote:
> > HI Moni,
> >         Thank you for your comments , please find my response inline.
> >
> >
> >> > The overarching goal behind this patch is to keep RDMA-CM as the
> >> > central entity that decides the protocol (ROCEV2 /ROCEV1) and hides
> >> > all the
> >> address resolution details from applications.
> >> I agree but with one comment. RDMA-CM, although preferred,  is not
> >> mandatory to establish a connection. The solution needs to take care
> >> also for apps that don't use RDMA-CM
> >
> > Agreed and this patch takes care of that as well thanks to the
> > solution that was put in place for IP-based GIDs anyway in `modify_qp`
> effectively by means of the ib_resolve_l2_attrs().
> >
> >>
> >> > This  is just a continuation of the existing RDMA-CM design
> >> > philosophy for IP Based GIDs that lets applications operate over
> >> > any form of RDMA
> >> Service in a completely transparent way without any changes to the
> >> applications themselves.
> >> >
> >> > Protocol Resolution(RoCEV2 or V1) must be done even before we send
> >> > out the 1st connection request packet, the corresponding MAD pkt
> >> > must be V2
> >> for a destination that is behind  a router, correct?
> >> > I am not sure we want to try sending out RoCEv2 connection requests
> >> > first
> >> and if that fails ,then fallback/retry with RoCEv1 request.
> >> I agree but when you send the first packet (request or response) you
> >> know the SGID index to be used. Knowing that and assuming you chose
> >> the SGID index correctly there is no need to guess.
> >> A suggestion for choosing the SGID index is:
> >> 1. First, populate GID table with GIDs matching the IP address list
> >> of the corresponding net devices. Each GID will appear twice, one for
> >> RoCEv1 and one for RoCEv2.
> >> 2. When connection is being established RDMA-CM will choose one of
> >> the
> >> 2 matching indices based on a preferred protocol attribute for the
> >> rmda_id (new attr.) 3. Proffered protocol is
> >>      a) global for the node/fabric based on the administrator settings
> >>      b) can be changed by applications that want to override the
> >> default (requires extra API)
> >
> > Agree to point 1 which is what we were thinking of doing as well as part of
> the 'override' option.
> > Still feel that the 'default' protocol resolution should be based on
> > help from the IP/Networking stack as that would straight away enable
> > existing applications and majority of use-case scenarios that use RDMA-CM
> to seamlessly work without any change for RoCE V2.
> > What this means is now RDMA-CM will choose one of the 2 matching GID
> indices based on help from the IP stack.
> >
> > For this to happen , the 'gid_cache' structure will have to undergo a change
> to now incorporate a new 'gid_type' field.
> > Accordingly the helper functions that find a cached gid entry -
> ib_find_cached_gid() will also undergo a change to take into account this
> new field.
> >
> > I plan to resubmit the patch series making the following changes
> > a) Populating GID table with 2 entries per IP address
> > b) Changes to the GID cache structure along with it's helper functions
> > c) Make sure that RDMA-CM selects/finds the corresponding GID once a
> network type is chosen based on help from the IP stack.
> >
> >
> > The only thing missing is the ability for end-nodes connected locally
> > inside same subnet to communicate using V2 and that could be provided
> > as an 'override' option using the 'preferred protocol' attribute setting on 
> > a
> system-wide basis or on an application basis as you have proposed.
> > This I propose as an additional patch after the above patch goes in first.
> >
> > Does this sound like a plan ?
> >
> > Thanks
> > Som
> >
> Hi Som
> Generally, I agree with the changes you suggested. To summarize (and make
> sure I got it right) your plan for V2 is to:
> 
> 1. Associate gid_index with RoCE type in GID table and GID cache.
> Requires helper functions to search for gid_index according to GID value and
> type 2. Move the code that monitors IP address changes to populate GID
> table from device driver to IB/core. Requires each RoCE device driver to
> implement add_gid_entry/rem_gid_entry functions 3. Remove the RoCE
> type from address structure (e.g. ib_ah_attr).
> Leave it only in rdma_addr to keep the hint from the IP stack 4. Choose
> sgid_index (from all matching entries) in RDMA-CM based on hint from IP
> stack, user preference and admin policy

Agree with all your points above , except for point# 2.  That IMHO would be 
more of a code refactoring/cleanup patch 
that can be taken up later once the baseline version is checked in as it's not 
really related to this patch ?
Also, if we do that , then we also have to take care of  the IB case as well 
where GID is something that is obtained from the port/card(query_gid)
as opposed to something that is pushed down from the stack as you are now 
proposing for IP-based GIDs.. i.e implications are much more than
what this patch intends to address.
So I'd rather do this at a later point in time and not club it with V2, sound 
good ?

Thanks
Som



> 
> If so, I guess that we have a plan.
> 
> Thanks
> Moni

Reply via email to