On Tue, Dec 2, 2014 at 7:35 AM, Somnath Kotur <[email protected]> wrote: > HI Moni, > Thank you for your comments , please find my response inline. > > >> > The overarching goal behind this patch is to keep RDMA-CM as the >> > central entity that decides the protocol (ROCEV2 /ROCEV1) and hides all the >> address resolution details from applications. >> I agree but with one comment. RDMA-CM, although preferred, is not >> mandatory to establish a connection. The solution needs to take care also for >> apps that don't use RDMA-CM > > Agreed and this patch takes care of that as well thanks to the solution that > was put in place > for IP-based GIDs anyway in `modify_qp` effectively by means of the > ib_resolve_l2_attrs(). > >> >> > This is just a continuation of the existing RDMA-CM design philosophy >> > for IP Based GIDs that lets applications operate over any form of RDMA >> Service in a completely transparent way without any changes to the >> applications themselves. >> > >> > Protocol Resolution(RoCEV2 or V1) must be done even before we send out >> > the 1st connection request packet, the corresponding MAD pkt must be V2 >> for a destination that is behind a router, correct? >> > I am not sure we want to try sending out RoCEv2 connection requests first >> and if that fails ,then fallback/retry with RoCEv1 request. >> I agree but when you send the first packet (request or response) you know >> the SGID index to be used. Knowing that and assuming you chose the SGID >> index correctly there is no need to guess. >> A suggestion for choosing the SGID index is: >> 1. First, populate GID table with GIDs matching the IP address list of the >> corresponding net devices. Each GID will appear twice, one for >> RoCEv1 and one for RoCEv2. >> 2. When connection is being established RDMA-CM will choose one of the >> 2 matching indices based on a preferred protocol attribute for the rmda_id >> (new attr.) 3. Proffered protocol is >> a) global for the node/fabric based on the administrator settings >> b) can be changed by applications that want to override the default >> (requires extra API) > > Agree to point 1 which is what we were thinking of doing as well as part of > the 'override' option. > Still feel that the 'default' protocol resolution should be based on help > from the IP/Networking stack > as that would straight away enable existing applications and majority of > use-case scenarios that use RDMA-CM to seamlessly > work without any change for RoCE V2. > What this means is now RDMA-CM will choose one of the 2 matching GID indices > based on help from the IP stack. > > For this to happen , the 'gid_cache' structure will have to undergo a change > to now incorporate a new 'gid_type' field. > Accordingly the helper functions that find a cached gid entry - > ib_find_cached_gid() will also undergo a change to take into account this new > field. > > I plan to resubmit the patch series making the following changes > a) Populating GID table with 2 entries per IP address > b) Changes to the GID cache structure along with it's helper functions > c) Make sure that RDMA-CM selects/finds the corresponding GID once a network > type is chosen based on help from the IP stack. > > > The only thing missing is the ability for end-nodes connected locally inside > same subnet to communicate using V2 and that > could be provided as an 'override' option using the 'preferred protocol' > attribute setting on a system-wide basis or on an application basis > as you have proposed. > This I propose as an additional patch after the above patch goes in first. > > Does this sound like a plan ? > > Thanks > Som > Hi Som Generally, I agree with the changes you suggested. To summarize (and make sure I got it right) your plan for V2 is to:
1. Associate gid_index with RoCE type in GID table and GID cache. Requires helper functions to search for gid_index according to GID value and type 2. Move the code that monitors IP address changes to populate GID table from device driver to IB/core. Requires each RoCE device driver to implement add_gid_entry/rem_gid_entry functions 3. Remove the RoCE type from address structure (e.g. ib_ah_attr). Leave it only in rdma_addr to keep the hint from the IP stack 4. Choose sgid_index (from all matching entries) in RDMA-CM based on hint from IP stack, user preference and admin policy If so, I guess that we have a plan. Thanks Moni -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
