I think there are several application, that should be fixed and tested with your propose. See, for example, __ndi_fill_cm_rep()
BTW, could someone remind me why do we have ib_cm_rep (which doesn't contain resp_res at all) and iba_cm_rep ? Is the first one regarded as outdated ? Do we have two CM APIs ? What's the diff ? > -----Original Message----- > From: Fab Tillier [mailto:[email protected]] > Sent: Monday, June 27, 2011 7:07 PM > To: Leonid Keller; Hefty, Sean > Cc: ofw_list > Subject: RE: IBAL CM ignores responder resources of the passive side > > Leonid Keller wrote on Mon, 27 Jun 2011 at 00:54:40 > > > We did the below patch and got instantly the assert. > > Running which ULP in WinOFED? Looking at the code, the ND provider does the > right thing. WSD seems to not set resp_res and I'd say that was a bug. Since > the WSD provider gets updated with any driver update, there's little chance of > users running into this if both get fixed concurrently. > > > Which brought me to understanding, that assert should be removed, but 'if' > > should be kept. > > I think it's important to allow the CEP Manager to support 0 as a valid value, > indicating that no RDMA reads are supported. I would keep the assert and > remove the if. If you had a different value to indicate "auto-negotiate" you > could have the proxy trap for zero and convert before passing down to the CEP > manager. > > > Why? > > Because the active side do not look at ReponderResources and InitDepth of > > the passive side (sent in REP). > > It allows CM to set all the parameters. > > The old code gave to the passive side a possibility not to negotiate > > parameters, but to leave it to CM. > > And I think, that we should keep this possibility. > > Both the active and passive sides should support auto-negotiating, but it > shouldn't be done using the value 0. > > I think clients should be able to specify whatever values they desire, and the > CM should then cap to the abilities of the hardware/negotiate on behalf of the > client. The actual values can always be examined during connection > establishment (when retrieving the QP attributes for QP transitions) or > queried from the QP. > > If an application knows that it could have up to 8 RDMA reads in flight at a > time, it could always use 8 as the value. If the hardware doesn't support 8, > it would cap it to whatever it does support. The client would only get an > error if the underlying hardware had a limit of zero (unable to do RDMA > reads), though I'm not sure that's even allowed by the IB standard. > > -Fab > > >> -----Original Message----- > >> From: Leonid Keller > >> Sent: Thursday, June 23, 2011 3:08 PM > >> To: 'Fab Tillier'; Hefty, Sean > >> Cc: ofw_list > >> Subject: RE: IBAL CM ignores responder resources of the passive side > >> > >> I'd say, it's not "broken" applications, but applications, that do not > >> care what Responder Resources should be. And if an application in a > >> third company works with previous drivers on 1000 computers and now > >> they get new drivers and install them on 1000 computers and suddenly > >> the application fails ... > >> > >> I'm going to add in our code the following lines: > >> > >> ASSERT(p_cm_rep->resp_res); > >> if (p_cm_rep->resp_res) > >> p_cep->resp_res = p_cm_rep->resp_res; > >> You may think of leaving in the OFED code only the last line. > >> > >> > >>> -----Original Message----- > >>> From: Fab Tillier [mailto:[email protected]] > >>> Sent: Thursday, June 23, 2011 2:53 AM > >>> To: Leonid Keller; Hefty, Sean > >>> Cc: ofw_list > >>> Subject: RE: IBAL CM ignores responder resources of the passive side > >>> > >>> I'd just let existing broken apps get fixed and fix it in the CM > >>> properly. That is, do the straight assignment, not the conditional > >>> assignment. > >>> > >>> Cheers, > >>> -Fab > >>> > >>> -----Original Message----- > >>> From: Leonid Keller [mailto:[email protected]] > >>> Sent: Wednesday, June 22, 2011 10:03 AM > >>> To: Hefty, Sean; Fab Tillier > >>> Cc: ofw_list > >>> Subject: RE: IBAL CM ignores responder resources of the passive side > >>> > >>> I'm afraid that just > >>> > >>> p_cep->resp_res = p_cm_rep->resp_res; > >>> > >>> can break some existing applications, which do not fill p_cm_rep- > >>>> resp_res and still work. > >>> Maybe > >>> > >>> if (p_cm_rep->resp_res) > >>> p_cep->resp_res = p_cm_rep->resp_res; > >>> > >>>> -----Original Message----- > >>>> From: Hefty, Sean [mailto:[email protected]] > >>>> Sent: Wednesday, June 22, 2011 7:04 PM > >>>> To: Leonid Keller; Fab Tillier > >>>> Cc: ofw_list > >>>> Subject: RE: IBAL CM ignores responder resources of the passive side > >>>> > >>>> without looking at more of the code, it looks like a bug > >>>> > >>>> > >>>>> __save_user_rep( > >>>>> > >>>>> IN > >>>>> cep_agent_t* const p_port_cep, > >>>>> > >>>>> IN > >>>>> kcep_t* const p_cep, > >>>>> > >>>>> IN const iba_cm_rep* > >>>> const > >>>>> p_cm_rep, > >>>>> > >>>>> IN > >>>>> uint8_t > >>>>> rnr_nak_timeout ) > >>>>> > >>>>> { > >>>>> > >>>>> AL_ENTER( AL_DBG_CM ); > >>>>> > >>>>> > >>>>> p_cep->local_qpn = p_cm_rep->qpn; > >>>>> > >>>>> p_cep->rq_psn = p_cm_rep->starting_psn; > >>>>> > >>>>> p_cep->init_depth = p_cm_rep->init_depth; > >>>>> > >>>>> > >>>> > >>>> > >>>> I think we can just always do the assignment: > >>>> > >>>> p_cep->resp_res = p_cm_rep->resp_res; > >>>> > >>>> here. Is the p_cep->resp_res the value that actually goes into the > >>>> REP? > >>>> > >>>> > >>>>> ci_ca_lock_attr( p_port_cep->h_ca->obj.p_ci_ca ); > >>>>> > >>>>> /* Check the CA's responder resource max and trim > >>> if > >>>>> necessary. */ > >>>>> > >>>>> if( p_port_cep->h_ca->obj.p_ci_ca->p_pnp_attr- > >>>>>> max_qp_resp_res < > >>>> > >>>> > >>>> whine: The max_qp_resp_res isn't going to dynamically change. We > >>>> should use a saved value somewhere that doesn't require serializing > >>>> all CM users for this check. > >>>> > >>>> - Sean > > _______________________________________________ ofw mailing list [email protected] http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
