Problem running rping over Intel adapters

Doug Ledford Wed, 03 Oct 2012 14:41:11 -0700

We ran into this problem when testing rping over Intel/QLogic hardware:

[root@rdmaperf3 ~]# rping -s -a 172.31.2.103 -v
wait for CONNECTED state 10
connect error -1
cma event RDMA_CM_EVENT_REJECTED, error 28
[root@rdmaperf3 ~]#



[root@rdmaperf8 ~]# rping -c -a 172.31.2.103 -v -C 5
cma event RDMA_CM_EVENT_CONNECT_ERROR, error -1
wait for CONNECTED state 4
connect error -1
[root@rdmaperf8 ~]#

It turns out this is caused by a couple of things:

1) On the client side, rping clears the conn_param for the connection it
is about to attempt, then sets:

        conn_param.responder_resources = 1;
        conn_param.initiator_depth = 1;
        conn_param.retry_count = 10;

On the accept side, rping clears the conn_param and then sets just
responder_resources and initiator_depth, without even checking the
conn_param values requested on the incoming cm_id.  You can get away
with that in a simple test program, but it is still not "best
programming practices".  However, the important part here is the
retry_count of 10.  That won't work on Intel/QLogic hardware.

2) The qib driver enforces a maximum of 7 for retry_count.  I don't see
anything in the spec that specifies a maximum for this field, and in
particular I know it doesn't define 7 to mean infinite retries the way
it does for rnr_retry_count.
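
To make the two points above concrete, here is a minimal sketch of what a more defensive setup could look like.  It is not rping's actual code: conn_param_sketch is a hypothetical stand-in for the three struct rdma_conn_param fields mentioned above, the ceiling of 7 is simply the value qib enforces, and local_max is an assumed per-device limit that the caller would have to supply.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the three struct rdma_conn_param fields
 * discussed above; this is not the real librdmacm definition. */
struct conn_param_sketch {
    uint8_t responder_resources;
    uint8_t initiator_depth;
    uint8_t retry_count;
};

/* Assumed ceiling: the value the qib driver enforces, per this report. */
#define SKETCH_MAX_RETRY_COUNT 7

/* Return the smaller of two 8-bit values. */
static uint8_t min_u8(uint8_t a, uint8_t b)
{
    return a < b ? a : b;
}

/* Client side: same values rping sets, but with retry_count capped. */
static void fill_client_conn_param(struct conn_param_sketch *p,
                                   uint8_t wanted_retries)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->responder_resources = 1;
    p->initiator_depth = 1;
    p->retry_count = min_u8(wanted_retries, SKETCH_MAX_RETRY_COUNT);
}

/* Accept side: look at what the peer requested and cap it at a local
 * limit instead of overwriting it blindly. */
static void fill_accept_conn_param(struct conn_param_sketch *p,
                                   const struct conn_param_sketch *requested,
                                   uint8_t local_max)
{
    assert(p != NULL && requested != NULL);
    memset(p, 0, sizeof(*p));
    p->responder_resources = min_u8(requested->responder_resources, local_max);
    p->initiator_depth = min_u8(requested->initiator_depth, local_max);
}
```

With this, asking for 10 retries on the client side yields a retry_count of 7, while smaller requests pass through unchanged.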

I don't think the spec really cares how we solve this.  As far as I can
tell it sets no hard limit of 7 on retry_count like the one the qib
driver enforces, but I would assume each driver is free to implement its
own "reasonable, implementation defined" limit in a case like this.

So, do we make qib more liberal in its acceptance of retry_count or do
we fix rping to use a smaller number?  Matters not to me...

-- 
Doug Ledford <[email protected]>
              GPG KeyID: 0E572FDD
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband
