On 5/21/2013 6:24 PM, Hefty, Sean wrote:
> My first guess is that the server isn't responding to new requests.
> - Sean
This is where we're looking now.
We are now testing with 17 servers and 8 clients per server.
With all RDMA traffic disabled in the test, 100% of the RDMA connections
are established, so at least we know this is not some fundamental issue
with our setup.
After modifying our code to give RDMA connection handling a higher
priority than RDMA traffic (CQ completion handling), we still see many
UNREACHABLE events, but only after quite a few clients have connected
and started pushing traffic (1GB RDMA WRITEs from server to client).
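For reference, a rough sketch of the kind of prioritization described
above (not our actual code). It assumes one file descriptor for
connection events and one for completion notifications; in a real
application these would presumably be the rdma_event_channel fd and the
ibv_comp_channel fd, drained with rdma_get_cm_event() and
ibv_get_cq_event()/ibv_poll_cq(). Here each pending event is simulated
as one byte on a plain fd, and service_once() and CQ_BUDGET are
hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical cap on completions handled per pass, so that a burst of
 * traffic cannot starve connection handling. */
enum { CQ_BUDGET = 16 };

/*
 * One pass of the event loop. Each pending "event" is simulated as one
 * byte readable on the fd. Connection events are handled first and
 * without a budget; completions are handled afterwards, at most
 * CQ_BUDGET per pass. Returns 0 on success, -1 on poll/read error.
 */
static int service_once(int conn_fd, int cq_fd, int timeout_ms,
                        int *conn_handled, int *cq_handled)
{
    struct pollfd fds[2] = {
        { .fd = conn_fd, .events = POLLIN },
        { .fd = cq_fd,   .events = POLLIN },
    };
    char buf[64];
    ssize_t n;

    assert(conn_handled && cq_handled);

    if (poll(fds, 2, timeout_ms) < 0)
        return -1;

    /* Connection events first (real code: rdma_get_cm_event loop). */
    if (fds[0].revents & POLLIN) {
        n = read(conn_fd, buf, sizeof(buf));
        if (n < 0)
            return -1;
        *conn_handled += (int)n;
    }

    /* Then a bounded batch of completions (real code: ibv_poll_cq). */
    if (fds[1].revents & POLLIN) {
        n = read(cq_fd, buf, CQ_BUDGET);
        if (n < 0)
            return -1;
        *cq_handled += (int)n;
    }
    return 0;
}
```

The budget is the design choice that matters here: without it, one pass
could spend an unbounded amount of time on completions before looking
at the connection fd again.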
We are now adding code (via the conn_attr private data) to compare
timestamps between rdma_connect, RDMA_CM_EVENT_CONNECT_REQUEST,
rdma_accept, and the UNREACHABLE or CONNECTED events on the client.
We'll have a better understanding once we see these results.
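A minimal sketch of the timestamp payload we have in mind, under some
assumptions: struct conn_stamps and the helper names are hypothetical,
both ends are assumed to share the same byte order, and the buffer
would be handed to librdmacm through the private_data and
private_data_len fields of struct rdma_conn_param. Since client and
server clocks are not necessarily synchronized, the sketch only
subtracts timestamps taken on the same host.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical payload carried in the connection private data. */
struct conn_stamps {
    uint64_t client_connect_ns; /* client clock: just before rdma_connect */
    uint64_t server_req_ns;     /* server clock: connect request event seen */
    uint64_t server_accept_ns;  /* server clock: just before rdma_accept */
};

/* Monotonic timestamp in nanoseconds on the local host. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Copy the stamps into a private-data buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t stamps_pack(const struct conn_stamps *s, void *buf, size_t len)
{
    if (len < sizeof(*s))
        return 0;
    memcpy(buf, s, sizeof(*s));
    return sizeof(*s);
}

/* Read the stamps back out of received private data.
 * Returns 0 on success, -1 if the data is too short. */
static int stamps_unpack(struct conn_stamps *s, const void *buf, size_t len)
{
    if (len < sizeof(*s))
        return -1;
    memcpy(s, buf, sizeof(*s));
    return 0;
}

/* Time between the connect request event and rdma_accept
 * (server clock only). */
static uint64_t server_handling_ns(const struct conn_stamps *s)
{
    assert(s->server_accept_ns >= s->server_req_ns);
    return s->server_accept_ns - s->server_req_ns;
}

/* Time between rdma_connect and the final client-side event
 * (client clock only). */
static uint64_t client_total_ns(const struct conn_stamps *s,
                                uint64_t client_event_ns)
{
    assert(client_event_ns >= s->client_connect_ns);
    return client_event_ns - s->client_connect_ns;
}
```

The client would fill client_connect_ns from now_ns() before connecting,
and the server would echo it back with its own two stamps in the accept
private data. On a failed attempt the client presumably has no reply to
read, so the server would also need to log its stamps locally.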
thanks,
Alex