Sean Hefty wrote:
>>Moreover, the solution is one of:
>>- the patch you sent
>>- enforcing the ULP to call rdma_establish (or cm_establish for direct
>>CM consumers) else a repeatedly lost RTU case is not handled.
>
> or both, or we do nothing and let the connection fail

After implementing rdma_establish(), the solutions that I see are:

* Dispatch COMM_EST to the IB CM.  This is transparent to the users.  A
  user cannot send a reply until they are told that the connection has
  been established.

* Provide rdma_establish().  As it turns out, clients may still need to
  wait until they are told that the connection has been established.
  Before sends can be posted to the QP, it must be transitioned to RTS,
  which may sleep.

* Transition the QP to RTS before sending the REP.  This may be a slight
  spec violation.  COMM_EST events are not generated.  Users can reply to
  messages immediately.  A lost RTU will result in tearing down the
  connection.  A user could disconnect the connection before it's seen
  as established by the IB CM, which isn't handled currently.

* Combine the previous two solutions.  rdma_establish() would set the
  connection state, but the QP is already in RTS.

The "best" solution is debatable, but I'm leaning towards the last option.

Also note that in all cases there's a race where the IB CM can time out a
connection at the same time that a message shows up at the receive queue.

- Sean
