>As I see it, the issue here is that from the viewpoint of upper layers
>(TCP, UDP, etc.) the IP service is expected to provide unreliable
>service. Hence layers that do need reliability, such as TCP, add that in
>their protocol, so adding it in the IP layer and below (e.g. IPoIB or the
>HW it uses) is in a way redundant, since the upper layer is not aware of
>that.

IMO, the fact that TCP implements reliability doesn't mean it's
unnecessary in the underlying layers. For example, wireless typically
adds reliability at the link layer because the link itself is so
unreliable. If adding reliability in the underlying layers improves
overall performance, then it makes sense to add it, independent of the
upper level protocol.

Since RC is our 'link layer', overrunning the receiver doesn't just
result in IP resending the packet, but in transitioning the QP into an
error state, cleaning up, re-establishing the connection, and then
resending the packet. This works, just not well, based on what Pradeep
has seen.

>With all that, I am not religiously against adding the retries...
>however, I prefer to understand the original problem, which seems to be
>an issue related to HCA interoperability, before putting the solution in
>the code. We both agree that UC is the way to go, and in that case the
>real problem would pop up again, but higher layers would have to take
>care of it.

I definitely think UC is worth trying, but I would like to see how it
performs against RC. UC doesn't quite have the same issue as RC, since
overrunning the receiver doesn't require tearing down the connection.

- Sean
