>On the other hand, if the remote host is actually down, you will make "retry storms" worse by retrying both at the link layer AND at the TCP layer (each TCP retry resulting in multiple lower-layer retries). This will have an effect on the fabric.
I don't think I would call retrying a send a few more times a storm; it's a point-to-point send. When the remote host drops, the first thing IPoIB will do is try to reconnect, which involves sending CM MADs to the unavailable node in an effort to reestablish the connection anyway. I don't think we try to optimize for the case when systems crash. In any case, I thought the problem was more related to RNR Nacks than simple retries, but that doesn't seem to be the case.

- Sean
