This morning both my InfiniBand and TCP Lustre clients hiccuped. They 
are evicted from the server, presumably as a result of their high load and 
the consequent timeouts. My question is: why don't the clients reconnect? The 
InfiniBand and TCP clients both give the following message when I type "df": 
Cannot send after transport endpoint shutdown (-108). I've been battling with 
this on and off for a few months now. I've upgraded my InfiniBand switch 
firmware, and all the clients and servers are running the latest version of 
Lustre and the Lustre-patched kernel. Any ideas? 

-Aaron 
