Has anyone heard of Lustre having trouble mounting when ECMP is used on the compute nodes' default gateway?
I'm trying to mount an existing Lustre filesystem on a new cluster, where the connections ride over OPA IPoIB, which is then converted to 10GbE via four routers. I'm using ECMP to distribute the packets over the four routers. I can mount Lustre on other Ethernet clients, but not on the ones behind my ECMP gateways. Changing the compute node gateway from ECMP to a single device doesn't change anything. I'm not easily able to revert the network side from ECMP to a single route, so I haven't tried that.

The output I get from mount is "failed: Input/output error retries left: 0". Syslog on the client and on the MGS seems to show that the connection between the MGS and the client is being broken during the mount, with a "timed out for slow reply" message.

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org