On Wed, May 9, 2018 at 12:07 PM, Makia Minich <[email protected]> wrote:
>> On May 9, 2018, at 10:02 AM, Michael Di Domenico <[email protected]>
>> wrote:
>>
>> On Wed, May 9, 2018 at 9:50 AM, Makia Minich
>> <[email protected]> wrote:
>>>
>>> I have an LNET routing question. I’ve attached a quick diagram of the
>>> current setup; but basically I have two core networks (one infiniband and
>>> one ethernet) with a set of LNET routers in between. There is storage and
>>> clients on both sides of these routers and all clients need to see all/most
>>> storage. All connections, configurations, etc are all working.
>>>
>>> The question is, if an LNET router goes down (which does cause some amount
>>> of reconnect or remapping for any clients attempting to use those routes)
>>> would this cause any issues or delays for a client’s connection to
>>> non-routed storage? Put slightly different, if a job on the ethernet
>>> clients is actively using ethernet storage and the lnet routers go down,
>>> will job be affected? What about a new job just launching when that lnet
>>> router is down?
>>
>> just for the sake of clarity when you say "routers down" do you mean
>> all routers or just one/two?
>
> Thanks for the question, I should have made that clearer. For this question,
> I was thinking a single router (and no fine-grained routing). I’d also
> question what would happen if all routers are down: understood that you’d see
> hangs for any mounts that are LNET-router based, but local-network mounts
> “should” remain unaffected, right?

Unfortunately I can't offer any definitive advice, but my theory would be the
same as yours. Losing one or several routers just reduces the bandwidth
between the client and the routed storage. Losing all the routers would cut
connectivity to the remote storage from the affected clients. I would not
expect that loss to prevent communication with the other (non-routed) storage
servers.
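
To make that expectation concrete, here is a minimal Python sketch of the reasoning, not of LNET itself. Everything in it is an assumption for illustration: the `Target` class, the network labels "tcp0" and "o2ib0", and the idea that routed bandwidth scales evenly with the number of live routers. It models what I would expect to happen, not behaviour I have verified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A storage target and the network it sits on (labels are hypothetical)."""
    name: str
    network: str


def reachable(client_network: str, target: Target, live_routers: int) -> bool:
    """Toy model of the expectation in this thread.

    A target on the client's own network needs no router. A target on the
    other network needs at least one live router.
    """
    if target.network == client_network:
        return True
    return live_routers > 0


def routed_bandwidth_fraction(live_routers: int, total_routers: int) -> float:
    """Fraction of routed bandwidth left, ASSUMING identical routers and
    traffic spread evenly across them (no fine-grained routing)."""
    if total_routers <= 0:
        raise ValueError("total_routers must be positive")
    if not 0 <= live_routers <= total_routers:
        raise ValueError("live_routers must be between 0 and total_routers")
    return live_routers / total_routers


if __name__ == "__main__":
    eth_storage = Target("eth-storage", "tcp0")
    ib_storage = Target("ib-storage", "o2ib0")
    for live in (4, 3, 0):
        print(
            f"live routers={live}: "
            f"local ok={reachable('tcp0', eth_storage, live)}, "
            f"remote ok={reachable('tcp0', ib_storage, live)}, "
            f"routed bandwidth={routed_bandwidth_fraction(live, 4):.2f}"
        )
```

In this model an ethernet client keeps its ethernet storage no matter how many routers are up, and only the infiniband storage degrades and then disappears. Whether a real client sees delays on local mounts while it retries the dead routes is exactly the open question, and the sketch does not answer it.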
