> On May 9, 2018, at 10:02 AM, Michael Di Domenico <[email protected]>
> wrote:
>
> On Wed, May 9, 2018 at 9:50 AM, Makia Minich
> <[email protected]> wrote:
>>
>> I have an LNET routing question. I’ve attached a quick diagram of the
>> current setup; but basically I have two core networks (one infiniband and
>> one ethernet) with a set of LNET routers in between. There is storage and
>> clients on both sides of these routers and all clients need to see all/most
>> storage. All connections, configurations, etc are all working.
>>
>> The question is, if an LNET router goes down (which does cause some amount
>> of reconnect or remapping for any clients attempting to use those routes)
>> would this cause any issues or delays for a client’s connection to
>> non-routed storage? Put slightly different, if a job on the ethernet clients
>> is actively using ethernet storage and the lnet routers go down, will job be
>> affected? What about a new job just launching when that lnet router is down?
>
> just for the sake of clarity when you say "routers down" do you mean
> all routers or just one/two?

Thanks for the question; I should have made that clearer. I was thinking of a single router going down (with no fine-grained routing).

I’d also like to know what happens if all routers are down. I understand that you’d see hangs for any mounts that go through the LNET routers, but local-network mounts “should” remain unaffected, right?

_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
