> On May 9, 2018, at 10:02 AM, Michael Di Domenico <[email protected]> 
> wrote:
> 
> On Wed, May 9, 2018 at 9:50 AM, Makia Minich
> <[email protected]> wrote:
>> 
>> I have an LNET routing question. I’ve attached a quick diagram of the 
>> current setup; but basically I have two core networks (one infiniband and 
>> one ethernet) with a set of LNET routers in between. There are storage and 
>> clients on both sides of these routers and all clients need to see all/most 
>> storage. All connections, configurations, etc. are working.
>> 
>> The question is, if an LNET router goes down (which does cause some amount 
>> of reconnect or remapping for any clients attempting to use those routes) 
>> would this cause any issues or delays for a client’s connection to 
>> non-routed storage? Put slightly differently, if a job on the ethernet clients 
>> is actively using ethernet storage and the lnet routers go down, will the job be 
>> affected? What about a new job just launching when that lnet router is down?
> 
> just for the sake of clarity when you say "routers down" do you mean
> all routers or just one/two?

Thanks for the question; I should have made that clearer. For this question, I 
was thinking of a single router going down (and no fine-grained routing). I’d also 
ask what would happen if all routers are down: understood that you’d see hangs for 
any mounts that are LNET-router based, but local-network mounts “should” remain 
unaffected, right?
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org