Hi, 

The subject line probably doesn't quite cover the issue. What we see is this:
We have a client behind a router (linking TCP to Omni-Path) that shows one OST
as inactive (everything is on 2.15.5).
Other clients that go through the router do not have this issue.

Another client had the same issue, although it showed a different OST as
inactive. After a reboot, all was well again on that machine.

The clients can lctl ping the OSSs. 

So although we have a workaround (reboot the client), it would be nice to: 

1. Fix the issue on the client without a reboot
2. Find and fix the underlying cause

It might be unrelated, but we also see another routing issue every now and
then:
The router stops routing requests toward a particular OSS, and this can be
fixed by deleting the peer NID of that OSS from the router.

I am probably leaving out informative logs, but I am more than happy to
generate them if somebody can point me to how.
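
As a starting point for collecting that information, here is a minimal sketch of how the inactive devices could be picked out of a client's device list. It assumes that `lctl dl` prints one device per line as index, state, type, name, UUID and refcount, and that anything other than `UP` in the state column is worth reporting. The helper names `inactive_devices` and `read_device_list` are made up for this example.

```python
import subprocess


def inactive_devices(dl_output):
    """Return (state, type, name) for each listed device whose state is not UP."""
    found = []
    for line in dl_output.splitlines():
        fields = line.split()
        # Assumed column layout: index, state, type, name, uuid, refcount.
        # Lines that do not start with a numeric index are skipped.
        if len(fields) < 4 or not fields[0].isdigit():
            continue
        state, dev_type, name = fields[1], fields[2], fields[3]
        if state != "UP":
            found.append((state, dev_type, name))
    return found


def read_device_list():
    """Run `lctl dl` on the local node and return its output as text."""
    result = subprocess.run(
        ["lctl", "dl"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On an affected client, `inactive_devices(read_device_list())` would list the devices to look at. It needs enough privileges to run `lctl`, and the column layout is an assumption that should be checked against the local output first.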

We are a bit stumped right now. 

With kind regards, 

-- 
Jan van Haarst 
HPC Administrator 
For Anunna/HPC questions, please use https://support.wur.nl (with HPC as service)
Available: Monday, Tuesday, Thursday & Friday
Facilities and Services, part of Wageningen University & Research
Information Technology Department
P.O. Box 59, 6700 AB, Wageningen
Building 116, Akkermaalsbos 12, 6700 WB, Wageningen
http://www.wur.nl/nl/Disclaimer.htm




_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
