multipath is a Linux utility that manages the redundant I/O paths between the server and the disk array. It is independent of Lustre and InfiniBand. On our systems each OSS had 2 connections to each storage array it talked to, and usually there was a pair of arrays per OSS pair (except for a rare handful of our systems, which had 1).
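If it helps to script the check, here is a rough sketch rather than anything Terascala shipped. It assumes Python 3.7+ is available on the OSS and that dead paths show up in the `multipath -ll` listing with the words "failed" or "faulty", which is what multipath-tools commonly prints but varies by version. The helper names `down_paths` and `check_local_host` are made up for illustration.

```python
import subprocess

# Keywords multipath-tools commonly prints for a dead path. This is an
# assumption: the exact wording and layout differ between versions.
BAD_WORDS = ("failed", "faulty")


def down_paths(listing):
    """Return the lines of a `multipath -ll` listing that look like dead paths."""
    return [
        line.strip()
        for line in listing.splitlines()
        if any(word in line.lower() for word in BAD_WORDS)
    ]


def check_local_host():
    """Run `multipath -ll` on this host (usually needs root) and filter it."""
    result = subprocess.run(
        ["multipath", "-ll"], capture_output=True, text=True, check=True
    )
    return down_paths(result.stdout)
```

Calling `check_local_host()` on each OSS of the pair and comparing the results should show whether only one side has lost a path. An empty list only means none of the keywords matched, so it is still worth reading the raw listing once by eye.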

-Ben Evans

On 9/20/16, 2:33 PM, "lustre-discuss on behalf of Lewis Hyatt" <[email protected] on behalf of [email protected]> wrote:

>Thanks so much for the information, we will look into this asap.
>Forgive my ignorance, but is multipath here referring to some
>lustre-specific or infiniband-related process? Not familiar with it
>in this context. Thanks again.
>
>-lewis
>
>
>On 9/20/16 2:24 PM, Ben Evans wrote:
>> Lewis,
>>
>> Yes, "Not on preferred path" is something that bubbles up through the TS
>> gui from multipath.
>>
>> A simple thing you can check is running multipath -ll on the OSS (and
>> it's peer) in question and seeing if it reports that one or more path is
>> down. If it's just on one OSS, try running 'multipath -r'. If it doesn't
>> come back and look OK, then it's most likely a cable issue, and you can
>> try re-seating it to see if it helps. It's been a long time since I
>> diagnosed this, though and can't remember the details of how to associate
>> cables with paths, though there should be indicator lights on the back of
>> everything and the path that is down should be red.
>>
>> The high load is probably associated with the cable issue, since you're
>> putting more strain on one path.
>>
>> -Ben Evans
>>
>> On 9/20/16, 12:21 PM, "lustre-discuss on behalf of Lewis Hyatt"
>> <[email protected] on behalf of [email protected]>
>> wrote:
>>
>>> Hello-
>>>
>>> I am having an issue with a lustre 1.8 array that I have little hope
>>> of figuring out on my own, so I thought I would try here to see if
>>> anyone might know what this warning/error means. Our array was built
>>> by Terascala, which no longer exists, so we have no support for it and
>>> little documentation (and not much in-house knowledge). I see this
>>> complaint "Not on preferred path" on the GUI that we have, which I
>>> assume was something custom made by Terascala, and I am not sure even
>>> what path it is referring to; we use infiniband for all connections
>>> and it could relate to this, but not sure. We see this error on 3 of
>>> the 12 OSTs. More specifically, we have 2 OSSs, each handling 6 OSTs,
>>> and all 3 of the "not on optimal path" OSTs are on the same OSS.
>>>
>>> We do not know if it's related, but this same OSS is in a very bad
>>> state, with very high load average (200), very high I/O wait time, and
>>> taking many seconds to respond to each read request, making the array
>>> more or less unusable. That's the problem we are trying to fix.
>>>
>>> I realize there's not much hope for anyone to help us with that given
>>> how little information I am able to provide. But I was hoping someone
>>> out there might know what this "not on optimal path" error means, and
>>> if it matters for anything or not, so we have somewhere to start.
>>> Thanks very much!
>>>
>>> I could provide screen shots of the management GUI we have, if it
>>> would be informative.
>>>
>>> -Lewis
>>> _______________________________________________
>>> lustre-discuss mailing list
>>> [email protected]
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
