Re: [lustre-discuss] "Not on preferred path" error
That's the way multipath is showing it, yes. However, back in the 1.8 days we used LSI's proprietary multipathing kernel modules, called MPP. MPP presented both paths to the device driver layer as a single device, so the multipath view would show only a single path. I no longer have any of my notes on this; I don't know if there are any old-school LSI/NetApp/Engenio people on here who would have a better chance of diagnosing this sort of thing.

-Ben Evans

On 9/21/16, 1:37 PM, "Tao, Zhiqi" <zhiqi@intel.com> wrote:

>It appears that there is only one SAS path to the back-end storage, which
>explains why some of the LUNs show as not being on the preferred path.
>
>Typically we recommend having two SAS connections from each OSS to the
>storage: one to the upper controller and one to the lower controller.
>The LUNs are then distributed between the two controllers. In the event
>of a SAS connection failure, all LUNs fail over to one controller. The
>LUNs that used to go through the other controller then report that they
>are not on the preferred path. Since this kind of failover happens at
>the multipath layer, it is transparent to Lustre, and the file system
>continues to run, as you observed.
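One way to tell whether the second SAS path is visible to Linux at all (below the multipath layer) is to group the raw sd devices by their SCSI WWID: if each WWID resolves to only one sd device, the kernel itself is seeing only one path per LUN, regardless of how multipath is configured. A minimal sketch, assuming a RHEL5-era scsi_id (the "-s /block/sdX" form is specific to older udev; newer systems use "scsi_id -g -u /dev/sdX") and device names matching the output quoted above:

    # list every SCSI device the kernel knows about (host, channel, id, lun)
    cat /proc/scsi/scsi

    # group block devices by SCSI WWID; two sd entries per WWID means two paths per LUN
    for d in /sys/block/sd[b-z]; do
        dev=$(basename "$d")
        wwid=$(/sbin/scsi_id -g -u -s /block/"$dev" 2>/dev/null)
        echo "$wwid $dev"
    done | sort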
Re: [lustre-discuss] "Not on preferred path" error
It appears that there is only one SAS path to the back-end storage, which explains why some of the LUNs show as not being on the preferred path.

Typically we recommend having two SAS connections from each OSS to the storage: one to the upper controller and one to the lower controller. The LUNs are then distributed between the two controllers. In the event of a SAS connection failure, all LUNs fail over to one controller. The LUNs that used to go through the other controller then report that they are not on the preferred path. Since this kind of failover happens at the multipath layer, it is transparent to Lustre, and the file system continues to run, as you observed.

Best Regards,
Zhiqi
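For reference, on a dual-attached LSI/Engenio array the "preferred path" notion usually corresponds to an RDAC-style stanza in /etc/multipath.conf that groups the two paths by controller priority, roughly as sketched below. This is only an illustration: the exact vendor/product strings, the prio callout, and whether the rdac handler is what this 1.8-era Terascala build actually uses are assumptions that would need to be checked against the working OSS's configuration.

    # /etc/multipath.conf (illustrative fragment only)
    devices {
        device {
            vendor                 "LSI"
            product                "VirtualDisk"
            # group the two paths by controller priority; I/O goes to the
            # higher-priority (preferred/owning) controller first
            path_grouping_policy   group_by_prio
            prio_callout           "/sbin/mpath_prio_rdac /dev/%n"
            hardware_handler       "1 rdac"
            path_checker           rdac
            # return to the preferred controller automatically when it comes back
            failback               immediate
        }
    }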
Re: [lustre-discuss] "Not on preferred path" error
I see, thanks. This is what we see from running the multipath commands... I don't see anything that means anything to me, but FWIW it looks the same as on our other OSS that is working OK.

$multipath -ll
map03 (360080e50002ee510023f50092c6c) dm-13 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:1:3 sdk 8:160 [active][ready]
map02 (360080e50002ee410024250092c11) dm-12 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:1:2 sdj 8:144 [active][ready]
map01 (360080e50002ee510023b50092c4c) dm-11 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:1:1 sdi 8:128 [active][ready]
map00 (360080e50002ee410023e50092bf2) dm-10 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:1:0 sdh 8:112 [active][ready]
map09 (360080e50002ee4dc02f250092c62) dm-7 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:0:3 sde 8:64 [active][ready]
map11 (360080e50002ee4dc02f650092c84) dm-9 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:0:5 sdg 8:96 [active][ready]
map08 (360080e50002ec89002e550092a07) dm-6 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:0:2 sdd 8:48 [active][ready]
map10 (360080e50002ec89002e950092a27) dm-8 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:0:4 sdf 8:80 [active][ready]
map07 (360080e50002ee4dc02ee50092c44) dm-5 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:0:1 sdc 8:32 [active][ready]
map06 (360080e50002ec89002e1500929e9) dm-4 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
 \_ 3:0:0:0 sdb 8:16 [active][ready]
map05 (360080e50002ee510024350092c8c) dm-15 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:1:5 sdm 8:192 [active][ready]
map04 (360080e50002ee410024650092c31) dm-14 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
 \_ 3:0:1:4 sdl 8:176 [active][ready]

===

$multipath -r
reload: map06 (360080e50002ec89002e1500929e9) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:0:0 sdb 8:16 [active][ready]
reload: map07 (360080e50002ee4dc02ee50092c44) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:0:1 sdc 8:32 [active][ready]
reload: map08 (360080e50002ec89002e550092a07) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:0:2 sdd 8:48 [active][ready]
reload: map09 (360080e50002ee4dc02f250092c62) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:0:3 sde 8:64 [active][ready]
reload: map10 (360080e50002ec89002e950092a27) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:0:4 sdf 8:80 [active][ready]
reload: map11 (360080e50002ee4dc02f650092c84) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:0:5 sdg 8:96 [active][ready]
reload: map00 (360080e50002ee410023e50092bf2) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:1:0 sdh 8:112 [active][ready]
reload: map01 (360080e50002ee510023b50092c4c) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:1:1 sdi 8:128 [active][ready]
reload: map02 (360080e50002ee410024250092c11) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:1:2 sdj 8:144 [active][ready]
reload: map03 (360080e50002ee510023f50092c6c) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:1:3 sdk 8:160 [active][ready]
reload: map04 (360080e50002ee410024650092c31) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:1:4 sdl 8:176 [active][ready]
reload: map05 (360080e50002ee510024350092c8c) LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
 \_ 3:0:1:5 sdm 8:192 [active][ready]

Thanks again for the assistance all, I really appreciate it.

-lewis
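In that listing every map has exactly one path, and the paths split between SCSI targets 3:0:0:* and 3:0:1:* (presumably the two controller ports), which matches Zhiqi's observation: whichever LUNs the array considers owned by the other controller will show as not on the preferred path. A quick way to tally this, assuming this exact output format (the grep and awk field patterns are tied to it and are not guaranteed across multipath versions):

    # total path lines across all maps; with dual cabling you'd expect 2 x 12 = 24 here, not 12
    multipath -ll | grep -c '[0-9]:[0-9]:[0-9]:[0-9]'

    # tally paths by host:channel:target to see how the LUNs split across the two controller ports
    multipath -ll | awk '$2 ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/ {split($2, a, ":"); print a[1] ":" a[2] ":" a[3]}' | sort | uniq -c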
Re: [lustre-discuss] "Not on preferred path" error
multipath is a Linux utility which handles communications from the server to the disk array. It is independent of Lustre and Infiniband.

For OSSes, each OSS had two connections to each storage array it communicated with; usually there was a pair of arrays per OSS pair (except for a rare handful of our systems, which had one).

-Ben Evans

On 9/20/16, 2:33 PM, "lustre-discuss on behalf of Lewis Hyatt" wrote:

>Thanks so much for the information, we will look into this asap.
>Forgive my ignorance, but is multipath here referring to some
>lustre-specific or infiniband-related process? Not familiar with it in
>this context. Thanks again.
>
>-lewis
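If the HBA driver on the OSS uses the Linux SAS transport class (an assumption; it depends on which controller these Terascala boxes ship), the state of each physical SAS link can also be read straight from sysfs, which helps distinguish a dead cable or port from a multipath configuration issue:

    # one entry per SAS phy on the HBA; a healthy, cabled phy reports a real link rate
    for phy in /sys/class/sas_phy/*; do
        printf '%s: ' "$(basename "$phy")"
        cat "$phy/negotiated_linkrate"   # e.g. "3.0 Gbit" when up, "Unknown" or "Phy disabled" when down
    done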
Re: [lustre-discuss] "Not on preferred path" error
Thanks so much for the information, we will look into this ASAP. Forgive my ignorance, but is multipath here referring to some Lustre-specific or InfiniBand-related process? I'm not familiar with it in this context. Thanks again.

-lewis

On 9/20/16 2:24 PM, Ben Evans wrote:
> Lewis,
>
> Yes, "Not on preferred path" is something that bubbles up through the TS
> GUI from multipath.
>
> A simple thing you can check is running multipath -ll on the OSS (and its
> peer) in question and seeing if it reports that one or more paths are down.
> If it's just on one OSS, try running 'multipath -r'. If it doesn't come
> back and look OK, then it's most likely a cable issue, and you can try
> re-seating it to see if that helps. It's been a long time since I diagnosed
> this, though, and I can't remember the details of how to associate cables
> with paths; there should be indicator lights on the back of everything,
> and the path that is down should be red.
>
> The high load is probably associated with the cable issue, since you're
> putting more strain on one path.
>
> -Ben Evans
>
> On 9/20/16, 12:21 PM, "lustre-discuss on behalf of Lewis Hyatt" wrote:
>
>> Hello-
>>
>> I am having an issue with a Lustre 1.8 array that I have little hope
>> of figuring out on my own, so I thought I would try here to see if
>> anyone might know what this warning/error means. Our array was built
>> by Terascala, which no longer exists, so we have no support for it and
>> little documentation (and not much in-house knowledge). I see this
>> complaint "Not on preferred path" on the GUI that we have, which I
>> assume was something custom made by Terascala, and I am not sure even
>> what path it is referring to; we use InfiniBand for all connections
>> and it could relate to this, but not sure. We see this error on 3 of
>> the 12 OSTs. More specifically, we have 2 OSSs, each handling 6 OSTs,
>> and all 3 of the "not on optimal path" OSTs are on the same OSS.
>>
>> We do not know if it's related, but this same OSS is in a very bad
>> state, with very high load average (200), very high I/O wait time, and
>> taking many seconds to respond to each read request, making the array
>> more or less unusable. That's the problem we are trying to fix.
>>
>> I realize there's not much hope for anyone to help us with that given
>> how little information I am able to provide. But I was hoping someone
>> out there might know what this "not on optimal path" error means, and
>> whether it matters for anything or not, so we have somewhere to start.
>> Thanks very much!
>>
>> I could provide screen shots of the management GUI we have, if it
>> would be informative.
>>
>> -Lewis
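Ben's checks above boil down to a short sequence that can be run on the suspect OSS (and its failover peer). A rough sketch; the log location assumes a RHEL-style /var/log/messages:

    multipath -ll      # does any map show a failed/faulty path, or fewer paths than the healthy OSS shows?
    multipath -r       # rebuild the maps and see whether a missing path comes back
    grep -i 'multipath\|path\|sd[b-z]' /var/log/messages | tail -50    # recent SCSI/path failure messages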
Re: [lustre-discuss] "Not on preferred path" error
On 09/20/2016 01:39 PM, Lewis Hyatt wrote:
> Thanks very much for the suggestions. dmesg output is here:
> http://pastebin.com/jCafCZiZ We don't see any disk-related stuff there,
> and also our GUI shows all the RAID arrays as being fine.

Hmmm ... I rarely trust GUIs for RAID. Do you have underlying CLI tools you can do a sanity check with?

> If anything in there jumps out at you, I'd really appreciate your
> thoughts! We are almost certainly going to reboot the affected OSS later
> today to see how that goes.

Not seeing anything leap out, other than that two particular targets, twlstr-OST000b and twlstr-OST0006, appear to be "slow". This appears to be what is causing the client evictions, lock bits, etc. The question is: why are these two OSTs slow? What is the underlying RAID, how many operations are queued up, etc.?

A tool we recommend for (nearly instantaneous) holistic views of a system is glances, which you can install via pip:

    pip install glances

then run as:

    glances -t 1

to get a second-by-second view of your system. dstat is also good.

Dumb question ... what does swapon -s report? I am assuming you aren't swapping (and don't have swap enabled on the system), but it never hurts to ask.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: land...@scalableinformatics.com
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615
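To see whether the load is concentrated on the devices backing the two "slow" OSTs, per-device queue and wait figures from sysstat are useful alongside glances/dstat. A sketch, assuming sysstat's iostat is installed and that the dm-* names correspond to the multipath maps shown earlier in the thread:

    # extended per-device stats every second: watch avgqu-sz, await and %util for the dm-* devices
    iostat -xk 1 5

    # map a dm-N name back to its multipath map: the dm minor number is the N in dm-N
    dmsetup ls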
Re: [lustre-discuss] "Not on preferred path" error
Thanks very much for the suggestions. dmesg output is here: http://pastebin.com/jCafCZiZ

We don't see any disk-related stuff there, and also our GUI shows all the RAID arrays as being fine. If anything in there jumps out at you, I'd really appreciate your thoughts! We are almost certainly going to reboot the affected OSS later today to see how that goes.

We're a fairly small team (12 people or so), so I have a good feel for what everyone is doing, and they should not be abusing it too badly... We did recently ask people to delete small files they may have; do you think deletion of a lot of small files could trigger such issues?

Thanks again!

-lewis
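One way to check whether a burst of deletions (or other metadata-heavy activity) is hitting the suspect OSTs is to look at the per-OST operation counters on the OSS. A sketch; lctl get_param should be available on Lustre 1.8, but the exact parameter paths can vary between versions, so treat the names as something to verify on the running system:

    # per-OST operation counters; "destroy" counts object deletions resulting from file unlinks
    lctl get_param obdfilter.*.stats

    # sample twice, a minute apart, and compare the destroy/create/write counts
    lctl get_param obdfilter.*.stats > /tmp/ost_stats.1
    sleep 60
    lctl get_param obdfilter.*.stats > /tmp/ost_stats.2
    diff /tmp/ost_stats.1 /tmp/ost_stats.2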
Re: [lustre-discuss] "Not on preferred path" error
Stabbing in the dark, but this sounds like a multipath problem. Perhaps you have two or more paths to the storage, and one or more of them is down for some reason: perhaps the hardware itself, perhaps a cable is pulled. You could look for LEDs in a bad state.

I always find it instructive to reboot such a system and watch what comes up on the console during startup.

bob
Re: [lustre-discuss] "Not on preferred path" error
On 09/20/2016 12:21 PM, Lewis Hyatt wrote:
> We do not know if it's related, but this same OSS is in a very bad
> state, with very high load average (200), very high I/O wait time, and
> taking many seconds to respond to each read request, making the array
> more or less unusable. That's the problem we are trying to fix.

This sounds like a storage system failure. Queuing up of IOs to drive the load to 200 usually means something is broken elsewhere in the stack, at a lower level. Not always ... sometimes you have users who like to write several million/billion small (< 100 byte) files.

What does dmesg report? Try to do a pastebin/gist of it, and point it to the list.

Things that come to mind are:

a) an offlined RAID (most likely): this would explain the user load, and all sorts of strange messages about block devices and file systems in the logs

b) a user DoS against the storage: usually someone writing many tiny files.

There are other possibilities, but these seem more likely.

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: land...@scalableinformatics.com
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615
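For the "offlined RAID" possibility, the quickest host-side checks are the kernel log and syslog; the array itself would be checked with whatever vendor CLI ships with these LSI/Engenio controllers (SMcli on SANtricity-era gear), whose exact syntax is outside what's shown in this thread. A rough sketch of the host-side part:

    # kernel-side: recent SCSI/block errors around the OST devices
    dmesg | egrep -i 'i/o error|scsi error|offline|medium error|sd[b-m]' | tail -40

    # syslog: anything RAID- or device-failure-related
    egrep -i 'raid|degraded|fail' /var/log/messages | tail -40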