Re: [lustre-discuss] "Not on preferred path" error

Joe Landman Tue, 20 Sep 2016 10:51:18 -0700

On 09/20/2016 01:39 PM, Lewis Hyatt wrote:

> Thanks very much for the suggestions. dmesg output is here:
> http://pastebin.com/jCafCZiZ
> We don't see any disk-related stuff there, and also our GUI shows all
> the RAID arrays as being fine.

Hmmm ... I rarely trust GUIs for RAID. Do you have underlying CLI tools you can do a sanity check with?
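As one example of such a CLI sanity check, and only on the assumption that the arrays are Linux software RAID (md), which the thread does not state, the kernel's own view is in /proc/mdstat. A minimal sketch; the function name md_degraded is made up for illustration, and a hardware RAID controller would need its vendor's CLI instead:

```shell
# Assumption: Linux md (software) RAID. Hardware controllers do not
# appear in /proc/mdstat and need the vendor's own CLI.
# md_degraded is a hypothetical helper name, not a standard tool.
# A member-status field such as [U_] marks a missing or failed member;
# [UU] means all members are up.
md_degraded() {
    awk '/^md/ { name = $1 }
         /\[[U_]*_[U_]*\]/ { print name ": degraded " $NF; bad = 1 }
         END { exit !bad }' "${1:-/proc/mdstat}"
}

if [ -r /proc/mdstat ]; then
    md_degraded || echo "no degraded md arrays found"
fi
```

The function exits 0 only when at least one array reports a missing member, so it can also be used in a cron job or monitoring check.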

> If anything in there jumps out at you, I'd really appreciate your
> thoughts! We are almost certainly going to reboot the affected OSS later
> today to see how that goes.

Not seeing anything leap out other than two particular targets, twlstr-OST000b and twlstr-OST0006, which appear to be "slow". This appears to be what is causing the client evictions, lock bits, etc.

The question is, why are these two OSTs slow? What is the underlying RAID, how many operations are queued up, etc.?
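For a quick look at how many operations are queued up, one option is to read the in-flight counter the kernel keeps for each block device. This is a rough sketch rather than a full diagnosis; the function name inflight is made up for illustration, and mapping device names back to the two OSTs is left to you:

```shell
# inflight is a hypothetical helper name.
# In /proc/diskstats, column 3 is the device name and column 12 is the
# number of I/Os currently in progress on that device.
# Prints "device count" for every device with outstanding I/O,
# busiest first.
inflight() {
    awk '$12 > 0 { print $3, $12 }' "${1:-/proc/diskstats}" | sort -k2,2nr
}

if [ -r /proc/diskstats ]; then
    inflight
fi
```

Running it a few times in a row (for example under watch) shows whether the devices behind the slow OSTs consistently carry a deeper queue than the others.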

A tool we recommend for (nearly instantaneous) holistic views of a system is glances, which you can install via pip:


        pip install glances

then run it as

        glances -t 1

to get a second-by-second view of your system. Dstat is also good.

Dumb question ... what does

        swapon -s

report? I am assuming you aren't swapping (and don't have swap enabled on the system), but it never hurts to ask.
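If you want the same check in a scriptable form, the table that swapon -s summarizes lives in /proc/swaps. A small sketch; the function name swap_active is made up for illustration:

```shell
# swap_active is a hypothetical helper name.
# /proc/swaps starts with a header line; every line after it is an
# active swap area. Exit status is 0 if at least one area is active,
# 1 otherwise.
swap_active() {
    awk 'NR > 1 { n++ } END { exit (n > 0 ? 0 : 1) }' "${1:-/proc/swaps}"
}

if [ -r /proc/swaps ]; then
    if swap_active; then
        echo "swap is enabled:"
        cat /proc/swaps
    else
        echo "no swap areas active"
    fi
fi
```

If swap does turn out to be enabled, the si and so columns of vmstat 1 show whether the box is actually paging in and out.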


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
e: [email protected]
w: http://scalableinformatics.com
t: @scalableinfo
p: +1 734 786 8423 x121
c: +1 734 612 4615
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org