Hello all,
I'm new to Lustre and was hoping someone could help me with a few
questions. We're running Lustre version 1.4.7.1 on CentOS, kernel
2.6.9-42. We currently have 4 OSSs, one MDS and 11 compute nodes.
We've had quite a lot of issues lately that required rebooting everything.
Lustre currently is set to be started manually and not as a service due to
dependency issues we've faced. So I guess my first question is:
-What is the proper order in which this type of config should be booted? I
should also mention that we have a "head node", which I believe is nothing
more than a client itself.
When the systems are up, I start the Lustre services and the status shows
that Lustre is running, but I still have to tell Lustre where my XML file is
before it loads (lconf --node client /lustre/config.xml). Why?
And finally, we've had plenty of I/O errors (the storage interconnect is IP
over InfiniBand), which we believe may be due to timeout issues. What are the
"best practices" for setting the ldlm timeouts and the other Lustre timeouts
for such a config? And can I set those permanently in the configuration
rather than having to use sysctl, maybe through lmc or lctl?
Thanks a bunch!
_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss