Nice... thanks for the additional info! I do try to RTFM when I have the time :)

Merry Christmas to you & the other list members!

Will




-----Original Message-----
From: Benjamin Redling [benjamin.ra...@uni-jena.de]
Sent: Saturday, December 24, 2016 05:40 PM Eastern Standard Time
To: slurm-dev
Subject: [slurm-dev] Re: Error showing in slurmd daemon startup



Hi Will,

On 24.12.2016 at 21:10, Will Dennis wrote:
> Thanks for helping to interpret the error message… Clear enough to me now.

You're welcome! I was a bit brief because I was writing on my mobile.

> I was told (by one of my researchers) that setting "FastSchedule=0" would 
> "tell Slurm to get the hardware info from the node instead of from 
> slurm.conf". A read of the relevant section in 
> https://slurm.schedmd.com/slurm.conf.html shows me it’s a bit more nuanced 
> than that ;)

Good to hear you looked it up yourself and didn't just /consume/ my
answer -- I like that spirit!
I hope you'll pick up more and more that way over time, and might one
day help me or someone else too :)


> My node configs in slurm.conf are currently very simple:
>
> NodeName=host01 CPUs=12 State=UNKNOWN
> NodeName=host02 CPUs=12 State=UNKNOWN
> NodeName=host03 CPUs=12 State=UNKNOWN
> NodeName=host04 CPUs=12 State=UNKNOWN
>
> So maybe I could rewrite them as:
>
> NodeName=host01,host02,host03,host04 CPUs=12 SocketsPerBoard=2 
> CoresPerSocket=6 ThreadsPerCore=1

Yes, there's no need to list hosts with identical parameters on separate
lines.

And even that can be condensed -- if you like:
NodeName=host0[1-4] ...

See https://slurm.schedmd.com/slurm.conf.html
(Easy to miss, because it comes before the NodeName entry...)
--- %< ---
Multiple node names may be comma separated (e.g. "alpha,beta,gamma")
and/or a simple node range expression may optionally be used to specify
numeric ranges of nodes to avoid building a configuration file with
large numbers of entries. The node range expression can contain one pair
of square brackets with a sequence of comma separated numbers and/or
ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or
"lx[15,18,32-33]"). Note that the numeric ranges can include one or more
leading zeros to indicate the numeric portion has a fixed number of
digits (e.g. "linux[0000-1023]"). Up to two numeric ranges can be
included in the expression (e.g. "rack[0-63]_blade[0-41]"). If one or
more numeric expressions are included, one of them must be at the end of
the name (e.g. "unit[0-31]rack" is invalid), but arbitrary names can
always be used in a comma separated list.
--- %< ---
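
To make the range syntax quoted above concrete, here is a minimal Python
sketch of how such an expression expands into individual node names. It is
only an illustration of the documented rules, not Slurm's own parser: the
function names (expand_bracket, split_top_level, expand_nodes) are made up
for this example, and it does no validation (e.g. it will not reject
"unit[0-31]rack").

```python
import itertools
import re


def expand_bracket(body):
    """Expand the inside of one [...] pair, e.g. "15,18,32-33"."""
    items = []
    for part in body.split(","):
        if "-" in part:
            lo, hi = part.split("-", 1)
            # Leading zeros on the lower bound fix the number of digits.
            width = len(lo)
            items.extend(f"{n:0{width}d}" for n in range(int(lo), int(hi) + 1))
        else:
            items.append(part)
    return items


def split_top_level(expr):
    """Split on commas that are not inside square brackets."""
    names, current, depth = [], "", 0
    for ch in expr:
        if ch == "," and depth == 0:
            names.append(current)
            current = ""
            continue
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        current += ch
    names.append(current)
    return names


def expand_nodes(expr):
    """Expand a node range expression into a list of node names."""
    hosts = []
    for name in split_top_level(expr):
        # re.split with a capture group alternates literal text (even
        # indices) and bracket contents (odd indices).
        pieces = re.split(r"\[([^\]]*)\]", name)
        choices = [
            expand_bracket(piece) if i % 2 else [piece]
            for i, piece in enumerate(pieces)
        ]
        hosts.extend("".join(combo) for combo in itertools.product(*choices))
    return hosts


print(expand_nodes("host0[1-4]"))
# ['host01', 'host02', 'host03', 'host04']
```

With this sketch, "host0[1-4]" and "host01,host02,host03,host04" expand to
the same four names, which is why the two NodeName spellings are
interchangeable.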

And the hidden gem from the Slurm download section:
https://www.nsc.liu.se/~kent/python-hostlist/

Merry Christmas!
Benjamin
--
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
