That's what I have done for now. I'm just a little OCD about how the conf file 
looks and don't care for 8 lines' worth of wraparound. 
Management is done at the PXE boot/kickstart level and with yum, so I can 
dynamically install the bits needed for the various hardware differences (e.g. 
GPUs, MIC cards, InfiniBand).

Brian Andrus

-----Original Message-----
From: Benjamin Redling [mailto:[email protected]] 
Sent: Wednesday, January 20, 2016 2:00 AM
To: slurm-dev <[email protected]>
Subject: [slurm-dev] Re: NodeName and PartitionName format in slurm.conf


On 19.01.2016 at 20:37, Andrus, Brian Contractor wrote:
> I am testing Slurm to replace our Torque/Moab setup here.
>
> The issue I have is trying to put all our node names into the NodeName 
> and PartitionName entries.
> In our cluster, we name our nodes compute-<rack>-<row>. That seems to 
> be problem enough given how ranges work in Slurm, but it is 
> compounded by the fact that the nodes were racked with 1U of 
> space in between.
> So I have compute-1-[1,3,5,7,9,11...41]

Why not simply use a comma-separated list _generated_ from your inventory, 
DNS, /etc/hosts, etc.?
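
A minimal sketch of that idea in Python, assuming the inventory is 
/etc/hosts-style text and the compute nodes share a "compute-" prefix. The 
function name and the sample addresses are made up for illustration:

```python
import re


def compute_hosts(hosts_text, prefix="compute-"):
    """Collect host names starting with `prefix` from /etc/hosts-style text."""
    names = set()
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]        # drop trailing comments
        fields = line.split()
        for name in fields[1:]:             # first field is the IP address
            if name.startswith(prefix):
                names.add(name)

    def natural_key(name):
        # re.split with a capture group alternates text and digit runs,
        # so odd positions are always the numeric parts.
        parts = re.split(r"(\d+)", name)
        return [int(p) if i % 2 else p for i, p in enumerate(parts)]

    # Natural sort so compute-1-3 comes before compute-1-11.
    return sorted(names, key=natural_key)


# Hypothetical sample data, not taken from a real cluster.
SAMPLE = """
10.0.1.1   compute-1-1
10.0.1.11  compute-1-11
10.0.1.3   compute-1-3   # 1U gap below
10.0.0.1   head
"""

line = "NodeName=" + ",".join(compute_hosts(SAMPLE))
print(line)
```

The resulting line can be pasted into slurm.conf or written there by whatever 
tool already templates the file.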

When you have outliers (2U, 4U -- do they have more resources too!?) it would 
make sense to group/partition by resources anyway.
What are you using to manage inventory? Most configuration management and 
provisioning tools I know of give you what you need for this -- have a look 
at Puppet Labs' Facter (or alternatives).

http://slurm.schedmd.com/slurm.conf.html
<quote>
Multiple node names may be comma separated (e.g. "alpha,beta,gamma") and/or a 
simple node range expression may optionally be used to specify numeric ranges 
of nodes to avoid building a configuration file with large numbers of entries. 
The node range expression can contain one pair of square brackets with a 
sequence of comma separated numbers and/or ranges of numbers separated by a "-" 
(e.g. "linux[0-64,128]", or "lx[15,18,32-33]"). Note that the numeric ranges 
can include one or more leading zeros to indicate the numeric portion has a 
fixed number of digits (e.g. "linux[0000-1023]"). Up to two numeric ranges can 
be included in the expression (e.g. "rack[0-63]_blade[0-41]"). If one or more 
numeric expressions are included, one of them must be at the end of the name 
(e.g. "unit[0-31]rack" is invalid), but arbitrary names can always be used in a 
comma separated list.
</quote>
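
If the long comma list is the objection, the bracket syntax quoted above can 
also be generated. The following is only a sketch: compress() is a hypothetical 
helper, it handles a single numeric range at the end of the name, and it does 
not preserve leading zeros:

```python
def compress(prefix, numbers):
    """Build a node range expression such as lx[15,18,32-33]."""
    nums = sorted(set(numbers))
    if not nums:
        raise ValueError("need at least one node number")

    parts = []
    start = prev = nums[0]
    for n in nums[1:]:
        if n == prev + 1:
            prev = n                        # extend the current run
            continue
        # Close the current run: a single number or "first-last".
        parts.append(str(start) if start == prev else f"{start}-{prev}")
        start = prev = n
    parts.append(str(start) if start == prev else f"{start}-{prev}")

    return f"{prefix}[{','.join(parts)}]"


# Every other slot in rack 1, as in the original question.
print("NodeName=" + compress("compute-1-", range(1, 42, 2)))
```

With 1U gaps no two numbers are consecutive, so the expression stays a plain 
list of odd numbers; the saving over the full comma list is only the repeated 
prefix.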

Complicating that logic wouldn't make much sense to me.
Mapping host names to partitions shouldn't be too hard to script.
In the worst case you copy the full/per-rack/per-resources host list to 
partitions and manually cherry-pick afterwards.
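
As a rough illustration of such a script, here is a sketch that groups hosts 
by rack, assuming the compute-<rack>-<row> naming from the original mail. The 
partition names ("rack1", ...) and the function name are invented for the 
example:

```python
import re
from collections import defaultdict


def partition_lines(hosts):
    """Return one PartitionName line per rack for compute-<rack>-<row> hosts."""
    by_rack = defaultdict(list)
    for host in hosts:
        m = re.fullmatch(r"compute-(\d+)-(\d+)", host)
        if m is None:
            continue                        # outliers are left for manual cherry-picking
        by_rack[int(m.group(1))].append(host)

    return [
        f"PartitionName=rack{rack} Nodes={','.join(by_rack[rack])}"
        for rack in sorted(by_rack)
    ]


# Hypothetical host list; "gpu-1" does not match the pattern and is skipped.
for entry in partition_lines(["compute-1-1", "compute-2-1", "compute-1-3", "gpu-1"]):
    print(entry)
```

Grouping by resources instead of by rack would only change the key used for 
by_rack.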

Regards,
Benjamin
--
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321
