That's what I have done for now. I'm just a little OCD about how the conf file looks and don't care for 8 lines' worth of wraparound. Management is done at the pxeboot/kickstart level and with yum. I can dynamically install the bits necessary for the various hardware differences (e.g., GPUs, MIC cards, InfiniBand, etc.).

Brian Andrus

-----Original Message-----
From: Benjamin Redling [mailto:[email protected]]
Sent: Wednesday, January 20, 2016 2:00 AM
To: slurm-dev <[email protected]>
Subject: [slurm-dev] Re: NodeName and PartitionName format in slurm.conf

On 19.01.2016 at 20:37, Andrus, Brian Contractor wrote:
> I am testing our slurm to replace our torque/moab setup here.
>
> The issue I have is to try and put all our node names in the NodeName
> and PartitionName entries.
> In our cluster, we name our nodes compute-<rack>-<row> That seems to
> be problem enough with the abilities to use ranges in slurm, but it is
> compounded with the fact that the folks put the nodes in keeping 1u of
> space in between.
> So I have compute-1-[1,3,5,7,9,11...41]

Why not simply use a comma separated list _generated_ from your inventory / DNS / /etc/hosts / etc.?

When you have outliers (2U, 4U -- do they have more resources too!?) it would make sense to group/partition by resources anyway.

What are you using to manage inventory? Most configuration management and provisioning tools I know provide you with the necessary tools -- have a look at puppetlabs facter (or alternatives).

http://slurm.schedmd.com/slurm.conf.html
<quote>
Multiple node names may be comma separated (e.g. "alpha,beta,gamma") and/or a simple node range expression may optionally be used to specify numeric ranges of nodes to avoid building a configuration file with large numbers of entries. The node range expression can contain one pair of square brackets with a sequence of comma separated numbers and/or ranges of numbers separated by a "-" (e.g. "linux[0-64,128]", or "lx[15,18,32-33]"). Note that the numeric ranges can include one or more leading zeros to indicate the numeric portion has a fixed number of digits (e.g. "linux[0000-1023]"). Up to two numeric ranges can be included in the expression (e.g. "rack[0-63]_blade[0-41]"). If one or more numeric expressions are included, one of them must be at the end of the name (e.g. "unit[0-31]rack" is invalid), but arbitrary names can always be used in a comma separated list.
</quote>

Complicating that logic wouldn't make much sense to me.

Mapping host names to partitions shouldn't be too hard to script. In the worst case you copy the full/per-rack/per-resources host list to partitions and manually cherry-pick afterwards.

Regards,
Benjamin
-- 
FSU Jena | JULIELab.de/Staff/Benjamin+Redling.html
vox: +49 3641 9 44323 | fax: +49 3641 9 44321

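For illustration, here is a minimal sketch of the "generate it from your inventory" approach discussed above. It takes a flat list of host names and collapses them into the bracketed range expressions described in the quoted slurm.conf documentation, which keeps the NodeName and PartitionName lines short. The function name `compress_hostnames` and the sample host names are hypothetical, not anything from Slurm itself. It assumes the numeric part to collapse is at the end of each name, and it does not preserve leading zeros, so names like "linux0007" would need extra handling.

```python
import re
from collections import defaultdict
from itertools import groupby


def compress_hostnames(hostnames):
    """Collapse names like 'compute-1-3' into 'compute-1-[1,3,5-7]' expressions.

    Hypothetical helper: groups names by everything before the trailing
    number, then writes each group's numbers as a bracketed list in which
    consecutive runs become ranges. Leading zeros are NOT preserved.
    """
    groups = defaultdict(set)
    passthrough = []
    for name in hostnames:
        match = re.fullmatch(r"(.*?)(\d+)", name)
        if match:
            groups[match.group(1)].add(int(match.group(2)))
        else:
            passthrough.append(name)  # no trailing number: list it verbatim

    expressions = []
    for prefix in sorted(groups):
        numbers = sorted(groups[prefix])
        parts = []
        # Consecutive integers share the same (value - position) key.
        for _, run in groupby(enumerate(numbers), key=lambda p: p[1] - p[0]):
            run = [n for _, n in run]
            parts.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
        if len(numbers) == 1:
            expressions.append(f"{prefix}{numbers[0]}")
        else:
            expressions.append(f"{prefix}[{','.join(parts)}]")
    return ",".join(expressions + sorted(passthrough))


if __name__ == "__main__":
    # Sample inventory (made up): odd slots in rack 1, slots 1-3 in rack 2.
    hosts = [f"compute-1-{n}" for n in range(1, 12, 2)]
    hosts += ["compute-2-1", "compute-2-2", "compute-2-3"]
    print("NodeName=" + compress_hostnames(hosts))
```

With the sample inventory this prints `NodeName=compute-1-[1,3,5,7,9,11],compute-2-[1-3]`. Nodes spaced every other slot still produce a long list inside the brackets, because only consecutive runs are collapsed, but each rack at least stays on a single expression.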