Thanks much for all your replies!
As for the configuration: the head node is never included in the Nodes= list of
any PartitionName= entry. The install itself sets a default NodeName in the
slurm configuration. I basically used the same configuration as I used with
slurm 2.6.0. That version never ran jobs on the head node, and it also did not
start slurmd on the head node. I removed the NodeName entry for the head node
and that took care of the slurmd startup.

Thanks for the salloc clarification. I am still learning the nuances of slurm,
coming from other schedulers (sge/loadleveler/torque).

Thanks
Eva

On Fri, 12 Sep 2014, Chrysovalantis Paschoulas wrote:

> Hi Eva!
>
> As Sergio said, you have to specify the compute nodes with "NodeName=..." and
> then define partitions including those cnodes with "PartitionName=...
> Nodes=..." without including the head nodes or the login nodes. You could also
> set the parameter "AllocNodes=..." in the slurm.conf file, where we usually
> give the login nodes only, in order to disable submission from any nodes other
> than the login nodes.
>
> So my question now: is node hpcdev-005.sdsc.edu a login node or a
> master/admin node? I mean, did you do the submission from that node or from a
> different one? Because if this one is the login node then there is no error
> at all; this is the default behaviour of salloc.
>
> The default salloc (you can change it) returns you a shell on the node where
> the submission took place, and then with srun commands you can execute
> programs on the compute nodes.
>
> So in case you want an interactive shell on the compute nodes, you should
> execute:
>
> "salloc -N1 -p active srun -N1 --pty sh"
>
> or directly an srun command (without salloc involved):
>
> "srun -N1 -p active --pty sh"
>
> Best Regards,
> Chrysovalantis Paschoulas
>
>
>
> On 09/12/2014 09:19 AM, Sergio Iserte wrote:
> Hello Eva,
> you must remove the management nodes from the field "Nodes" of the
> "PartitionName" parameter.
>
> With the slurm.conf file it would be easier to write an example; anyway, this
> should work!
>
> Regards,
> Sergio.
>
> 2014-09-12 9:06 GMT+02:00 Uwe Sauter:
>
> Hi Eva,
>
> if you don't want to use the controller node for jobs, the easiest way
> is to not configure it as a node at all. Meaning you don't need a line like
>
> NodeName=hpc-0-5 RealMemory=....
>
> for the controller.
>
>
> A program/user can find out which nodes are allocated by looking into
> the environment variables. Try running salloc and then
>
> $ env | grep SLURM
>
> Here is an example output:
>
> SLURM_NODELIST=n523601
> SLURM_NODE_ALIASES=(null)
> SLURM_NNODES=1
> SLURM_JOBID=6437
> SLURM_TASKS_PER_NODE=40
> SLURM_JOB_ID=6437
> SLURM_SUBMIT_DIR=/nfs/admins/adm17
> SLURM_JOB_NODELIST=n523601
> SLURM_JOB_CPUS_PER_NODE=40
> SLURM_SUBMIT_HOST=frontend
> SLURM_JOB_PARTITION=foo
> SLURM_JOB_NUM_NODES=1
>
>
>
> Regards,
>
> Uwe
>
>
>
> On 12.09.2014 at 00:45, Eva Hocks wrote:
> >
> >
> >
> > I am trying to configure the latest slurm 14.03 and am running into
> > a problem preventing slurm from running jobs on the control node.
> >
> > sinfo shows 3 nodes configured in the slurm.conf:
> > active up 2:00:00 1 down* hpc-0-5
> > active up 2:00:00 1 mix hpc-0-4
> > active up 2:00:00 1 idle hpc-0-6
> >
> >
> > but when I use salloc I end up on the head node
> >
> >
> > $ salloc -N 1 -p active sh
> > salloc: Granted job allocation 16
> > sh-4.1$ hostname
> > hpcdev-005.sdsc.edu
> >
> >
> > That node is not part of the "active" partition but slurm still uses it.
> > How? The allocation, btw, is for NodeList=hpc-0-4
> > and the user can log in to that node without a problem, but slurm doesn't
> > run the sh on that node for the user.
> >
> > Also, how can a user find out what nodes are allocated without having to
> > run the scontrol command? Is there an option in salloc to return the
> > host names?
> >
> > Thanks
> > Eva
>
> --
> Sergio Iserte Agut, research assistant,
> High Performance Computing & Architecture
> Jaume I University (Castellón, Spain)
>
> Forschungszentrum Juelich GmbH
> 52425 Juelich
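
Uwe's suggestion of reading the environment variables can be wrapped in a small shell helper, which answers the "which nodes did I get" question without scontrol. This is only a sketch: the function name show_alloc is made up for illustration and is not part of Slurm, and it assumes that the SLURM_JOB_ID, SLURM_JOB_NUM_NODES and SLURM_JOB_NODELIST variables from the example output above are set in the shell that salloc starts.

```shell
# show_alloc: hypothetical helper (not part of Slurm) that reports the
# current allocation using only the environment variables set by salloc.
show_alloc() {
    # Outside an allocation SLURM_JOB_NODELIST is unset, so stop early.
    if [ -z "${SLURM_JOB_NODELIST:-}" ]; then
        echo "not inside a Slurm allocation" >&2
        return 1
    fi
    # Fall back to placeholders if the other variables are missing.
    echo "job ${SLURM_JOB_ID:-unknown}: ${SLURM_JOB_NUM_NODES:-?} node(s): ${SLURM_JOB_NODELIST}"
}
```

Called as show_alloc inside the shell that salloc returns, it prints a single line naming the job and its nodes. Note that with more than one node the node list may appear in Slurm's compressed form (a bracketed range) rather than as individual host names.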
