Quoting "Thompson, Matt[SCIENCE SYSTEMS AND APPLICATIONS INC]" <[email protected]>:

SLURM Dev List,

I was wondering if you could help me. I work on a code that for a long time required all nodes to be completely full (all 12 cores on a Westmere or all 16 on a Sandy Bridge, etc.). However, we were recently able to relax that restriction such that all but the last node need be filled.

Thus, before, all our job allocations had to be multiples of, say, 12 and 16: running on 48, 96, &c. cores. Now, though, what I'd like to do is ask for --ntasks=100 and have the nodes filled like:

  Westmere:     12+12+12+12+12+12+12+12+4
  Sandy Bridge: 16+16+16+16+16+16+4
  Haswell:      28+28+28+16
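The fill pattern I mean can be sketched with a small shell function (illustrative only; `pack` is not a Slurm command or option, just a way of writing down the arithmetic):

```shell
#!/bin/sh
# Greedy "pack" fill: give each node its full core count until fewer
# tasks remain than a node holds; the last node takes the remainder.
# Usage: pack TOTAL_TASKS CORES_PER_NODE
pack() {
  tasks=$1
  cores=$2
  out=
  while [ "$tasks" -gt 0 ]; do
    if [ "$tasks" -ge "$cores" ]; then n=$cores; else n=$tasks; fi
    out="$out${out:+ }$n"
    tasks=$((tasks - n))
  done
  echo "$out"
}

pack 100 12   # Westmere:     12 12 12 12 12 12 12 12 4
pack 100 16   # Sandy Bridge: 16 16 16 16 16 16 4
pack 100 28   # Haswell:      28 28 28 16
```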

However, SLURM actually balances the tasks across the nodes, so if I ask for --ntasks=100, I get a more even load:

   SLURM_TASKS_PER_NODE=12,11(x8)

So I was wondering: is there a way with salloc/sbatch options to get SLURM to fill the nodes such that it fully allocates the first node, then second, and so on and leaves the last partly filled?

Thanks,
Matt
--
Matt Thompson          SSAI, Sr Software Test Engr
NASA GSFC, Global Modeling and Assimilation Office
Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD 20771
Phone: 301-614-6712              Fax: 301-614-6246

Assuming that your system runs with Slurm's select/cons_res plugin, a job's allocation can include CPUs across an arbitrary number of compute nodes, using whatever CPUs are idle at the time. What might address your needs is submitting the job with the --exclusive option. That ensures your job is allocated every CPU on every node it includes. Then you would use srun to spawn the tasks so that every CPU in the allocation is used except for some CPUs on the last node. You can do that today using a hostlist file (see the srun man page), or on a system-wide basis by configuring "SelectTypeParameters=CR_Pack_Nodes" (see the slurm.conf man page).
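The hostlist-file approach might be sketched as below. The node names (node001, node002, ...) and the 12-core count are hypothetical placeholders for your Westmere nodes; the script writes one hostname per task, which is the format srun's arbitrary distribution expects via SLURM_HOSTFILE:

```shell
#!/bin/sh
# Sketch: build a per-task hostfile that packs 12 tasks onto each node
# and leaves the remainder on the last node. Adjust TASKS, CORES, and
# the node-name pattern for your cluster.
TASKS=100
CORES=12
: > hostfile
i=1
left=$TASKS
while [ "$left" -gt 0 ]; do
  node=$(printf 'node%03d' "$i")
  n=$CORES
  [ "$left" -lt "$n" ] && n=$left
  j=0
  while [ "$j" -lt "$n" ]; do
    echo "$node" >> hostfile
    j=$((j + 1))
  done
  left=$((left - n))
  i=$((i + 1))
done

# Then launch against the packed hostfile, e.g.:
#   $ salloc -N9 --exclusive bash
#   $ SLURM_HOSTFILE=./hostfile srun -n100 --distribution=arbitrary a.out
```

For 100 tasks on 12-core nodes this writes 12 lines each for node001 through node008 and 4 lines for node009, matching the 12+12+...+4 fill described above.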

The next major release of Slurm, version 15.08, will make this much simpler. You'll be able to use the srun option "--dist=pack" to pack the tasks onto the lowest-numbered nodes in the allocation. That will be the best long-term solution and would look something like this:
$ salloc -n100 bash
$ srun -n100 --dist=pack a.out
--
Morris "Moe" Jette
CTO, SchedMD LLC
Commercial Slurm Development and Support
