Hello Ulf,
I have updated the --overcommit documentation as shown below. The
--ntasks-per-core and --ntasks-per-socket options apply only to the
job allocation and are ignored for job steps. Probably the best
solution would be to create a job allocation of the desired size using
the salloc command and then execute srun with the desired task count.
There is an environment variable, SLURM_JOB_CPUS_PER_NODE, set to the
CPU count on each node, but the definition of "CPU" depends upon your
configuration and could be a core or hyperthread count. Something like
this should work in your environment:
salloc -n8 -N1 srun -n16 -O a.out
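If you would rather not hard-code the task count, a rough sketch along the following lines could derive it from SLURM_JOB_CPUS_PER_NODE inside the allocation. It assumes the variable uses the documented CPU_count[(xnumber_of_nodes)] format (for example "8" or "72(x2),36"). The helper names total_cpus and overcommit_tasks are made up for illustration and are not part of SLURM. The script only prints the srun line rather than running it, and whether "CPU" means a core or a hyperthread still depends on your configuration.

```shell
#!/bin/sh
# Sum the CPUs described by a SLURM_JOB_CPUS_PER_NODE-style string,
# e.g. "8", "8(x2)" or "72(x2),36". (Hypothetical helper.)
total_cpus() {
    total=0
    old_ifs=$IFS
    IFS=,
    for entry in $1; do
        cpus=${entry%%\(*}              # CPU count before any "(xN)" suffix
        case $entry in
            *\(x*\)) nodes=${entry##*\(x}; nodes=${nodes%\)} ;;
            *)       nodes=1 ;;         # no suffix means a single node
        esac
        total=$((total + cpus * nodes))
    done
    IFS=$old_ifs
    echo "$total"
}

# Task count for a given number of tasks per CPU. (Hypothetical helper.)
overcommit_tasks() {
    echo $(( $(total_cpus "$1") * $2 ))
}

# Inside an allocation, print an srun line with two tasks per CPU.
if [ -n "${SLURM_JOB_CPUS_PER_NODE:-}" ]; then
    echo "srun -n$(overcommit_tasks "$SLURM_JOB_CPUS_PER_NODE" 2) -O a.out"
fi
```

With the single-node, 8-CPU allocation from the salloc line above, this would print the same "srun -n16 -O a.out" step.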
--ntasks-per-core=<ntasks>
Request the maximum ntasks be invoked on each core. This option
applies to the job allocation, but not to step allocations.
Meant to be used with the --ntasks option. Related to
--ntasks-per-node except at the core level instead of the node
level. Masks will automatically be generated to bind the tasks
to specific cores unless --cpu_bind=none is specified. NOTE:
This option is not supported unless SelectTypeParameters=CR_Core
or SelectTypeParameters=CR_Core_Memory is configured.
-O, --overcommit
Overcommit resources. When applied to job allocation, only one
CPU is allocated to the job per node and options used to specify
the number of tasks per node, socket, core, etc. are ignored.
When applied to job step allocations (the srun command when
executed within an existing job allocation), this option can be
used to launch more than one task per CPU. Normally, srun will
not allocate more than one process per CPU. By specifying
--overcommit you are explicitly allowing more than one process
per CPU. However no more than MAX_TASKS_PER_NODE tasks are
permitted to execute per node. NOTE: MAX_TASKS_PER_NODE is defined
in the file slurm.h and is not a variable, it is set at SLURM
build time.
Quoting Ulf Markwardt <[email protected]>:
Hello Moe,
That is exactly what the overcommit option is designed to do.
When I do
salloc -p sandy --overcommit --ntasks-per-node=16 --ntasks-per-core=2 -t 10
all tasks run on a single core:
Hm, that's a bit confusing for me. It looks like "--overcommit" and
"--ntasks-per-core=2" do not work together.
Is there a way to tell SLURM that I want to run exactly two
processes per core?
Thanks
Ulf
--
___________________________________________________________________
Dr. Ulf Markwardt
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany
Phone: (+49) 351/463-33640 WWW: http://www.tu-dresden.de/zih