I think I misunderstood what you were looking for.
You probably want to investigate the -m / --distribution option.
Default is "block", which means with a loop like
    for i in {1..30}; do
        srun -n1 job &
    done
inside a job which was allocated nodes torus1, torus2, torus3,
the scheduler is going to use up all CPU cores on the first node,
then start filling up second node, then third, etc. If these
have 20 cores each, you will end up with 20 job steps on torus1
and 10 on torus2. If they have only 10 cores, you should end up
with 10 on each of the three. (NOTE: "cores" might or might not
include "hyperthreaded cores" depending on a number of factors).
If you use -m cyclic, it should assign the first job step to torus1,
the second to torus2, the third to torus3, and in general step 3*N+i
to torus$i.
You could in theory also do this manually by setting the argument
of -r successively to 1,2,3,1,2,3,... for successive iterations
of the loop, but -m cyclic is probably better.
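A minimal sketch of that round-robin arithmetic (the torus node names
and the 3-node allocation are assumptions taken from this thread; the
actual placement is done by srun itself when you pass -m cyclic, e.g.
`srun -N1 -n1 -m cyclic job &` inside the loop):

```shell
#!/bin/sh
# Sketch only: which node a step lands on under cyclic distribution.
# Step i (1-based) on an NNODES-node allocation goes to node
# number ((i - 1) % NNODES) + 1 of the allocation.
NNODES=3
for i in 1 2 3 4 30; do
    echo "step $i -> torus$(( (i - 1) % NNODES + 1 ))"
done
```

So step 4 wraps back to torus1, and step 30 ends up on torus3.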
On Fri, 9 Dec 2016, Nigella Sanders wrote:
Thank you for your suggestion, Thomas.
I have just tested it, but no luck.
There is a shift in the node list, but only one of the nodes is actually used.
For example, with an allocation of three nodes (torus6001, torus6002 and
torus6003), this code confines all 30 tasks to torus6002 instead of using
the last two:
    for i in {1..30}; do
        touch data.$i
        srun -N1 -n1 -r1 --mem=15 --input=data.$i ./get_cpu &
    done
    wait
Setting -r2 confines all the tasks to torus6003, as expected.
Regards,
Nigella
2016-12-08 15:27 GMT+01:00 Thomas M. Payerle <paye...@umd.edu>:
I've not used it, but when looking at the srun man page (for something else)
I noticed a -r / --relative option which sounds like it might be what you
are looking for.
On Thu, 8 Dec 2016, Nigella Sanders wrote:
Hi Michael,
Actually, that was my first try, but it didn't work.
srun finds it inconsistent with "-N -n1" and ends up using only the
first node provided in the -w list:

    srun: Warning: can't run 1 processes on 2 nodes, setting nnodes to 1
Regards,
Nigella
2016-12-08 13:04 GMT+00:00 Michael Di Domenico <mdidomeni...@gmail.com>:
On Thu, Dec 8, 2016 at 5:48 AM, Nigella Sanders <nigella.sand...@gmail.com> wrote:
>
> All 30 tasks always run in the first two allocated nodes (torus6001 and
> torus6002).
>
> However, I would like to get these tasks to use only the second and
> third nodes (torus6002 and torus6003).
> Does anyone have an idea about how to do this?
I've not tested this, but I believe you can add the -w option to the
srun inside your sbatch script.
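As an untested sketch of that idea (the node names and ./get_cpu binary
are assumptions taken from elsewhere in this thread, and the echo
fallback is a hypothetical addition so the loop can be dry-run without
Slurm), each step could be pinned to a single node by name, alternating
between the two target nodes:

```shell
#!/bin/sh
# Hypothetical sketch: pin each job step to one node via -w,
# alternating torus6002 / torus6003. Falls back to printing the
# command when srun is not on PATH (dry run).
RUN=$(command -v srun || echo echo)
for i in 1 2 3 4; do
    node="torus600$(( (i - 1) % 2 + 2 ))"   # 6002, 6003, 6002, 6003
    "$RUN" -N1 -n1 -w "$node" ./get_cpu &
done
wait
```

Naming a single node per step should also sidestep the "can't run 1
processes on 2 nodes" warning that a multi-node -w list triggers
together with -n1.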
Tom Payerle
IT-ETI-EUS paye...@umd.edu
4254 Stadium Dr (301) 405-6135
University of Maryland
College Park, MD 20742-4111