I think I misunderstood what you were looking for.

You probably want to investigate the -m / --distribution option.
The default is "block", which means that with a loop like

for i in {1..30}; do
    srun -n1 job &
done

inside a job which was allocated nodes torus1, torus2, and torus3,
the scheduler will use up all the CPU cores on the first node, then
start filling the second node, then the third, and so on.  If these
nodes have 20 cores each, you will end up with 20 job steps on torus1
and 10 on torus2.  If they have only 10 cores, you should end up
with 10 on each of the three.  (NOTE: "cores" may or may not include
hyperthreaded cores, depending on a number of factors.)

If you use -m cyclic, it should assign the first job step to torus1,
the second to torus2, the third to torus3, and in general job step
3*N+i to torus$i.
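As a rough sketch of that suggestion (untested; ./job is a placeholder
executable, and this assumes a 3-node sbatch allocation), the loop would
look something like this -- here the srun commands are only collected and
printed rather than launched, since this needs a real allocation:

```shell
#!/bin/bash
# Sketch of the -m cyclic suggestion inside an sbatch script (untested).
# ./job is a placeholder; in a real script each srun would be run with
# a trailing & and followed by a final wait.
cmds=()
for i in {1..30}; do
    cmds+=("srun -N1 -n1 -m cyclic ./job")
done
printf '%s\n' "${cmds[@]}"
# wait   # needed once the real srun steps are backgrounded with &
```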

You could in theory also do this manually by setting the argument of
-r successively to 0,1,2,0,1,2,... for successive iterations of the
loop (-r / --relative is 0-indexed, counting from the first node of
the allocation), but -m cyclic is probably better.
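The manual rotation could be sketched with modular arithmetic along these
lines (a sketch, not tested against Slurm; the actual srun launch is
commented out because it only makes sense inside a real allocation, and
./job is a placeholder):

```shell
#!/bin/bash
# Manual round-robin over a 3-node allocation via srun -r.
# -r / --relative is 0-indexed, so the index cycles 0,1,2,0,1,2,...
NNODES=3
for i in {1..30}; do
    r=$(( (i - 1) % NNODES ))
    echo "step $i -> relative node $r"
    # srun -N1 -n1 -r$r ./job &   # actual launch; ./job is a placeholder
done
# wait   # after backgrounding the real srun steps
```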


On Fri, 9 Dec 2016, Nigella Sanders wrote:

Thank you for your suggestion Thomas,

I have just tested it but no luck.
It shifts the start of the node list, but still only one of the nodes
is ever used.

For example, with an allocation of three nodes (torus6001, torus6002
and torus6003), this code confines all 30 tasks to torus6002 instead
of spreading them over the last two:

for i in {1..30}; do
    touch data.$i
    srun -N1 -n1 -r1 --mem=15 --input=data.$i ./get_cpu &
done
wait

Setting -r2 confines all the tasks to torus6003, as expected.

Regards,
Nigella

2016-12-08 15:27 GMT+01:00 Thomas M. Payerle <paye...@umd.edu>:

      I've not used it, but when looking at the srun man page (for
      something else) I noticed a -r / --relative option which sounds
      like it might be what you are looking for.

      On Thu, 8 Dec 2016, Nigella Sanders wrote:

            Hi Michael,

            Actually, that was my first try, but it didn't work.
            srun finds it inconsistent with "-N -n1" and ends up using
            only the first node provided in the -w list.

            $ srun: Warning: can't run 1 processes on 2 nodes, setting nnodes to 1


            Regards,
            Nigella



            2016-12-08 13:04 GMT+00:00 Michael Di Domenico <mdidomeni...@gmail.com>:

                  On Thu, Dec 8, 2016 at 5:48 AM, Nigella Sanders
                  <nigella.sand...@gmail.com> wrote:
                  >
                  > All 30 tasks always run in the first two allocated
                  > nodes (torus6001 and torus6002).
                  >
                  > However, I would like to get these tasks to use only
                  > the second and third nodes (torus6002 and torus6003).
                  > Does anyone have an idea about how to do this?

                  I've not tested this, but I believe you can add the -w
                  option to the srun inside your sbatch script.


      Tom Payerle
      IT-ETI-EUS                              paye...@umd.edu
      4254 Stadium Dr                         (301) 405-6135
      University of Maryland
      College Park, MD 20742-4111

