Hi!

For us this is in most cases a bad default behaviour, and I have not found a way to set a different default either (other than changing the code and recompiling).

One other thing that I'm still curious about: when will the default: branch of the switch/case in my first mail (see below) be reached?

task_dist seems to be initialized to SLURM_DIST_CYCLIC (1), and all the other values of task_dist that come to this point have explicit cases in the switch/case.

I have not been able to reach it and activate the _task_layout_lllp_multi() function.
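
To make the question concrete, here is a minimal stand-alone sketch in C of how that switch dispatches. It is not the Slurm source: the toy_ names, every numeric value other than CYCLIC = 1, and the TOY_DIST_OTHER member are assumptions made up for illustration. It only shows that default: can be reached solely by a task_dist value that none of the listed cases cover.

```c
/* Toy stand-ins for the task_dist values named in the quoted switch.
 * Only CYCLIC == 1 comes from this thread; the remaining numbers and
 * TOY_DIST_OTHER are assumptions, not Slurm definitions. */
enum toy_dist {
    TOY_DIST_CYCLIC = 1,
    TOY_DIST_BLOCK,
    TOY_DIST_PLANE,
    TOY_DIST_CYCLIC_CYCLIC,
    TOY_DIST_CYCLIC_BLOCK,
    TOY_DIST_BLOCK_CYCLIC,
    TOY_DIST_BLOCK_BLOCK,
    TOY_DIST_OTHER  /* hypothetical: any value the switch does not list */
};

/* Which layout routine the switch would end up calling. */
enum toy_layout { TOY_LAYOUT_BLOCK, TOY_LAYOUT_CYCLIC, TOY_LAYOUT_MULTI };

/* Mirrors the structure of the quoted switch statement. */
static enum toy_layout toy_pick_layout(enum toy_dist dist, int cpus_per_task)
{
    switch (dist) {
    case TOY_DIST_BLOCK_BLOCK:
    case TOY_DIST_CYCLIC_BLOCK:
    case TOY_DIST_PLANE:
        return TOY_LAYOUT_BLOCK;
    case TOY_DIST_CYCLIC:
    case TOY_DIST_BLOCK:
    case TOY_DIST_CYCLIC_CYCLIC:
    case TOY_DIST_BLOCK_CYCLIC:
        return TOY_LAYOUT_CYCLIC;
    default:
        /* Only here does cpus_per_task influence the choice. */
        return (cpus_per_task > 1) ? TOY_LAYOUT_MULTI : TOY_LAYOUT_CYCLIC;
    }
}
```

In this sketch, toy_pick_layout(TOY_DIST_CYCLIC, 6) returns TOY_LAYOUT_CYCLIC no matter what -c is, while toy_pick_layout(TOY_DIST_OTHER, 6) is the only kind of call that returns TOY_LAYOUT_MULTI.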

/Magnus

On 2013-02-15 19:52, [email protected] wrote:
Assuming you're using the default allocation and distribution methods,
the behavior you describe sounds correct.  Available cpus will be
selected cyclically across the sockets for allocation to the job.
Allocated cpus will be selected cyclically across the sockets for
distribution to tasks for binding.  And each task will be bound to all
of the allocated cpus on each socket from which a cpu was distributed to
it.  For -n 8 -c 6, I would expect each of the 8 tasks to be bound to 36
cpus (6 cpus on each of 6 sockets).
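
The arithmetic above can be checked with a small toy model in C. This is only a sketch of the three steps as described in this mail, not code from the task/affinity plugin. The function name toy_bound_cpus, the TOY_MAX_SOCKETS limit, and the assumption that every socket has enough free cpus for a perfectly cyclic allocation are all made up for illustration.

```c
#define TOY_MAX_SOCKETS 64

/* Toy model of socket binding; not Slurm code.
 * Returns the number of cpus task `task` ends up bound to,
 * or -1 if the arguments are out of range. */
static int toy_bound_cpus(int sockets, int ntasks, int cpus_per_task, int task)
{
    int alloc[TOY_MAX_SOCKETS] = {0};    /* cpus allocated to the job, per socket */
    int touched[TOY_MAX_SOCKETS] = {0};  /* sockets this task received a cpu from */
    int total, i, bound = 0;

    if (sockets < 1 || sockets > TOY_MAX_SOCKETS || ntasks < 1 ||
        cpus_per_task < 1 || task < 0 || task >= ntasks)
        return -1;

    total = ntasks * cpus_per_task;

    /* Step 1: cpus are allocated to the job cyclically across sockets. */
    for (i = 0; i < total; i++)
        alloc[i % sockets]++;

    /* Step 2: allocated cpus are handed to tasks cyclically across sockets;
     * the task receives positions task*c .. task*c + c - 1 of that cycle. */
    for (i = task * cpus_per_task; i < (task + 1) * cpus_per_task; i++)
        touched[i % sockets] = 1;

    /* Step 3: socket binding widens the mask to every allocated cpu
     * on each socket the task received a cpu from. */
    for (i = 0; i < sockets; i++)
        if (touched[i])
            bound += alloc[i];

    return bound;
}
```

With 8 sockets, toy_bound_cpus(8, 8, 6, 0) gives 36: the task receives one cpu from each of 6 sockets, and each of those sockets holds 6 cpus allocated to the job.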

See the CPU Management Guide in the Slurm documentation for more info.
Examples 11 thru 13 illustrate socket binding.

Martin Perry
Bull Phoenix



From: Moe Jette <[email protected]>
To: "slurm-dev" <[email protected]>,
Date: 02/15/2013 10:33 AM
Subject: [slurm-dev] Re: task/affinity, --cpu_bind=socket and -c > 1
------------------------------------------------------------------------




Have you tried the --ntasks-per-socket or --ntasks-per-core options?

Quoting Magnus Jonsson <[email protected]>:

 > Hi!
 >
 > I have noticed strange behaviour in the task/affinity plugin if I
 > use --cpu_bind=socket and -c > 1.
 >
 > My tasks are distributed one on each socket (I have 8), and if I say
 > -c 6, six of my sockets are allocated to my first task. If I have 8
 > tasks, each task gets 6 of the 8 sockets.
 >
 > This sounds like bad behaviour, but it might be by design?
 >
 > I have traced it down to the lllp_distribution() function in
 > task/affinity/dist_task.c
 >
 > In this switch statement:
 >
 >     switch (req->task_dist) {
 >     case SLURM_DIST_BLOCK_BLOCK:
 >     case SLURM_DIST_CYCLIC_BLOCK:
 >     case SLURM_DIST_PLANE:
 >         /* tasks are distributed in blocks within a plane */
 >         rc = _task_layout_lllp_block(req, node_id, &masks);
 >         break;
 >     case SLURM_DIST_CYCLIC:
 >     case SLURM_DIST_BLOCK:
 >     case SLURM_DIST_CYCLIC_CYCLIC:
 >     case SLURM_DIST_BLOCK_CYCLIC:
 >         rc = _task_layout_lllp_cyclic(req, node_id, &masks);
 >         break;
 >     default:
 >         if (req->cpus_per_task > 1)
 >             rc = _task_layout_lllp_multi(req, node_id, &masks);
 >         else
 >             rc = _task_layout_lllp_cyclic(req, node_id, &masks);
 >         req->task_dist = SLURM_DIST_BLOCK_CYCLIC;
 >         break;
 >     }
 >
 > In the default block, a different function is called if
 > cpus_per_task > 1. Should the cyclic block be the same as the
 > default block?
 >
 > Or should SLURM_DIST_CYCLIC, SLURM_DIST_BLOCK be the same as default?
 >
 > Best regards,
 > Magnus
 >
 > --
 > Magnus Jonsson, Developer, HPC2N, Umeå Universitet
 >
 >




--
Magnus Jonsson, Developer, HPC2N, Umeå Universitet
