Have you tried the --ntasks-per-socket or --ntasks-per-core options?
Quoting Magnus Jonsson <[email protected]>:
> Hi!
>
> I have noticed strange behaviour in the task/affinity plugin if I
> use --cpu_bind=socket and -c > 1.
>
> My tasks are distributed one on each socket (I have 8), and if I say
> -c 6, six of my sockets are allocated to my first task. If I have 8
> tasks, each task gets 6 of the 8 sockets.
>
> This sounds like bad behaviour, but it might be by design?
>
> I have traced it down to the lllp_distribution() function in
> task/affinity/dist_task.c
>
> In this switch statement:
>
> switch (req->task_dist) {
> case SLURM_DIST_BLOCK_BLOCK:
> case SLURM_DIST_CYCLIC_BLOCK:
> case SLURM_DIST_PLANE:
>         /* tasks are distributed in blocks within a plane */
>         rc = _task_layout_lllp_block(req, node_id, &masks);
>         break;
> case SLURM_DIST_CYCLIC:
> case SLURM_DIST_BLOCK:
> case SLURM_DIST_CYCLIC_CYCLIC:
> case SLURM_DIST_BLOCK_CYCLIC:
>         rc = _task_layout_lllp_cyclic(req, node_id, &masks);
>         break;
> default:
>         if (req->cpus_per_task > 1)
>                 rc = _task_layout_lllp_multi(req, node_id, &masks);
>         else
>                 rc = _task_layout_lllp_cyclic(req, node_id, &masks);
>         req->task_dist = SLURM_DIST_BLOCK_CYCLIC;
>         break;
> }
>
> In the default block a different function is called if
> cpus_per_task > 1. Should the cyclic block be the same as the
> default block?
>
> Or should SLURM_DIST_CYCLIC and SLURM_DIST_BLOCK be the same as default?
>
> Best regards,
> Magnus
>
> --
> Magnus Jonsson, Developer, HPC2N, Umeå Universitet
>
>
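
To make the question about the switch concrete, here is a minimal
stand-alone sketch of the dispatch. It is not Slurm code and not a
patch: the enum values, pick_layout(), pick_layout_alt() and
model_self_check() are hypothetical names that only model which layout
function the quoted switch selects. pick_layout() mirrors the quoted
logic; pick_layout_alt() models the variant being asked about, where
the cyclic cases take the same path as the default.

```c
#include <assert.h>

/* Hypothetical stand-ins for the SLURM_DIST_* values in the quoted switch. */
enum dist {
    DIST_UNSET,          /* any value with no explicit case label */
    DIST_CYCLIC,
    DIST_BLOCK,
    DIST_PLANE,
    DIST_CYCLIC_CYCLIC,
    DIST_CYCLIC_BLOCK,
    DIST_BLOCK_CYCLIC,
    DIST_BLOCK_BLOCK
};

/* Which of the three _task_layout_lllp_*() functions would be called. */
enum layout { LAYOUT_BLOCK, LAYOUT_CYCLIC, LAYOUT_MULTI };

/* Mirrors the quoted switch: only the default case looks at cpus_per_task. */
enum layout pick_layout(enum dist d, int cpus_per_task)
{
    switch (d) {
    case DIST_BLOCK_BLOCK:
    case DIST_CYCLIC_BLOCK:
    case DIST_PLANE:
        return LAYOUT_BLOCK;
    case DIST_CYCLIC:
    case DIST_BLOCK:
    case DIST_CYCLIC_CYCLIC:
    case DIST_BLOCK_CYCLIC:
        return LAYOUT_CYCLIC;
    default:
        return cpus_per_task > 1 ? LAYOUT_MULTI : LAYOUT_CYCLIC;
    }
}

/* Assumed variant: the cyclic cases fall through to the default logic. */
enum layout pick_layout_alt(enum dist d, int cpus_per_task)
{
    switch (d) {
    case DIST_BLOCK_BLOCK:
    case DIST_CYCLIC_BLOCK:
    case DIST_PLANE:
        return LAYOUT_BLOCK;
    default:
        return cpus_per_task > 1 ? LAYOUT_MULTI : LAYOUT_CYCLIC;
    }
}

/* Checks the one case where the two models disagree (-c 6, cyclic). */
void model_self_check(void)
{
    assert(pick_layout(DIST_CYCLIC, 6) == LAYOUT_CYCLIC);
    assert(pick_layout_alt(DIST_CYCLIC, 6) == LAYOUT_MULTI);
}
```

In this model the two functions differ only for the four cyclic cases
with cpus_per_task > 1. Whether the real plugin should behave like the
second variant is exactly the open question above.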