Update:

If I remove not only CPUs from gres.conf but also Procs from the NodeName
line in slurm.conf, the second issue goes away. But this is not a solution.

Any ideas?

Best regards,
Taras

On Tue, Jul 23, 2013 at 8:00 PM, Taras Shapovalov <
[email protected]> wrote:

> Hi all,
>
> We have a SLURM cluster with 2 gpus per node. There are two quite
> interesting issues.
> I am sending both issues in a single email because I guess they are
> linked somehow.
>
> ISSUE 1:
>
> When CR_CORE_DEFAULT_DIST_BLOCK is set and a user requests --gres=gpu:1,
> slurmctld dies with a segmentation fault. With --gres=gpu:2 it
> works fine.
>
> I found that the segfault happens in
> ./src/plugins/select/cons_res/dist_tasks.c:
>
>         /*
>          * If SelectTypeParameters mentions to use a block distribution for
>          * cores by default, use that kind of distribution if no particular
>          * cores distribution specified.
>          * Note : cyclic cores distribution, which is the default, is treated
>          * by the next code block
>          */
>         if ( slurmctld_conf.select_type_param & CR_CORE_DEFAULT_DIST_BLOCK ) {
>                 switch(job_ptr->details->task_dist) {
>                 case SLURM_DIST_ARBITRARY:
>                 case SLURM_DIST_BLOCK:
>                 case SLURM_DIST_CYCLIC:
>                 case SLURM_DIST_UNKNOWN:
>                         _block_sync_core_bitmap(job_ptr, cr_type); /* <--- segfault in this call */
>                         return SLURM_SUCCESS;
>                 }
>         }
>
> Disabling CR_CORE_DEFAULT_DIST_BLOCK fixes the segfaults. In particular
> slurmctld dies on this line:
>
>                                 sufficient = sockets_cpu_cnt[s] >= req_cpus ;
>
> because s=3154116728 (according to gdb), which in turn (my guess) happens
> because ntasks_per_core=65535 in the same function. This looks like an
> integer overflow somewhere.
>
> Stack trace is attached.
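
A side note on the two numbers above, as a minimal sketch and not the actual Slurm code: 65535 is the all-ones value of a 16-bit unsigned field, which is what -1 or an "unlimited" marker turns into when stored in a uint16_t, and 3154116728 is far outside any plausible socket array. Whether ntasks_per_core really holds such a marker here is an assumption. The helper names, the element type of sockets_cpu_cnt and the bounds check below are hypothetical and only illustrate the kind of guard that would turn the crash into a clean failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 65535 is the largest value a 16-bit unsigned field can hold. */
static_assert(UINT16_MAX == 65535, "uint16_t tops out at 65535");

/* Hypothetical helper: true when a 16-bit per-core task limit holds the
 * all-ones pattern, i.e. what -1 or an "unlimited" marker becomes when
 * it is stored in a uint16_t. */
static bool limit_looks_unset(uint16_t ntasks_per_core)
{
        return ntasks_per_core == UINT16_MAX;
}

/* Hypothetical guard: validate the socket index before reading the
 * per-socket CPU counts (element type assumed for this sketch). */
static bool socket_has_enough_cpus(const uint16_t *sockets_cpu_cnt,
                                   size_t nsockets, uint32_t s,
                                   uint32_t req_cpus)
{
        if (s >= nsockets)
                return false;   /* s = 3154116728 is rejected here */
        return sockets_cpu_cnt[s] >= req_cpus;
}
```

With a two-socket array, socket_has_enough_cpus(cnt, 2, 3154116728u, 1) returns false instead of reading past the end of the array.
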
>
>
> ISSUE 2:
>
> When a user requests 2 gpus, the job is *always* rejected. For example:
>
> [roman@headnode ~]$ srun -N1 -c2 -n2 --gres=gpu:2 -p k20 hostname
> srun: error: Unable to allocate resources: Requested node configuration is not available
> [roman@headnode ~]$
>
> When cons_res is enabled:
>
> [root@headnode ~]# grep Select /etc/slurm/slurm.conf
> SelectType=select/cons_res
> #SelectTypeParameters=CR_Core,CR_CORE_DEFAULT_DIST_BLOCK
> SelectTypeParameters=CR_Core
>
> [root@headnode ~]# grep debug -i /etc/slurm/slurm.conf
> DebugFlags=Gres,CPU_BIND,Steps
> SlurmctldDebug=5
> SlurmdDebug=5
>
> then I see these errors in /var/log/slurmctld:
>
> [2013-07-24T01:03:36+08:00] cons_res: _can_job_run_on_node: 0 cpus on node007(0), mem 0/64000
> [2013-07-24T01:03:36+08:00] cons_res: _can_job_run_on_node: 0 cpus on node008(0), mem 0/64000
>
> When a user requests 1 gpu per node, it works fine:
>
> [2013-07-24T01:11:59+08:00] cons_res: _can_job_run_on_node: 8 cpus on node007(0), mem 0/1
> [2013-07-24T01:11:59+08:00] cons_res: _can_job_run_on_node: 8 cpus on node008(0), mem 0/1
>
> When cons_res is disabled and 2 gpus are requested, I see:
>
> [2013-07-24T01:17:56+08:00] gres: gpu state for job 3623
> [2013-07-24T01:17:56+08:00]   gres_cnt:2 node_cnt:0
> [2013-07-24T01:17:56+08:00] _pick_best_nodes: job 3623 never runnable
> [2013-07-24T01:17:56+08:00] debug:  (node_scheduler.c:165) job id: 3623 -- No nodes in bitmap of job_record!
> [2013-07-24T01:17:56+08:00] debug:  (node_scheduler.c:1785) job id: 3623 -- job_record->gres: (gpu:2), job_record->gres_alloc: ()
> [2013-07-24T01:17:56+08:00] debug:  (node_scheduler.c:1687) job id: 3623 -- job_record->gres: (gpu:2), job_record->gres_alloc: ()
> [2013-07-24T01:17:56+08:00] _slurm_rpc_allocate_resources: Requested node configuration is not available
>
> Nodes are configured this way:
>
> NodeName=node008 Arch=x86_64 CoresPerSocket=8
>    CPUAlloc=0 CPUErr=0 CPUTot=16 CPULoad=0.00 Features=(null)
>    Gres=gpu:2
>    NodeAddr=node008 NodeHostName=node008
>    OS=Linux RealMemory=64000 Sockets=2 Boards=1
>    State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
>    BootTime=2013-07-23T00:31:38 SlurmdStartTime=2013-07-24T00:07:13
>    CurrentWatts=0 LowestJoules=0 ConsumedJoules=0
>
> Each /etc/slurm/gres.conf contains these lines:
>
> Name=gpu File=/dev/nvidia0 CPUs=0-7
> Name=gpu File=/dev/nvidia1 CPUs=8-15
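
One reading that fits the logs above, offered only as a guess and not taken from the Slurm source: with 1 gpu the scheduler reports 8 cpus, which matches the size of a single CPUs= range, and with 2 gpus it reports 0. If the usable cores were restricted to those listed for every requested gpu, the two disjoint ranges would leave nothing. The sketch below just does that arithmetic on the two ranges; the helper names are hypothetical and the bitmask representation is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bitmask with bits first..last set,
 * for 0 <= first <= last <= 31. */
static uint32_t cpu_range_mask(unsigned first, unsigned last)
{
        assert(first <= last && last <= 31);
        uint32_t upto_last = (last == 31)
                ? UINT32_MAX
                : ((UINT32_C(1) << (last + 1)) - 1);
        uint32_t below_first = (UINT32_C(1) << first) - 1;
        return upto_last & ~below_first;
}

/* Count the CPUs present in a mask by clearing the lowest set bit
 * until none remain. */
static unsigned count_cpus(uint32_t mask)
{
        unsigned n = 0;
        while (mask) {
                mask &= mask - 1;
                n++;
        }
        return n;
}
```

For CPUs=0-7 and CPUs=8-15, count_cpus on either mask alone gives 8, the intersection gives 0 and the union gives 16. That would also be consistent with the issue disappearing once the CPUs= fields are removed, but it remains a hypothesis until checked against the gres and cons_res code.
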
>
> This issue may also be related to
> https://groups.google.com/forum/#!topic/slurm-devel/N5j1AjAbsbw
> but disabling CPU binding does not help.
>
> Any ideas about this puzzle are highly appreciated!
>
> Best regards,
> Taras
>
>
>
