[slurm-dev] SLURM 2.5.7 + GPUs

Taras Shapovalov Tue, 23 Jul 2013 11:03:58 -0700

Hi all,

We have a SLURM cluster with 2 GPUs per node, and we have run into two
quite interesting issues.
I am sending both issues in a single email because I guess they are
linked somehow.


ISSUE 1:

When CR_CORE_DEFAULT_DIST_BLOCK is set and a user requests --gres=gpu:1,
slurmctld dies with a segmentation fault. With --gres=gpu:2 it
works fine.

I found that the segfault happens in
./src/plugins/select/cons_res/dist_tasks.c:

        /*
         * If SelectTypeParameters mentions to use a block distribution for
         * cores by default, use that kind of distribution if no particular
         * cores distribution specified.
         * Note : cyclic cores distribution, which is the default, is treated
         * by the next code block
         */
        if ( slurmctld_conf.select_type_param & CR_CORE_DEFAULT_DIST_BLOCK ) {
                switch(job_ptr->details->task_dist) {
                case SLURM_DIST_ARBITRARY:
                case SLURM_DIST_BLOCK:
                case SLURM_DIST_CYCLIC:
                case SLURM_DIST_UNKNOWN:
                        _block_sync_core_bitmap(job_ptr, cr_type); <-------------------
                        return SLURM_SUCCESS;
                }
        }

Disabling CR_CORE_DEFAULT_DIST_BLOCK fixes the segfaults. In particular
slurmctld dies on this line:

                                sufficient = sockets_cpu_cnt[s] >= req_cpus;

because s=3154116728 (according to gdb). My guess is that this happens
because ntasks_per_core=65535 in the same function, which looks like an
integer overflow somewhere.
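
To make the suspicion above concrete, here is a minimal, standalone sketch
(not SLURM code). It assumes that 65535 is an all-ones 16-bit "no limit"
sentinel rather than a real task count, and it shows a bounds-checked form
of the comparison that crashes. Only sockets_cpu_cnt, req_cpus and
ntasks_per_core come from the real source; NTASKS_UNLIMITED,
effective_ntasks_per_core() and socket_is_sufficient() are hypothetical
names used for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 65535 (0xFFFF) is a 16-bit "unset/unlimited" sentinel. */
#define NTASKS_UNLIMITED ((uint16_t) 0xFFFF)

/*
 * Map the sentinel (or 0) to the number of threads per core, so that
 * later arithmetic never uses 65535 as if it were a real count.
 */
uint16_t effective_ntasks_per_core(uint16_t ntasks_per_core,
                                   uint16_t threads_per_core)
{
        if (ntasks_per_core == NTASKS_UNLIMITED || ntasks_per_core == 0)
                return threads_per_core;
        if (ntasks_per_core > threads_per_core)
                return threads_per_core;
        return ntasks_per_core;
}

/*
 * Bounds-checked version of "sockets_cpu_cnt[s] >= req_cpus".
 * Returns 0 for an out-of-range socket index instead of reading
 * past the end of the array.
 */
int socket_is_sufficient(const uint32_t *sockets_cpu_cnt, uint32_t sockets,
                         uint32_t s, uint32_t req_cpus)
{
        if (s >= sockets)
                return 0;
        return sockets_cpu_cnt[s] >= req_cpus;
}
```

With a check like this, an index such as s=3154116728 would be rejected
rather than dereferenced. Whether the garbage index really originates from
the 65535 value is still only a guess.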

Stack trace is attached.


ISSUE 2:

When a user requests 2 GPUs, the job is *always* rejected. For example:

[roman@headnode ~]$ srun -N1 -c2 -n2 --gres=gpu:2 -p k20 hostname
srun: error: Unable to allocate resources: Requested node configuration is not available
[roman@headnode ~]$

When cons_res is enabled:

[root@headnode ~]# grep Select /etc/slurm/slurm.conf
SelectType=select/cons_res
#SelectTypeParameters=CR_Core,CR_CORE_DEFAULT_DIST_BLOCK
SelectTypeParameters=CR_Core

[root@headnode ~]# grep debug -i /etc/slurm/slurm.conf
DebugFlags=Gres,CPU_BIND,Steps
SlurmctldDebug=5
SlurmdDebug=5

then I see these errors in /var/log/slurmctld:

[2013-07-24T01:03:36+08:00] cons_res: _can_job_run_on_node: 0 cpus on node007(0), mem 0/64000
[2013-07-24T01:03:36+08:00] cons_res: _can_job_run_on_node: 0 cpus on node008(0), mem 0/64000

When the user requests 1 GPU per node, it works fine:

[2013-07-24T01:11:59+08:00] cons_res: _can_job_run_on_node: 8 cpus on node007(0), mem 0/1
[2013-07-24T01:11:59+08:00] cons_res: _can_job_run_on_node: 8 cpus on node008(0), mem 0/1

When cons_res is disabled and 2 GPUs are requested, I see:

[2013-07-24T01:17:56+08:00] gres: gpu state for job 3623
[2013-07-24T01:17:56+08:00]   gres_cnt:2 node_cnt:0
[2013-07-24T01:17:56+08:00] _pick_best_nodes: job 3623 never runnable
[2013-07-24T01:17:56+08:00] debug:  (node_scheduler.c:165) job id: 3623 -- No nodes in bitmap of job_record!
[2013-07-24T01:17:56+08:00] debug:  (node_scheduler.c:1785) job id: 3623 -- job_record->gres: (gpu:2), job_record->gres_alloc: ()
[2013-07-24T01:17:56+08:00] debug:  (node_scheduler.c:1687) job id: 3623 -- job_record->gres: (gpu:2), job_record->gres_alloc: ()
[2013-07-24T01:17:56+08:00] _slurm_rpc_allocate_resources: Requested node configuration is not available

Nodes are configured this way:

NodeName=node008 Arch=x86_64 CoresPerSocket=8
   CPUAlloc=0 CPUErr=0 CPUTot=16 CPULoad=0.00 Features=(null)
   Gres=gpu:2
   NodeAddr=node008 NodeHostName=node008
   OS=Linux RealMemory=64000 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1
   BootTime=2013-07-23T00:31:38 SlurmdStartTime=2013-07-24T00:07:13
   CurrentWatts=0 LowestJoules=0 ConsumedJoules=0

Each /etc/slurm/gres.conf contains these lines:

Name=gpu File=/dev/nvidia0 CPUs=0-7
Name=gpu File=/dev/nvidia1 CPUs=8-15
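
One possible reading of the "0 cpus" versus "8 cpus" log lines, given this
gres.conf, is sketched below. It is purely a hypothesis and has not been
checked against the SLURM source: if the usable CPUs for a job were
computed as the intersection of the CPU sets of all requested GPUs (rather
than their union), a 2-GPU request would be left with no CPUs at all. The
helpers cpu_range_mask() and count_cpus() are hypothetical and exist only
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Build a 16-bit mask with bits first..last set (0 <= first <= last <= 15). */
uint16_t cpu_range_mask(unsigned first, unsigned last)
{
        uint16_t mask = 0;
        unsigned i;

        for (i = first; i <= last; i++)
                mask |= (uint16_t) (1u << i);
        return mask;
}

/* Count the CPUs (set bits) in a mask. */
unsigned count_cpus(uint16_t mask)
{
        unsigned n = 0;

        while (mask) {
                n += mask & 1u;
                mask >>= 1;
        }
        return n;
}
```

With gpu0 = cpu_range_mask(0, 7) and gpu1 = cpu_range_mask(8, 15), one GPU
alone leaves 8 CPUs, gpu0 & gpu1 leaves 0 CPUs, and gpu0 | gpu1 leaves all
16. The first two numbers match the values reported by
_can_job_run_on_node, which is the only reason for suspecting this.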

This issue may also be related to
https://groups.google.com/forum/#!topic/slurm-devel/N5j1AjAbsbw
but disabling CPU binding does not help.

Any ideas about this puzzle are highly appreciated!

Best regards,
Taras

bt_slurm.log
Description: Binary data