[slurm-dev] RE: Ressouces allocation problem

David Roman Mon, 15 Feb 2016 07:45:12 -0800

I have no choice: GANG is required for SUSPEND, because the GANG plugin 
includes the SUSPEND mechanism (if I understand the documentation correctly).



Below is part of my slurmctld log (option -Dvvvv).
I don't see any errors in it.

slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_jobs, no change
slurmctld: debug2: Processing RPC: REQUEST_SUBMIT_BATCH_JOB from uid=11239
slurmctld: debug3: JobDesc: user_id=11239 job_id=N/A partition=LDEV 
name=test_ldev
slurmctld: debug3:    cpus=32-4294967294 pn_min_cpus=-1 core_spec=-1
slurmctld: debug3:    -N min-[max]: 4294967294-[4294967294]:65534:65534:65534
slurmctld: debug3:    pn_min_memory_job=500 pn_min_tmp_disk=-1
slurmctld: debug3:    immediate=0 features=(null) reservation=(null)
slurmctld: debug3:    req_nodes=(null) exc_nodes=(null) gres=(null)
slurmctld: debug3:    time_limit=10-10 priority=-1 contiguous=0 shared=0
slurmctld: debug3:    kill_on_node_fail=-1 script=#!/bin/bash

#SBATCH --ntasks=32
#SBATCH...
slurmctld: debug3:    argv="/NOVELTIS/roman/tmp2/slurm/test_ldev"
slurmctld: debug3:    
environment=MKLROOT=/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/mkl,MANPATH=/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/impi/5.0.2.044/man:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/man/en_US:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/man/en_US:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/debugger/gdb/intel64/share/man/:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/debugger/gdb/intel64_mic/share/man/:,HOSTNAME=slurm.noveltis.loc,...
slurmctld: debug3:    stdin=/dev/null 
stdout=/NOVELTIS/roman/tmp2/slurm/DADA-%j.out 
stderr=/NOVELTIS/roman/tmp2/slurm/DADA-%j.err
slurmctld: debug3:    work_dir=/NOVELTIS/roman/tmp2/slurm 
alloc_node:sid=slurm:3082
slurmctld: debug3:    sicp_mode=0 power_flags=
slurmctld: debug3:    resp_host=(null) alloc_resp_port=0 other_port=0
slurmctld: debug3:    dependency=(null) account=(null) qos=(null) comment=(null)
slurmctld: debug3:    mail_type=271 [email protected] nice=0 
num_tasks=32 open_mode=0 overcommit=-1 acctg_freq=(null)
slurmctld: debug3:    network=(null) begin=Unknown cpus_per_task=-1 requeue=-1 
licenses=(null)
slurmctld: debug3:    end_time= signal=0@0 wait_all_nodes=-1 cpu_freq=
slurmctld: debug3:    ntasks_per_node=-1 ntasks_per_socket=-1 ntasks_per_core=-1
slurmctld: debug3:    mem_bind=65534:(null) plane_size:65534
slurmctld: debug3:    array_inx=(null)
slurmctld: debug3:    burst_buffer=(null)
slurmctld: debug3: before alteration asking for nodes 1-4294967294 cpus 
32-4294967294
slurmctld: debug3: after alteration asking for nodes 1-4294967294 cpus 
32-4294967294
slurmctld: debug2: found 2 usable nodes from config containing hpc-node[1-2]
slurmctld: debug3: _pick_best_nodes: job 60 idle_nodes 2 share_nodes 2
slurmctld: debug2: select_p_job_test for job 60
slurmctld: debug3: powercapping: checking job 60 : skipped, capping disabled
slurmctld: debug2: sched: JobId=60 allocated resources: NodeList=(null)
slurmctld: _slurm_rpc_submit_batch_job JobId=60 usec=731
slurmctld: debug3: Writing job id 60 to header record of job_state file
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_nodes, no change
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug2: Testing job time limits and checkpoints
slurmctld: debug:  sched: Running job scheduler
slurmctld: debug2: found 2 usable nodes from config containing hpc-node[1-2]
slurmctld: debug3: _pick_best_nodes: job 60 idle_nodes 2 share_nodes 2
slurmctld: debug2: select_p_job_test for job 60
slurmctld: debug3: powercapping: checking job 60 : skipped, capping disabled
slurmctld: debug3: cons_res: _add_job_to_res: job 60 act 0 
slurmctld: debug3: cons_res: adding job 60 to part LDEV row 0
slurmctld: email msg to [email protected]: SLURM Job_id=60 Name=test_ldev 
Began, Queued time 00:00:01
slurmctld: debug3: sched: JobId=60 initiated
slurmctld: sched: Allocate JobID=60 NodeList=hpc-node[1-2] #CPUs=32
slurmctld: debug2: Spawning RPC agent for msg_type REQUEST_BATCH_JOB_LAUNCH
slurmctld: debug2: Performing full system state save
slurmctld: debug2: got 1 threads to send out
slurmctld: debug2: Tree head got back 0 looking for 1
slurmctld: debug3: Tree sending to hpc-node1
slurmctld: debug4: orig_timeout was 10000 we have 0 steps and a timeout of 10000
slurmctld: debug2: Tree head got back 1
slurmctld: debug2: Tree head got back 1
slurmctld: debug2: node_did_resp hpc-node1
slurmctld: debug:  backfill: beginning
slurmctld: debug:  backfill: no jobs to backfill
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug2: Processing RPC: REQUEST_PARTITION_INFO uid=11239
slurmctld: debug2: _slurm_rpc_dump_partitions, size=590 usec=50
slurmctld: debug3: Writing job id 60 to header record of job_state file
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug2: Processing RPC: REQUEST_SUBMIT_BATCH_JOB from uid=11239
slurmctld: debug3: JobDesc: user_id=11239 job_id=N/A partition=LALL 
name=test_lall
slurmctld: debug3:    cpus=16-4294967294 pn_min_cpus=-1 core_spec=-1
slurmctld: debug3:    -N min-[max]: 4294967294-[4294967294]:65534:65534:65534
slurmctld: debug3:    pn_min_memory_job=500 pn_min_tmp_disk=-1
slurmctld: debug3:    immediate=0 features=(null) reservation=(null)
slurmctld: debug3:    req_nodes=(null) exc_nodes=(null) gres=(null)
slurmctld: debug3:    time_limit=10-10 priority=-1 contiguous=0 shared=0
slurmctld: debug3:    kill_on_node_fail=-1 script=#!/bin/bash

#SBATCH --ntasks=16
#SBATCH...
slurmctld: debug3:    argv="/NOVELTIS/roman/tmp2/slurm/test_lall"
slurmctld: debug3:    
environment=MKLROOT=/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/mkl,MANPATH=/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/impi/5.0.2.044/man:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/man/en_US:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/man/en_US:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/debugger/gdb/intel64/share/man/:/PROJETS/LIBRARIES_AND_TOOLS/tools/Intel/composer_xe_2015.1.133/debugger/gdb/intel64_mic/share/man/:,HOSTNAME=slurm.noveltis.loc,...
slurmctld: debug3:    stdin=/dev/null 
stdout=/NOVELTIS/roman/tmp2/slurm/DADA-%j.out 
stderr=/NOVELTIS/roman/tmp2/slurm/DADA-%j.err
slurmctld: debug3:    work_dir=/NOVELTIS/roman/tmp2/slurm 
alloc_node:sid=slurm:3082
slurmctld: debug3:    sicp_mode=0 power_flags=
slurmctld: debug3:    resp_host=(null) alloc_resp_port=0 other_port=0
slurmctld: debug3:    dependency=(null) account=(null) qos=(null) comment=(null)
slurmctld: debug3:    mail_type=271 [email protected] nice=0 
num_tasks=16 open_mode=0 overcommit=-1 acctg_freq=(null)
slurmctld: debug3:    network=(null) begin=Unknown cpus_per_task=-1 requeue=-1 
licenses=(null)
slurmctld: debug3:    end_time= signal=0@0 wait_all_nodes=-1 cpu_freq=
slurmctld: debug3:    ntasks_per_node=-1 ntasks_per_socket=-1 ntasks_per_core=-1
slurmctld: debug3:    mem_bind=65534:(null) plane_size:65534
slurmctld: debug3:    array_inx=(null)
slurmctld: debug3:    burst_buffer=(null)
slurmctld: debug3: before alteration asking for nodes 1-4294967294 cpus 
16-4294967294
slurmctld: debug3: after alteration asking for nodes 1-4294967294 cpus 
16-4294967294
slurmctld: debug2: found 1 usable nodes from config containing hpc-node[1-2]
slurmctld: debug3: _pick_best_nodes: job 61 idle_nodes 0 share_nodes 0
slurmctld: debug2: select_p_job_test for job 61
slurmctld: debug3: powercapping: checking job 61 : skipped, capping disabled
slurmctld: debug2: sched: JobId=61 allocated resources: NodeList=(null)
slurmctld: _slurm_rpc_submit_batch_job JobId=61 usec=1522
slurmctld: debug3: Writing job id 61 to header record of job_state file
slurmctld: debug:  sched: Running job scheduler
slurmctld: debug2: found 1 usable nodes from config containing hpc-node[1-2]
slurmctld: debug3: _pick_best_nodes: job 61 idle_nodes 0 share_nodes 0
slurmctld: debug2: select_p_job_test for job 61
slurmctld: debug3: powercapping: checking job 61 : skipped, capping disabled
slurmctld: debug3: cons_res: _add_job_to_res: job 61 act 0 
slurmctld: debug3: cons_res: adding job 61 to part LALL row 0
slurmctld: email msg to [email protected]: SLURM Job_id=61 Name=test_lall 
Began, Queued time 00:00:01
slurmctld: debug3: sched: JobId=61 initiated
slurmctld: sched: Allocate JobID=61 NodeList=hpc-node2 #CPUs=16
slurmctld: debug2: Spawning RPC agent for msg_type REQUEST_BATCH_JOB_LAUNCH
slurmctld: debug2: Performing full system state save
slurmctld: debug2: got 1 threads to send out
slurmctld: debug2: Tree head got back 0 looking for 1
slurmctld: debug3: Tree sending to hpc-node2
slurmctld: debug4: orig_timeout was 10000 we have 0 steps and a timeout of 10000
slurmctld: debug2: Tree head got back 1
slurmctld: debug2: Tree head got back 1
slurmctld: debug2: node_did_resp hpc-node2
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug2: Processing RPC: REQUEST_PARTITION_INFO uid=11239
slurmctld: debug2: _slurm_rpc_dump_partitions, size=590 usec=166
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: Writing job id 61 to header record of job_state file
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_nodes, no change
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_jobs, no change
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_nodes, no change
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_jobs, no change
slurmctld: debug:  sched: Running job scheduler
slurmctld: Warning: Note very large processing time from schedule: usec=3460827 
began=15:13:33.301
slurmctld: Warning: Note very large processing time from _slurmctld_background: 
usec=28284088 began=15:13:08.477
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_jobs, no change
slurmctld: debug2: slurm_send_timeout: Socket no longer there
slurmctld: debug3: slurm_msg_sendto: peer has disappeared for msg_type=8001
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: Processing RPC: REQUEST_NODE_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_nodes, no change
slurmctld: debug3: Processing RPC: REQUEST_JOB_INFO from uid=11239
slurmctld: debug3: _slurm_rpc_dump_jobs, no change
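
The two "sched: Allocate" lines are the part of this log that shows the 
problem: job 60 is given hpc-node[1-2] and job 61 is then given hpc-node2 
as well. Here is a minimal sketch, assuming Python 3, that pulls those lines 
out of a log and lists the jobs placed on each node. The function names 
(expand_nodelist, jobs_per_node) are made up for this example, and the node 
list expansion only handles the simple "prefix[1-2]" form seen here, not the 
full Slurm hostlist syntax.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   slurmctld: sched: Allocate JobID=60 NodeList=hpc-node[1-2] #CPUs=32
ALLOC_RE = re.compile(r"sched: Allocate JobID=(\d+) NodeList=(\S+) #CPUs=(\d+)")


def expand_nodelist(nodelist):
    """Expand a simple host expression like 'hpc-node[1-2]' into host names.

    Only one bracket group with comma-separated numbers or ranges is
    handled; anything else is returned unchanged as a single host.
    """
    m = re.fullmatch(r"([^\[\]]+)\[([^\[\]]+)\]", nodelist)
    if not m:
        return [nodelist]
    prefix, body = m.groups()
    hosts = []
    for part in body.split(","):
        lo, _, hi = part.partition("-")
        hi = hi or lo
        hosts.extend(f"{prefix}{i}" for i in range(int(lo), int(hi) + 1))
    return hosts


def jobs_per_node(log_text):
    """Return {node: [(job_id, total_cpus_of_job), ...]} from a slurmctld log.

    The CPU figure is the total for the whole job, not the share on that node.
    """
    nodes = defaultdict(list)
    for job_id, nodelist, cpus in ALLOC_RE.findall(log_text):
        for host in expand_nodelist(nodelist):
            nodes[host].append((int(job_id), int(cpus)))
    return dict(nodes)


# The two allocation lines copied from the log above.
LOG = """\
slurmctld: sched: Allocate JobID=60 NodeList=hpc-node[1-2] #CPUs=32
slurmctld: sched: Allocate JobID=61 NodeList=hpc-node2 #CPUs=16
"""

for host, jobs in sorted(jobs_per_node(LOG).items()):
    print(host, jobs)
```

Any node listed with more than one job is shared between partitions, which 
is the situation I want to avoid.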




-----Original Message-----
From: Diego Zuccato [mailto:[email protected]] 
Sent: Monday, 15 February 2016 14:47
To: slurm-dev <[email protected]>
Subject: [slurm-dev] RE: Ressouces allocation problem


On 15/02/2016 12:55, David Roman wrote:

> What I want :
> OPA has a bigger priority than other partitions. OPA cant preempt jobs 
> submitted in partitions LDEV, LOP, LALL.
> LDEV, LOP, LALL have the same priority.
> LDEV can't suspend jobs submitted in LOP or LALL.
> LOP and LALL can't suspend jobs submitted in LDEV.
Ok.

> In practice:
> 1- I submit a job A in LOP : A running (it is ok for me)
> 2- I submit a job B in LALL : A and B running (it is ok for me)
> 3- I submit a job C in LDEV : A and B and C running (it is not ok for 
> me) For me, the job C must be in PENDING state !!!
Yes, it should be since there are no consumables left.
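
The behaviour expected here can be sketched as plain bookkeeping of a 
consumable resource. This is only an illustration in Python 3, not Slurm's 
actual selection algorithm: the schedule function and the CPU counts (32 
CPUs in total, 16 per job) are hypothetical values chosen for the example.

```python
def schedule(free_cpus, jobs):
    """Start each job in order if enough CPUs remain, else leave it pending.

    jobs is a list of (name, cpus) pairs; returns {name: state}.
    """
    states = {}
    for name, cpus in jobs:
        if cpus <= free_cpus:
            free_cpus -= cpus
            states[name] = "RUNNING"
        else:
            states[name] = "PENDING"
    return states


# Hypothetical sizes: 32 CPUs in the cluster, jobs A, B and C ask for 16 each.
print(schedule(32, [("A", 16), ("B", 16), ("C", 16)]))
```

With A and B holding all the CPUs, C has nothing left to consume and stays 
pending until one of them finishes.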

Another thing that could help you pinpoint the error is enabling scheduler 
verbose logging.

IMVHO it's due to your use of PreemptMode, which includes GANG:
GANG
    enables gang scheduling (time slicing) of jobs in the same partition. NOTE: 
Gang scheduling is performed independently for each partition, so configuring 
partitions with overlapping nodes and gang scheduling is generally not 
recommended.

But to use SUSPEND it seems you have to use GANG too... I don't know more, 
sorry. Maybe some expert can give a definitive answer.

--
Diego Zuccato
Servizi Informatici
Dip. di Fisica e Astronomia (DIFA) - Università di Bologna V.le Berti-Pichat 
6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786
mail: [email protected]
