Hi all,

My job running in a low-priority partition is not requeued the way I expect
when it is preempted by a job in a high-priority partition.
My slurm.conf is below:


# -----------------------------------------------------
PreemptType=preempt/partition_prio
PreemptMode=REQUEUE
JobRequeue=1
NodeName=cp01-sys-hic-gpu-01.cp01.baidu.com,cp01-sys-hic-gpu-00.cp01.baidu.com CPUs=12 RealMemory=500 Sockets=2 CoresPerSocket=6 ThreadsPerCore=1 Procs=1 State=IDLE Gres=gpu:2
PartitionName=hig Default=NO Priority=65533 Nodes=cp01-sys-hic-gpu-00.cp01.baidu.com MaxTime=INFINITE State=UP
PartitionName=low AllowGroups=ALL Default=yes Priority=100 Nodes=cp01-sys-hic-gpu-01.cp01.baidu.com,cp01-sys-hic-gpu-00.cp01.baidu.com MaxTime=INFINITE State=UP
# -----------------------------------------------------


First I submit a job (job1) to partition low, and it runs on
cp01-sys-hic-gpu-00.cp01.baidu.com.
Then I submit a job (job2) to partition hig, and it also runs on
cp01-sys-hic-gpu-00.cp01.baidu.com. job1 is preempted and goes to state PD.
Only after job2 finishes is job1 rerun.

How can I make job1 run on cp01-sys-hic-gpu-01.cp01.baidu.com as soon as it
is preempted? At that point job2 is running on
cp01-sys-hic-gpu-00.cp01.baidu.com in partition hig, but partition low still
has an idle node. I would expect job1 to be requeued and rerun on
cp01-sys-hic-gpu-01.cp01.baidu.com right away.
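
For reference, the steps above correspond roughly to the commands below. This
is only a sketch: the "sleep 600" job bodies and the job names are
placeholders for the real workload, and the run helper is a hypothetical
wrapper that just prints each command on a machine where the Slurm client
tools are not installed.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Job bodies and names are placeholders.

# Print each command; execute it only if the Slurm client tools are present.
run() {
    echo "+ $*"
    if command -v sbatch >/dev/null 2>&1; then
        "$@"
    fi
}

# job1: low-priority partition, explicitly allowed to be requeued.
# In my case it was placed on cp01-sys-hic-gpu-00.cp01.baidu.com.
run sbatch -p low --requeue -J job1 --wrap="sleep 600"

# job2: high-priority partition, which contains only gpu-00,
# so it preempts job1 there.
run sbatch -p hig -J job2 --wrap="sleep 600"

# Show job id, partition, name, state, node list and pending reason.
run squeue -o "%i %P %j %T %N %r"
```

After job2 starts, the squeue output shows job1 as pending instead of running
on the idle node in partition low.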


Best,

shengzhao wen
