Hi, could you try submitting the following script:
Script job.sh:
******************************
#!/bin/bash
#SBATCH -p test-low
#SBATCH -n 3
#SBATCH -t 12:00:00
sig_term()
{
echo "function sig_term called. Exiting"
echo 'sig_term' > slask_term
echo $(date) >> slask_term
}
# associate the function "sig_term" with the TERM signal
trap 'sig_term' SIGTERM
sleep 1000 &
wait $!
******************************
and see if you catch the first SIGTERM. When I tried this, the signal was
only caught at the end of the grace time.
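
The handler logic can also be checked locally, without Slurm, to rule out the script itself. The snippet below is only a rough sketch: the inner script is a cut-down stand-in for job.sh, the 1-second delay and the self-sent TERM are assumptions that imitate whatever signal Slurm delivers, and the names output, elapsed and sleep_pid are made up for the test. It relies on the documented bash behaviour that a trapped signal interrupts the wait builtin at once, whereas a foreground sleep would delay the handler until sleep finishes.

```shell
#!/bin/bash
# Local sanity check of the trap/wait pattern; no Slurm involved.
start=$(date +%s)

output=$(
    bash -c '
        # Same idea as sig_term() in job.sh, reduced to one message.
        trap "echo sig_term caught" TERM

        sleep 30 &
        sleep_pid=$!

        # Stand-in for the preemption signal: TERM this shell after 1 s.
        ( sleep 1; kill -TERM $$ ) &

        # wait is interrupted as soon as the trapped signal arrives.
        wait "$sleep_pid"

        # Clean up the background sleep so nothing lingers.
        kill "$sleep_pid" 2>/dev/null
    '
)

elapsed=$(( $(date +%s) - start ))
echo "output: $output"
echo "elapsed: ${elapsed}s"
```

If the handler message shows up after roughly one second rather than thirty, the trap itself works, and a late catch under Slurm would point at when the signal is actually delivered to the batch script.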
(I'll try your settings as soon as my system is up again.)

Regards,
/jon

On 11/20/2017 04:21 PM, Ailing Zhang wrote:
Hi slurm community,

I'm testing partition-based preemption. Partitions test-high and test-low share the same nodes. I set GraceTime=600 and PreemptMode=CANCEL in test-low. But once I submit a job to test-high, the job in test-low is killed immediately, without any grace time.

Here are my configs.

PartitionName=test-low
   AllowGroups=admins AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=N/A
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=600 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=02:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=node[100-102]
   PriorityJobFactor=10 PriorityTier=10 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=CANCEL
   State=UP TotalCPUs=100 TotalNodes=3 SelectTypeParameters=NONE
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

PartitionName=test-high
   AllowGroups=admins AllowAccounts=ALL AllowQos=ALL
   AllocNodes=ALL Default=NO QoS=N/A
   DefaultTime=NONE DisableRootJobs=NO ExclusiveUser=NO GraceTime=0 Hidden=NO
   MaxNodes=UNLIMITED MaxTime=02:00:00 MinNodes=1 LLN=NO MaxCPUsPerNode=UNLIMITED
   Nodes=node[100-102]
   PriorityJobFactor=30 PriorityTier=30 RootOnly=NO ReqResv=NO OverSubscribe=NO
   OverTimeLimit=NONE PreemptMode=OFF
   State=UP TotalCPUs=100 TotalNodes=3 SelectTypeParameters=NONE
   DefMemPerNode=UNLIMITED MaxMemPerNode=UNLIMITED

Any help will be much appreciated.

Thanks!
Ailing
