Hi, our cluster is set up with the configuration below, yet we have been having a lot of jobs cancelled when preempted:

slurmd[node004]: *** JOB 79188 CANCELLED AT 2014-08-05T15:31:41 DUE TO PREEMPTION ***

I thought the settings would simply suspend the job instead of canceling it.

cheers,
satra

Partial configuration
---------------------------
PreemptMode=GANG,SUSPEND
PreemptType=preempt/partition_prio
# default
SchedulerTimeSlice=30
DefMemPerCPU=2048
DefMemPerNode=2048
PartitionName=DEFAULT MaxTime=7-0 DefaultTime=24:00:00

# Partitions
PartitionName=defq Default=NO MinNodes=1 DefaultTime=1-00:00:00 MaxTime=7-00:00:00 AllowGroups=ALL Priority=1 DisableRootJobs=NO RootOnly=NO Hidden=YES Shared=NO GraceTime=0 ReqResv=NO PreemptMode=GANG,SUSPEND State=UP
PartitionName=om_all_nodes Default=YES MinNodes=1 DefaultTime=1-00:00:00 MaxTime=7-00:00:00 AllowGroups=ALL Priority=1 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=FORCE:4 GraceTime=0 ReqResv=NO PreemptMode=GANG,SUSPEND State=UP Nodes=node[001-030]
PartitionName=om_interactive Default=NO MinNodes=1 MaxNodes=1 DefaultTime=01:00:00 MaxTime=01:00:00 AllowGroups=ALL Priority=10 DisableRootJobs=NO RootOnly=NO Hidden=NO Shared=FORCE:1 GraceTime=0 MaxCPUsPerNode=32 ReqResv=NO PreemptMode=GANG,SUSPEND State=UP Nodes=node017
