Hi Pär,
This one-line patch should fix it:
https://github.com/SchedMD/slurm/commit/3de1494694b24a99f9294c020f8e85f474119698
Quoting [email protected]:
Hi,
We are installing a new cluster with Slurm 14.11.3. I am currently
updating and testing our various scripts (job_submit.lua,
PrologSlurmctld, Epilog...), as most of our other clusters are still
running the rather ancient Slurm version 2.4.5.
How can I determine if a job was submitted with the --exclusive option
in Slurm 14.11?
Our partitions are configured with Shared=NO. In Slurm 2.4 "scontrol
show job" shows Shared=0 for --exclusive jobs, and Shared=OK
otherwise. "squeue -o %h" can also be used and will show yes or no.
We check this in our Epilog script, and do a more thorough cleaning of
nodes when a node-exclusive job finishes. For jobs using only parts of a
node we need to be more careful, as other jobs can be running on the
node, or get started while the Epilog is running.
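
A minimal sketch of the check described above, assuming the Slurm 2.4
behaviour where "scontrol show job" prints Shared=0 for --exclusive jobs
and Shared=OK otherwise. The function names and the abbreviated sample
output are hypothetical and only illustrate the idea; real scontrol
output contains many more fields.

```python
import re


def shared_field(scontrol_output):
    """Return the value of the Shared= field, or None if it is absent."""
    match = re.search(r"(?:^|\s)Shared=(\S+)", scontrol_output)
    return match.group(1) if match else None


def is_node_exclusive(scontrol_output):
    # Assumption: Slurm 2.4 semantics as described in this message, where
    # Shared=0 means --exclusive and Shared=OK means not exclusive.
    return shared_field(scontrol_output) == "0"


# Hypothetical, abbreviated samples of "scontrol show job" output.
exclusive_sample = "JobId=101 Name=a\n   Shared=0 Contiguous=0\n"
shared_sample = "JobId=102 Name=b\n   Shared=OK Contiguous=0\n"

print(is_node_exclusive(exclusive_sample))
print(is_node_exclusive(shared_sample))
```

In an Epilog the text would come from running "scontrol show job" for
the finishing job. A check like this only distinguishes the two cases
when the Shared field actually differs between --exclusive and other
jobs.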
Unfortunately, in Slurm 14.11 both --exclusive and other jobs show
Shared=0 in scontrol, and "no" when using squeue -o %h. For squeue the
change is even documented in the man page:
2.4:
%h Can the nodes allocated to the job be shared with other
jobs. (Valid for jobs only)
14.11:
%h Can the resources allocated to the job be shared with
other jobs. The resources to be shared can be nodes,
sockets, cores, or hyperthreads depending upon
configuration. The value will be "yes" if the job was
submitted with the shared option or the partition is
configured with Shared=Force. (Valid for jobs only)
--
Pär Lindfors, NSC
--
Morris "Moe" Jette
CTO, SchedMD LLC
Commercial Slurm Development and Support