On 22/05/17 19:57, Baker D.J. wrote:

> I’ve recently started using slurm v17.02.2, however something seems very
> odd. For some reason, when for example jobs fail or exceed their
> walltime limit, I see that compute nodes are being placed in drained or
> draining state. Does anyone understand what might be wrong?

Anything setting a drain state is meant to also set a reason; what does
"scontrol show node $NODE" say for these?

Also, are there any relevant messages in your slurmctld and slurmd logs?

Best of luck,
Chris

-- 
Christopher Samuel Senior Systems Administrator
Melbourne Bioinformatics - The University of Melbourne
Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
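
The checks suggested in the reply above could be run roughly as follows. This is only a sketch: the helper name node_state_reason and the placeholder node name node001 are made up for illustration, and the log locations depend on the SlurmctldLogFile and SlurmdLogFile settings in your own slurm.conf.

```shell
#!/bin/sh
# Hypothetical helper: pull the State= and Reason= fields out of
# `scontrol show node` output read on stdin.
node_state_reason() {
    grep -oE 'State=[^ ]+|Reason=.*'
}

# Only query Slurm when the client tools are actually installed.
if command -v scontrol >/dev/null 2>&1; then
    # node001 is a placeholder; set NODE to one of your drained nodes.
    NODE="${NODE:-node001}"

    # List down/drained/draining nodes together with the recorded reason.
    sinfo -R

    # Show the state and drain reason for one specific node.
    scontrol show node "$NODE" | node_state_reason

    # Find out where slurmctld and slurmd write their logs on this cluster.
    scontrol show config | grep -i 'LogFile'
fi
```

Once the underlying cause is understood and fixed, a drained node can normally be returned to service with "scontrol update NodeName=$NODE State=RESUME".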