Slurm version 2.5.3 is now available with the bug fixes listed below. Of particular note, SchedMD has been working with the Swiss National Supercomputing Centre to identify and fix a Slurm bug that can cause the slurmctld daemon to terminate with an invalid memory reference. Several sites may have reported this bug in the past couple of weeks.
The latest versions of Slurm are available from http://www.schedmd.com/#repos

The fix for the invalid memory reference is available from https://github.com/SchedMD/slurm/commit/ff26cc50db9e2fe2f9745a16c8c59fd3e0bd7ae8

* Changes in SLURM 2.5.3
========================
 -- Gres/gpu plugin - If no GPUs requested, set CUDA_VISIBLE_DEVICES=NoDevFiles.
    This bug was introduced in 2.5.2 for the case where a GPU count was
    configured, but without device files.
 -- task/affinity plugin - Fix bug in CPU masks for some processors.
 -- Modify sacct command to get format from SACCT_FORMAT environment variable.
 -- BGQ - Changed order of library inclusions and fixed incorrect declaration
    to compile correctly on newer compilers
 -- Fix for not building sview if glib exists on a system but not the gtk libs.
 -- BGQ - Fix for handling a job cleanup on a small block if the job has long
    since left the system.
 -- Fix race condition in job dependency logic which can result in invalid
    memory reference.
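
For job scripts that need to react to the Gres/gpu change listed in this release, a minimal sketch in Python follows. It relies only on what the changelog states: CUDA_VISIBLE_DEVICES is set to NoDevFiles when a job requests no GPUs. The function name visible_gpu_ids is hypothetical and not part of Slurm, and treating an unset variable as "no restriction" is an assumption about the job environment, not documented Slurm behavior.

```python
import os


def visible_gpu_ids(env=None):
    """Interpret CUDA_VISIBLE_DEVICES as seen by a job step.

    Hypothetical helper, not part of Slurm. Returns:
      None  if the variable is unset (assumed: no restriction applied),
      []    if it is empty or holds the NoDevFiles placeholder,
      list  of device identifier strings otherwise.
    """
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    # Per the 2.5.3 changelog, NoDevFiles is set when no GPUs were requested.
    if value.strip() in ("", "NoDevFiles"):
        return []
    # Identifiers are kept as strings; they are not assumed to be integers.
    return [tok.strip() for tok in value.split(",") if tok.strip()]
```

A script could call visible_gpu_ids() at startup and fall back to a CPU-only code path when the result is an empty list.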
