Slurm version 14.03.10 includes quite a few relatively minor bug fixes,
and will most likely be the last 14.03 release. Thanks to all those who
helped make this a very stable release.
We hope to officially tag 14.11.0 before SC14. Version 14.11.0-rc3
includes a few bug fixes discovered in recent testing but is looking
very stable. Thanks to everyone participating in the testing! If you
can, please test this release so we can attempt to fix as many issues as
we can before we tag 14.11.0.
Just a heads up: development of version 15.08 has already started, and we
will most likely tag a pre1 later this month.
Slurm downloads are available from http://www.schedmd.com/#repos.
Here are some snips from the NEWS file on what has changed since the
last releases.
* Changes in Slurm 14.03.10
===========================
-- Fix a few sacctmgr error messages.
-- Treat non-zero SlurmSchedLogLevel without SlurmSchedLogFile as a fatal
error.
-- Correct sched_config.html documentation: SchedulingParameters
should be SchedulerParameters.
-- When using gres and cgroup ConstrainDevices set correct access
permission for the batch step.
-- Fix minor memory leak in jobcomp/mysql on slurmctld reconfig.
-- Fix bug that prevented preservation of a job's GRES bitmap on slurmctld
restart or reconfigure (bug was introduced in 14.03.5 "Clear record of a
job's gres when requeued" and only applies when GRES mapped to specific
files).
-- BGQ: Fix race condition when job fails due to hardware failure and is
requeued. Previous code could result in slurmctld abort with NULL
pointer.
-- Prevent negative job array index, which could cause slurmctld to crash.
-- Fix issue with squeue/scontrol showing correct node_cnt when only tasks
are specified.
-- Check the status of the database connection before using it.
-- ALPS - If an allocation requests -n set the BASIL -N option to the
number of tasks / number of nodes.
-- ALPS - Don't set the env var APRUN_DEFAULT_MEMORY, it is not needed
anymore.
-- Fix potential buffer overflow.
-- Give better estimates on pending node count if no node count is
requested.
-- BLUEGENE - Fix issue where requeuing jobs could cause an assert.
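
As an illustration of the SlurmSchedLogLevel change above, a slurm.conf
fragment that sets both options together might look like the sketch below.
The log path is a hypothetical placeholder, not taken from the NEWS file.

    # slurm.conf (sketch; the path is an assumed example)
    SlurmSchedLogFile=/var/log/slurm/sched.log
    SlurmSchedLogLevel=1

With 14.03.10, a non-zero SlurmSchedLogLevel with no SlurmSchedLogFile is
treated as a fatal error.
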
* Changes in Slurm 14.11.0rc3
=============================
-- Allow envs to override autotools binaries in autogen.sh
-- Added system services files.
-- If a job pends with DependencyNeverSatisfied, keep it pending even after
the job which it was depending upon was cleaned.
-- Let operators (in addition to user root and SlurmUser) see job script for
other users' jobs.
-- Perl API modified to return node state of MIXED rather than ALLOCATED if
only some CPUs allocated.
-- Double Munge connect retry timeout from 1 to 2 seconds.
-- sview - Remove unneeded code that was resolved globally in commit
98e24b0dedc.
-- Collect and report the accounting of the batch step and its children.
-- Add configure checks for faccessat and eaccess, and make use of one of
them if available.
-- Make configure --enable-developer also set --enable-debug
-- Introduce a SchedulerParameters option kill_invalid_depend; if set,
jobs pending with an invalid dependency are terminated.
-- Move spank_user_task() call in slurmstepd after the task_g_pre_launch()
so that the task affinity information is available to spank.
-- Make /etc/init.d/slurm script return value 3 when the daemon is
not running. This is required by Linux Standard Base Core
Specification 3.1.
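
For sites that want to try the new kill_invalid_depend behavior, enabling it
would presumably be a one-line change in slurm.conf, sketched below. This
assumes no other SchedulerParameters are already set; if some are, the option
would be appended to the existing comma-separated list.

    # slurm.conf (sketch)
    SchedulerParameters=kill_invalid_depend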