Greetings everyone.
We are pleased to announce the release of Slurm 15.08.0! It contains many new
features and performance enhancements. Please read the RELEASE_NOTES
file to get an idea of the new items that have been added. The on-line
Slurm documentation has been updated to reflect this release. Thanks to
everyone who helped with this release.
Some notable changes are listed here.
-- Added TRES (Trackable resources) to track utilization of memory, GRES,
burst buffer, license, and any other configurable resources in the
accounting database.
-- Add configurable billing weight that takes into consideration any TRES when
calculating a job's resource utilization.
-- Add configurable prioritization factor that takes into consideration any
TRES when calculating a job's resource utilization.
-- Add burst buffer support infrastructure. Currently available plugins
include burst_buffer/generic (uses administrator-supplied programs to manage
file staging) and burst_buffer/cray (uses Cray APIs to manage buffers).
-- Add power capping support for Cray systems with automatic rebalancing of
power allocation between nodes.
-- Modify slurmctld outgoing RPC logic to support more parallel tasks (up to
85 RPCs and 256 pthreads; the old logic supported up to 21 RPCs and 256
threads).
-- Add support for job dependencies joined with OR operator (e.g.
"--depend=afterok:123?afternotok:124").
-- Add advance reservation flag of "replace" that causes allocated resources
to be replaced with idle resources. This keeps the pool of available
resources at a constant size (to the extent possible).
-- Permit PreemptType=qos and PreemptMode=suspend,gang to be used together.
A high-priority QOS job will now oversubscribe resources and gang schedule,
but only if there are insufficient resources for the job to be started
without preemption. NOTE: With PreemptType=qos, the partition's
Shared=FORCE:# configuration option will permit one more job per resource
to be run than specified, but only if started by preemption.
-- A partition can now have an associated QOS. This will allow a partition
to have all the limits a QOS has. If a limit is set in both QOSes, the
partition QOS will override the job's QOS unless the job's QOS has the
'OverPartQOS' flag set.
-- Expanded --cpu-freq parameters to include min-max:governor specifications.
--cpu-freq now supported on salloc and sbatch.
-- Add support for optimized job allocations with respect to SGI Hypercube
topology.
NOTE: Only supported with select/linear plugin.
NOTE: The program contribs/sgi/netloc_to_topology can be used to build
Slurm's topology.conf file.
-- Add the ability for a compute node to be allocated to multiple jobs, but
restricted to a single user. Added "--exclusive=user" option to salloc,
sbatch and srun commands. Added "owner" field to node record, visible using
the scontrol and sview commands. Added new partition configuration parameter
"ExclusiveUser=yes|no".
-- Verify that all plugin version numbers are identical to the component
attempting to load them. Without this verification, the plugin can reference
Slurm functions in the caller which differ (e.g. the underlying function's
arguments could have changed between Slurm versions).
NOTE: All plugins (except SPANK) must be built against the identical
version of Slurm in order to be used by any Slurm command or daemon. This
should eliminate some very difficult to diagnose problems due to use of old
plugins.
-- Optimize resource allocation for systems with dragonfly networks.
-- Added plugin to record job completion information using Elasticsearch.
Libcurl is required for the build. Configure slurm.conf as follows:
JobCompType=jobcomp/elasticsearch
JobCompLoc=http://YOUR_ELASTICSEARCH_SERVER:9200
-- DATABASE SCHEMA HAS CHANGED. WHEN UPDATING, THE MIGRATION PROCESS MAY TAKE
SOME AMOUNT OF TIME DEPENDING ON HOW LARGE YOUR DATABASE IS. WHILE UPDATING,
NO RECORDS WILL BE LOST, BUT THE SLURMDBD MAY NOT BE RESPONSIVE DURING THE
UPDATE. IT WILL ALSO NOT BE POSSIBLE TO AUTOMATICALLY REVERT THE DATABASE
TO THE FORMAT FOR AN EARLIER VERSION OF SLURM. PLAN ACCORDINGLY.
-- The performance of Profiling with HDF5 is improved. In addition, internal
structures are changed to make it easier to add new profile types,
particularly energy sensors. This has introduced an operational issue. See
OTHER CHANGES in the RELEASE_NOTES file.
-- MPI/MVAPICH plugin now requires Munge for authentication.
-- In order to support inter-cluster job dependencies, the MaxJobID
configuration parameter default value has been reduced from 4,294,901,760
to 2,147,418,112 and its maximum value is now 2,147,463,647.
ANY JOBS WITH A JOB ID ABOVE 2,147,463,647 WILL BE PURGED WHEN SLURM IS
UPGRADED FROM AN OLDER VERSION!
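For sites that want to gauge their exposure to the MaxJobID change before
upgrading, the check is simple arithmetic against the limits quoted above.
The following is only a sketch in Python: the function name jobs_at_risk and
the sample IDs are made up for illustration, and it assumes you have already
collected your job IDs as integers by whatever means suits your site.

```python
# Limits as quoted in this announcement; everything else is illustrative.
OLD_DEFAULT_MAX_JOB_ID = 4_294_901_760
NEW_DEFAULT_MAX_JOB_ID = 2_147_418_112
NEW_MAX_JOB_ID_LIMIT = 2_147_463_647


def jobs_at_risk(job_ids, limit=NEW_MAX_JOB_ID_LIMIT):
    """Return the job IDs strictly above `limit`, in ascending order.

    `job_ids` is any iterable of integers (hypothetical input; gathering
    the IDs from your own system is outside the scope of this sketch).
    """
    return sorted(j for j in job_ids if j > limit)


if __name__ == "__main__":
    # Made-up sample IDs straddling the new maximum.
    sample = [1001, 2_147_463_647, 2_147_463_648, 4_000_000_000]
    for job_id in jobs_at_risk(sample):
        print(f"job {job_id} is above {NEW_MAX_JOB_ID_LIMIT}")
```

Any ID the function returns is above the stated maximum and, per the note
above, would be purged on upgrade.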
We have also released one of the last tags of 14.11 in the form of 14.11.9.
Changes are listed here:
-- Correct "sdiag" backfill cycle time calculation if it yields locks. A
microsecond value was being treated as a second value resulting in an
overflow in the calculation.
-- Fix segfault when updating timelimit on a job array task.
-- Fix to job array update logic that can result in a task ID of 4294967294.
-- Fix job array update; previous logic could fail to update some tasks
of a job array for some fields.
-- CRAY - Fix seg fault if a blade is replaced and slurmctld is restarted.
-- Fix plane distribution to allocate in blocks rather than cyclically.
-- squeue - Remove newline from job array ID value printed.
-- squeue - Enable filtering for job state SPECIAL_EXIT.
-- Prevent job array task ID being inappropriately set to NO_VAL.
-- MYSQL - Make it so you don't have to restart the slurmctld
to gain the correct limit when a parent account is root and you
remove a subaccount's limit which exists on the parent account.
-- MYSQL - Close chance of setting the wrong limit on an association
when removing a limit from an association on multiple clusters
at the same time.
-- MYSQL - Fix minor memory leak when modifying an association but no
change was made.
-- srun command line of either --mem or --mem-per-cpu will override both the
SLURM_MEM_PER_CPU and SLURM_MEM_PER_NODE environment variables.
-- Prevent slurmctld abort on update of advanced reservation that contains no
nodes.
-- ALPS - Revert commit 2c95e2d22 which also removes commit 2e2de6a4 allowing
Cray with the SubAllocate option to work as it did with 2.5.
-- Properly parse CPU frequency data on POWER systems.
-- Correct sacct.1 man pages describing -i option.
-- Capture salloc/srun information in sdiag statistics.
-- Fix bug in node selection with topology optimization.
-- Don't set distribution when srun requests 0 memory.
-- Read in correct number of nodes from SLURM_HOSTFILE when specifying nodes
and --distribution=arbitrary.
-- Fix segfault in Bluegene setups where RebootQOSList is defined in
bluegene.conf and accounting is not setup.
-- MYSQL - Update mod_time when updating a start job record or adding one.
-- MYSQL - Fix issue where, if an association ID changes while at least a
portion of a job array is still pending after its initial start in the
database, another row could be created for the remainder of the array
instead of using the already existing row.
-- Fix scheduling anomaly with job arrays submitted to multiple partitions;
jobs could be started out of priority order.
-- If a host has suspended jobs do not reboot it. Reboot only hosts
with no jobs in any state.
-- ALPS - Fix issue when using --exclusive flag on srun to do the correct
thing (-F exclusive) instead of -F share.
-- Fix various memory leaks in the Perl API.
-- Fix a bug in the controller which displayed jobs in CF state as RUNNING.
-- Preserve advanced _core_ reservation when nodes added/removed/resized on
slurmctld restart. Rebuild core_bitmap as needed.
-- Fix for non-standard Munge port location for srun/pmi use.
-- Fix gang scheduling/preemption issue that could cancel job at startup.
-- Fix a bug in squeue which prevented squeue -tPD from printing array jobs.
-- Sort job arrays in job queue according to array_task_id when priorities
are equal.
-- Fix segfault in sreport when there was no response from the dbd.
-- ALPS - Fix compile to not link against -ljob and -lexpat with every lib
or binary.
-- Fix testing for CR_Memory when CR_Memory and CR_ONE_TASK_PER_CORE are used
with select/linear.
-- MySQL - Fix minor memory leak if a connection ever goes away whilst
using it.
-- ALPS - Make it so srun --hint=nomultithread works correctly.
-- Prevent job array task ID from being reported as NO_VAL if last task in
the array gets requeued.
-- Fix some potential deadlock issues when state files don't exist in the
association manager.
-- Correct RebootProgram logic when executed outside of a maintenance
reservation.
-- Requeue job if possible when slurmstepd aborts.
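As an illustration of the job array ordering change listed above (sort by
array_task_id when priorities are equal), here is a small Python sketch. It
is not Slurm's implementation, which lives in slurmctld's C code. The
PendingTask record and the queue_order function are hypothetical names, and
the sketch assumes that higher-priority entries come first, with
array_task_id used only as the tie-breaker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PendingTask:
    """Hypothetical stand-in for a pending job array task."""
    job_id: int
    array_task_id: int
    priority: int


def queue_order(tasks):
    """Order tasks by descending priority, then ascending array_task_id."""
    return sorted(tasks, key=lambda t: (-t.priority, t.array_task_id))


if __name__ == "__main__":
    # Made-up tasks: three share a priority, one outranks them.
    tasks = [
        PendingTask(job_id=500, array_task_id=2, priority=100),
        PendingTask(job_id=500, array_task_id=0, priority=100),
        PendingTask(job_id=501, array_task_id=0, priority=200),
        PendingTask(job_id=500, array_task_id=1, priority=100),
    ]
    for t in queue_order(tasks):
        print(f"{t.job_id}_{t.array_task_id} priority={t.priority}")
```

With equal priorities, the tasks of job 500 come out in task ID order
regardless of the order in which they were queued.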
Both versions can be downloaded from the normal spot
http://schedmd.com/#repos.
--
Danny Auble
President, SchedMD LLC
Commercial Slurm Development and Support
===============================================================
Slurm User Group Meeting, 15-16 September 2015, Washington D.C.
http://slurm.schedmd.com/slurm_ug_agenda.html