We are pleased to announce the availability of Slurm version 2.6.7, plus version 14.03.0-rc1 (release candidate 1). We plan to release version 14.03.0 by the end of the month. See the "RELEASE_NOTES" file in the distribution for a description of the major changes in version 14.03.

This will most likely be the last 2.6 release. 14.03 code has been frozen for development and will only get bug fixes from here on out. Thanks to all those that have contributed to the effort!

The Slurm distributions are available from:
http://www.schedmd.com/#repos

Bug fixes and enhancements in these 2 versions are listed below...

* Changes in Slurm 2.6.7
========================
 -- Properly enforce a job's cpus-per-task option when a job's allocation is
    constrained on some nodes by the mem-per-cpu option.
 -- Correct the slurm.conf man pages and checkpoint_blcr.html page
    describing that jobs must be drained from cluster before deploying
    any checkpoint plugin. Corrected in version 14.03.
 -- Fix issue where if using munge and munge wasn't running and a slurmd
    needed to forward a message, the slurmd would core dump.
 -- Update srun.1 man page documenting the PMI2 support.
 -- Fix slurmctld core dump when a job gets its QOS updated but there
    is not a corresponding association.
 -- If a job requires specific nodes and can not run due to those nodes being
    busy, the main scheduling loop will block those specific nodes rather than
    the entire queue/partition.
 -- Fix minor memory leak when updating a job's name.
 -- Fix minor memory leak when updating a reservation on a partition using "ALL"
    nodes.
 -- Fix minor memory leak when adding a reservation with a nodelist and core
    count.
 -- Update sacct man page description of job states.
 -- BGQ - Fix minor memory leak when selecting blocks that can't immediately be
    placed.
 -- Fixed minor memory leak in backfill scheduler.
 -- MYSQL - Fixed memory leak when querying clusters.
 -- MYSQL - Fix when updating QOS on an association.
 -- NRT - Fix to supply correct error messages to poe/pmd when a launch fails.
 -- Add SLURM_STEP_ID to Prolog environment.
 -- Add support for SchedulerParameters value of bf_max_job_start that limits
    the total number of jobs that can be started in a single iteration of the
    backfill scheduler.
 -- Don't print negative number when dealing with large memory sizes with
    sacct.
 -- Fix sinfo output so that hosts in states allocated and mixed will not be
    merged together.
 -- GRES: Avoid crash if GRES configuration is inconsistent.
 -- Make S_SLURM_RESTART_COUNT item available to SPANK.
 -- Munge plugins - Add sleep between retries if can't connect to socket.
 -- Fix the database query to return all pending jobs in a given time interval.
 -- switch/nrt - Correct logic to get dynamic window count.
 -- Remove need to use job->ctx_params in the launch plugin, just to simplify
    code.
 -- NRT - Fix possible memory leak if using multiple adapters.
 -- NRT - Fix issue where there are more than NRT_MAXADAPTERS on a system.
 -- NRT - Increase maximum number of adapters from 8 to 9.
 -- NRT - Initialize missing variables when the PMD is starting a job.
 -- NRT - Fix issue where we are launching hosts out of numerical order,
    this would cause pmd's to hang.
 -- NRT - Change xmalloc's to malloc just to be safe.
 -- NRT - Sanity check to make sure a jobinfo is there before packing.
 -- Add missing options to the print of TaskPluginParam.
 -- Fix a couple of issues with scontrol reconfig and adding nodes to
    slurm.conf.  Rebooting daemons after adding nodes to the slurm.conf
    is highly recommended.

* Changes in Slurm 14.03.0rc1
=============================
 -- Fixed typos in srun_cr man page.
 -- Run job scheduling logic immediately when nodes enter service.
 -- Added sbatch '--parsable' option to output only the job id number and the
    cluster name separated by a semicolon. Errors will still be displayed.
 -- Added failure management "slurmctld/nonstop" plugin.
 -- Prevent jobs being killed when a checkpoint plugin is enabled or disabled.
 -- Update the documentation about SLURM_PMI_KVS_NO_DUP_KEYS environment
    variable.
 -- select/cons_res bug fix for range of node counts with --cpus-per-task
    option (e.g. "srun -N2-3 -c2 hostname" would allocate 2 CPUs on the first
    node and 0 CPUs on the second node).
 -- Change reservation flags field from 16 to 32-bits.
 -- Add reservation flag value of "FIRST_CORES".
 -- Added the idea of Resources to the database.  Framework for handling
    license servers outside of Slurm.
 -- When starting the slurmctld only send past job/node state information to
    accounting if running for the first time (should speed up startup
    dramatically on systems with lots of nodes or lots of jobs).
 -- Compile and run on FreeBSD 8.4.
 -- Make job array expressions more flexible to accept multiple step counts in
    the expression (e.g. "--array=1-10:2,50-60:5,123").
 -- switch/cray - add state save/restore logic tracking allocated ports.
 -- SchedulerParameters - Replace max_job_bf with bf_max_job_start (both will
    work for now).
 -- Add SchedulerParameters options of preempt_reorder_count and
    preempt_strict_order.
 -- Make memory types in acct_gather uint64_t to handle systems with more than
    4TB of memory on them.
 -- BGQ - Support the --export=NONE option for srun so that only the
    SLURM_JOB_ID and SLURM_STEP_ID env vars are set.
 -- Munge plugins - Add sleep between retries if can't connect to socket.
 -- Added DebugFlags value of "License".
 -- Added --enable-developer which will give you -Werror when compiling.
 -- Fix for job request with GRES count of zero.
 -- Fix a potential memory leak in hostlist.
 -- Job array dependency logic: Cache results for major performance improvement.
 -- Add a new environment variable PMI2_CONNECT_TO_SERVER. If set in the MPI
    application environment the PMI2 library will connect to the PMI2 server
    (slurmstepd) instead of using the provided PMI_FD socket.
 -- Modify squeue to support filter on job states Special_Exit and Resizing.
 -- Defer purging job record until after EpilogSlurmctld completes.
 -- Add -j option for jobid to sbcast.
 -- Fix handling RPCs from a 14.03 slurmctld to a 2.6 slurmd.
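
Two of the 14.03.0rc1 items above describe small text formats: the sbatch
'--parsable' output (job id and cluster name separated by a semicolon) and the
extended job array expression (e.g. "--array=1-10:2,50-60:5,123"). The Python
sketch below illustrates how a wrapper script might read them. It is not taken
from the Slurm sources. The function names parse_parsable and expand_array are
hypothetical, and it assumes that the cluster name may be absent and that range
upper bounds are inclusive.

```python
def parse_parsable(line):
    """Split one line of `sbatch --parsable` output.

    Assumption: the line looks like "<jobid>;<cluster>", and the
    ";<cluster>" part may be missing. Returns (job_id, cluster_or_None).
    """
    job_id, _, cluster = line.strip().partition(";")
    return int(job_id), (cluster or None)


def expand_array(expr):
    """Expand a job array expression into a list of task indices.

    Each comma-separated part is "N", "N-M" or "N-M:STEP".
    Assumption: the upper bound M is inclusive.
    """
    indices = []
    for part in expr.split(","):
        span, _, step = part.partition(":")
        first, _, last = span.partition("-")
        lo = int(first)
        hi = int(last) if last else lo
        indices.extend(range(lo, hi + 1, int(step) if step else 1))
    return indices
```

For example, expand_array("1-10:2,50-60:5,123") walks 1 to 10 in steps of 2,
then 50 to 60 in steps of 5, then adds the single index 123.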
