[slurm-dev] SLURM versions 2.3.5 and 2.4.0-rc1 are now available

Danny Auble Wed, 16 May 2012 13:26:09 -0700

SLURM versions 2.3.5 and 2.4.0-rc1 are now available from
http://www.schedmd.com/#repos
A description of the changes is appended.


This will most likely be the last 2.3 release unless a 2.3.6 is really 
warranted.

Development for 2.4 has been halted and only bug fixes will be applied 
from now on.  Our plans are to release an rc2 in a couple of weeks and a 
2.4.0-1 a couple of weeks after that.  Please test 2.4 and report any 
bugs to us through http://bugs.schedmd.com or through the slurm-dev list.

Future development will take place in 2.5, which is planned for release
later this year (October).  We will release a 2.5.0-pre1 shortly.

* Changes in SLURM 2.3.5
========================
  -- Improve support for overlapping advanced reservations. Patch from
     Bill Brophy, Bull.
  -- Modify Makefiles for support of Debian hardening flags. Patch from
     Simon Ruderich.
  -- CRAY: Fix support for configuration with SlurmdTimeout=0 (never mark
     node that is DOWN in ALPS as DOWN in SLURM).
  -- Fixed the setting of SLURM_SUBMIT_DIR for jobs submitted by Moab
     (BZ#1467). Patch by Don Lipari, LLNL.
  -- Correction to init.d/slurmdbd exit code for status option. Patch by
     Bill Brophy, Bull.
  -- When the optional max_time is not specified for --switches=count, the
     site max (SchedulerParameters=max_switch_wait=seconds) is used for the
     job. Based on patch from Rod Schultz.
  -- Fix bug in select/cons_res plugin when used with topology/tree and a
     node range count in job allocation request.
  -- Fixed moab_2_slurmdb.pl script to correctly work for end records.
  -- Add support for new SchedulerParameters of max_depend_depth defining the
     maximum number of jobs to test for circular dependencies (i.e. job A
     waits for job B to start and job B waits for job A to start). Default
     value is 10 jobs.
  -- Fix potential race condition if MinJobAge is very low (i.e. 1) and using
     slurmdbd accounting and running large amounts of jobs (>50 sec).  Job
     information could be corrupted before it had a chance to reach the DBD.
  -- Fix state restore of job limit set from admin value for min_cpus.
  -- Fix clearing of limit values if an admin removes the limit for max cpus
     and time limit where it was previously set by an admin.
  -- Fix issue where log message is more than 256 chars and then has a
     format.
  -- Fix sched/wiki2 to support job account name, gres, partition name, wckey,
     or working directory that contains "#" (a job record separator). Also fix
     for wckey or working directory that contains a double quote '\"'.
  -- CRAY - fix for handling memory requests from user for an allocation.
  -- Add support for switches parameter to the job_submit/lua plugin. Work by
     Par Andersson, NSC.
  -- Fix to job preemption logic to preempt multiple jobs at the same time.
  -- Fix minor issue where uid and gid were switched in sview for submitting
     batch jobs.
  -- Fix possible illegal memory reference in slurmctld for job step with
     relative option. Work by Matthieu Hautreux (CEA).
  -- Reset priority of system held jobs when dependency is satisfied. Work by
     Don Lipari, LLNL.
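
The max_depend_depth entry above describes a bounded test for circular job
dependencies. The following is a minimal Python sketch of that idea only, not
the slurmctld implementation: the function name, the dictionary layout, and
the breadth-first traversal order are all assumptions made for illustration.

```python
from collections import deque


def has_circular_dependency(job_id, depends_on, max_depend_depth=10):
    """Illustrative only; not Slurm code.

    depends_on maps a job id to the job ids it waits for (hypothetical
    layout). At most max_depend_depth distinct jobs are tested, so a cycle
    longer than that limit goes undetected.
    """
    to_visit = deque(depends_on.get(job_id, ()))
    seen = set()
    tested = 0
    while to_visit and tested < max_depend_depth:
        current = to_visit.popleft()
        if current == job_id:
            return True  # the chain leads back to the starting job
        if current in seen:
            continue  # already tested this job
        seen.add(current)
        tested += 1
        to_visit.extend(depends_on.get(current, ()))
    return False


# Job A waits for job B and job B waits for job A.
print(has_circular_dependency("A", {"A": ["B"], "B": ["A"]}))
```

Bounding the number of jobs tested keeps the check cheap on long dependency
chains, at the cost of missing cycles that are longer than the limit.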

* Changes in SLURM 2.4.0.rc1
=============================
  -- Improve task binding logic by making fuller use of HWLOC library,
     especially with respect to Opteron 6000 series processors. Work
     contributed by Komoto Masahiro.
  -- Add new configuration parameter PriorityFlags, based upon work by
     Carles Fenoy (Barcelona Supercomputer Center).
  -- Modify the step completion RPC between slurmd and slurmstepd in order to
     eliminate a possible deadlock. Based on work by Matthieu Hautreux, CEA.
  -- Change the owner of slurmctld and slurmdbd log files to the appropriate
     user. Without this change the files will be created by and owned by the
     user starting the daemons (likely user root).
  -- Reorganize the slurmstepd logic in order to better support NFS and
     Kerberos credentials via the AUKS plugin. Work by Matthieu Hautreux, CEA.
  -- Fix bug in allocating GRES that are associated with specific CPUs. In some
     cases the code allocated first available GRES to job instead of allocating
     GRES accessible to the specific CPUs allocated to the job.
  -- spank: Add callbacks in slurmd: slurm_spank_slurmd_{init,exit}
     and job epilog/prolog: slurm_spank_job_{prolog,epilog}
  -- spank: Add spank_option_getopt() function to api
  -- Change resolution of switch wait time from minutes to seconds.
  -- Added CrpCPUMins to the output of sshare -l for those using hard limit
     accounting.  Work contributed by Mark Nelson.
  -- Added mpi/pmi2 plugin for complete support of pmi2 including acquiring
     additional resources for newly launched tasks. Contributed by Hongjia Cao,
     NUDT.
  -- BGQ - fixed issue where if a user asked for a specific node count and more
     tasks than possible without overcommit the request would be allowed on
     more nodes than requested.
  -- Add support for new SchedulerParameters of bf_max_job_user, maximum number
     of jobs to attempt backfilling per user. Work by Bjørn-Helge Mevik,
     University of Oslo.
  -- BLUEGENE - fixed issue where MaxNodes limit on a partition only limited
     larger than midplane jobs.
  -- Added cpu_run_min to the output of sshare --long.  Work contributed by
     Mark Nelson.
  -- BGQ - allow regular users to resolve Rack-Midplane to AXYZ coords.
  -- Add sinfo output format option of "%R" for partition name without "*"
     appended for default partition.
  -- Cray - Add support for zero compute node resource allocation to run batch
     script on front-end node with no ALPS reservation. Useful for pre- or
     post-processing.
  -- Support for cyclic distribution of cpus in task/cgroup plugin from Martin
     Perry, Bull.
  -- GrpMEM limit for QOSes and associations added. Patch from Bjørn-Helge
     Mevik, University of Oslo.
  -- Various performance improvements for up to 500% higher throughput
     depending upon configuration. Work supported by the Oak Ridge National
     Laboratory Extreme Scale Systems Center.
  -- Added jobacct_gather/cgroup plugin.  It is not advised to use this in
     production as it isn't currently complete and doesn't provide an
     equivalent substitution for jobacct_gather/linux yet. Work by Martin
     Perry, Bull.
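
The bf_max_job_user entry above caps how many jobs per user the backfill
scheduler attempts. As a rough Python sketch of that kind of per-user cap
(not the sched/backfill code): the function name and the (job_id, user)
tuple layout are hypothetical, and treating 0 as "no limit" is an assumption.

```python
from collections import Counter


def select_backfill_candidates(pending_jobs, bf_max_job_user):
    """Illustrative only; not Slurm code.

    pending_jobs is a list of (job_id, user) tuples, assumed to be already
    sorted by priority. Jobs beyond the per-user cap are skipped.
    A cap of 0 is treated as "no limit" (assumption).
    """
    per_user = Counter()
    candidates = []
    for job_id, user in pending_jobs:
        if bf_max_job_user and per_user[user] >= bf_max_job_user:
            continue  # this user already has enough jobs under consideration
        per_user[user] += 1
        candidates.append(job_id)
    return candidates


queue = [(1, "alice"), (2, "alice"), (3, "bob"), (4, "alice"), (5, "bob")]
print(select_backfill_candidates(queue, bf_max_job_user=2))
```

With a cap of 2, the third job from "alice" is skipped while both jobs from
"bob" remain candidates, so one user with a deep queue does not consume the
whole backfill pass.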
