Slurm version 2.5.2 is now available with the bug fixes described
below. We have also made available a pre-release of version 2.6 (still
under development). Notable features in v2.6 include support for job  
arrays and accounting for a job's energy consumption using IPMI. The  
job array documentation is available here:
http://www.schedmd.com/slurmdocs/job_array.html

The latest versions of Slurm are available from:
http://www.schedmd.com/#repos


* Changes in SLURM 2.5.2
========================
  -- Fix advanced reservation recovery logic when upgrading from version 2.4.
  -- BLUEGENE - fix for QOS/Association node limits.
  -- Add missing "safe" flag from print of AccountStorageEnforce option.
  -- Fix logic to optimize GRES topology with respect to allocated CPUs.
  -- Add job_submit/all_partitions plugin to set a job's default partition
     to ALL available partitions in the cluster.
  -- Modify switch/nrt logic to permit build without libnrt.so library.
  -- Handle srun task launch failure without duplicate error messages or abort.
  -- Fix bug in QoS limits enforcement when slurmctld restarts and user not yet
     added to the QOS list.
  -- Fix issue where sjstat and sjobexitmod were installed in 2 different RPMs.
  -- Fix for job request of multiple partitions in which some partitions lack
     nodes with required features.
  -- Permit a job to use a QOS they do not have access to if an administrator
     manually set the job's QOS (previously the job would be rejected).
  -- Make more variables available to job_submit/lua plugin: slurm.MEM_PER_CPU,
     slurm.NO_VAL, etc.
  -- Fix topology/tree logic when nodes defined in slurm.conf get re-ordered.
  -- In select/cons_res, correct logic to allocate whole sockets to jobs. Work
     by Magnus Jonsson, Umea University.
  -- In select/cons_res, correct logic when job removed from only some nodes.
  -- Avoid an apparent bug in kernel 2.6.32, which appears to be fixed as of
     at least 3.5.0.  This avoids a stack overflow when running jobs on
     more than 120k nodes.
  -- BLUEGENE - If we made a block that isn't runnable because of an overlapping
     block, destroy it correctly.
  -- Switch/nrt - Dynamically load libnrt.so from within the plugin as needed.
     This eliminates the need for libnrt.so on the head node.
  -- BLUEGENE - Fix in reservation logic that could cause abort.

* Changes in SLURM 2.6.0-pre1
=============================
  -- Add "state" field to job step information reported by scontrol.
  -- Notify srun to retry step creation upon completion of other job steps
     rather than polling. This results in much faster throughput for job step
     execution with --exclusive option.
  -- Added "ResvEpilog" and "ResvProlog" configuration parameters to execute a
     program at the beginning and end of each reservation.
  -- Added "slurm_load_job_user" function. This is a variation of
     "slurm_load_jobs", but accepts a user ID argument, potentially resulting
     in substantial performance improvement for "squeue --user=ID".
  -- Added "slurm_load_node_single" function. This is a variation of
     "slurm_load_nodes", but accepts a node name argument, potentially
     resulting in substantial performance improvement for
     "sinfo --nodes=NAME".
  -- Added "HealthCheckNodeState" configuration parameter to identify node states
     on which HealthCheckProgram should be executed.
  -- Remove sacct --dump --formatted-dump options which were deprecated in
     2.5.
  -- Added support for job arrays (phase 1 of effort). See "man sbatch" option
     -a/--array for details.
  -- Add new AccountStorageEnforce options of 'nojobs' and 'nosteps' which will
     allow the use of accounting features like associations, qos and limits but
     not keep track of jobs or steps in accounting.
  -- Cray - Add new cray.conf parameter of "AlpsEngine" to specify the
     communication protocol to be used for ALPS/BASIL.
  -- select/cons_res plugin: Correction to CPU allocation count logic for
     cores without hyperthreading.
  -- Added new SelectTypeParameter value of "CR_ALLOCATE_FULL_SOCKET".
  -- Added PriorityFlags value of "TICKET_BASED" and merged
     priority/multifactor2 plugin into priority/multifactor plugin.
  -- Add "KeepAliveTime" configuration parameter controlling how long sockets
     used for srun/slurmstepd communications are kept alive after disconnect.
  -- Added SLURM_SUBMIT_HOST to salloc, sbatch and srun job environment.
  -- Added SLURM_ARRAY_TASK_ID to environment of job array.
  -- Added squeue --array/-r option to optimize output for job arrays.
  -- Added "SlurmctldPlugstack" configuration parameter for generic stack of
     slurmctld daemon plugins.
  -- Removed contribs/arrayrun tool. Use native support for job arrays.
  -- Modify default installation locations for RPMs to match "make install":
     _prefix /usr/local
     _slurm_sysconfdir %{_prefix}/etc/slurm
     _mandir %{_prefix}/share/man
     _infodir %{_prefix}/share/info
  -- Add acct_gather_energy/ipmi plugin, which uses freeipmi for energy
     gathering.
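
As an illustration of the new job array environment described above, the
following is a minimal Python sketch of how a script launched as a job array
task might use SLURM_ARRAY_TASK_ID to select its own piece of work. Only the
SLURM_ARRAY_TASK_ID and SLURM_SUBMIT_HOST variable names come from the notes
above; the pick_input helper, the input file names, and the modulo mapping are
hypothetical choices made for this example.

```python
import os


def pick_input(inputs, environ=os.environ):
    """Return the input assigned to this array task.

    Returns None when SLURM_ARRAY_TASK_ID is not set, i.e. when the
    script is not running as part of a job array.
    """
    task_id = environ.get("SLURM_ARRAY_TASK_ID")
    if task_id is None:
        return None
    # Hypothetical mapping: wrap around if there are more tasks than inputs.
    return inputs[int(task_id) % len(inputs)]


if __name__ == "__main__":
    inputs = ["a.dat", "b.dat", "c.dat"]  # hypothetical file names
    chosen = pick_input(inputs)
    # SLURM_SUBMIT_HOST is only set inside a job; fall back when absent.
    host = os.environ.get("SLURM_SUBMIT_HOST", "unknown")
    print(f"submit host: {host}, input: {chosen}")
```

Run outside of Slurm, the script simply reports that no input was selected.
See "man sbatch" option -a/--array for how to submit a job array.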
