SLURM versions 2.3.2 and 2.4.0-pre2 are now available from
http://www.schedmd.com/#repos

Version 2.3.2 includes bugs found and fixed in the past six weeks as shown below. Version 2.4.0-pre2 includes new development for version 2.4 with a release scheduled in the second quarter of 2012.

* Changes in SLURM 2.3.2
========================
 -- Add configure option of "--without-rpath" which builds SLURM tools without
    the rpath option, which will work if Munge and BlueGene libraries are in
    the default library search path and make system updates easier.
 -- Fixed issue where if a job ended with ESLURMD_UID_NOT_FOUND and
    ESLURMD_GID_NOT_FOUND where slurm would be a little over zealous
    in treating missing a GID or UID as a fatal error.
 -- Backfill scheduling - Add SchedulerParameters configuration parameter of
    "bf_res" to control the resolution in the backfill scheduler's data about
when jobs begin and end. Default value is 60 seconds (used to be 1 second).
 -- Cray - Remove the "family" specification from the GPU reservation request.
 -- Updated set_oomadj.c, replacing deprecated oom_adj reference with
    oom_score_adj
 -- Fix resource allocation bug, generic resources allocation was ignoring the
    job's ntasks_per_node and cpus_per_task parameters. Patch from Carles
    Fenoy, BSC.
 -- Avoid orphan job step if slurmctld is down when a job step completes.
 -- Fix Lua link order, patch from Pär Andersson, NSC.
 -- Set SLURM_CPUS_PER_TASK=1 when user specifies --cpus-per-task=1.
 -- Fix for fatal error managing GRES. Patch by Carles Fenoy, BSC.
 -- Fixed race condition when using the DBD in accounting where if a job
    wasn't started at the time the eligible message was sent but started
before the db_index was returned information like start time would be lost.
 -- Fix issue in accounting where normalized shares could be updated
    incorrectly when getting fairshare from the parent.
 -- Fixed if not enforcing associations  but want QOS support for a default
    qos on the cluster to fill that in correctly.
 -- Fix in select/cons_res for "fatal: cons_res: sync loop not progressing"
    with some configurations and job option combinations.

* Changes in SLURM 2.4.0.pre2
=============================
 -- CRAY - Add support for GPU memory allocation using SLURM GRES (Generic
    RESource) support. Work by Steve Trofinoff, CSCS.
 -- Add support for job allocations with multiple job constraint counts. For
    example: salloc -C "[rack1*2&rack2*4]" ... will allocate the job 2 nodes
    from rack1 and 4 nodes from rack2. Support for only a single constraint
    name been added to job step support.
 -- BGQ - Remove old method for marking cnodes down.
 -- BGQ - Remove BGP images from view in sview.
 -- BGQ - print out failed cnodes in scontrol show nodes.
 -- BGQ - Add srun option of "--runjob-opts" to pass options to the runjob
    command.
 -- FRONTEND - handle step launch failure better.
 -- BGQ - Added a mutex to protect the now changing ba_system pointers.
 -- BGQ - added new functionality for sub-block allocations - no preemption
    for this yet though.
 -- Add --name option to squeue to filter output by job name. Patch from Yuri
    D'Elia.
-- BGQ - Added linking to runjob client libary which gives support to totalview
    to use srun instead of runjob.
 -- Add numeric range checks to scontrol update options. Patch from Phil
    Eckert, LLNL.
 -- Add ReconfigFlags configuration option to control actions of "scontrol
    reconfig". Patch from Don Albert, Bull.
 -- BGQ - handle reboots with multiple jobs running on a block.


Reply via email to