Support for heterogeneous jobs will be available in Slurm version 17.11,
to be released in November 2017. While providing similar functionality
to the work described below, it will utilize a different code base
developed by SchedMD in conjunction with Jülich Supercomputing Centre.
Jacob
--
Jacob Jenson
COO, SchedMD LLC
Commercial Slurm Development and Support
On 7/13/2017 10:51 AM, Perry, Martin wrote:
This email is to announce the latest version of the job packs feature
(heterogeneous resources and MPI-MPMD tight integration support) as
open-source code.
This feature has been developed by Bull/Atos R&D over 2 years in a
close technical relationship with major customers.
You can find some SLUG presentations below:
https://slurm.schedmd.com/SLUG16/Job_Packs_SUG_2016.pdf
https://slurm.schedmd.com/SLUG15/Heterogeneous_Resources_and_MPMD.pdf
The initial plan was to integrate this feature within the official
version of Slurm 17.02. Unfortunately this will not happen due to a
difference in architectural approaches. In future versions of Slurm,
our MPMD API may not be preserved.
In this context, we have decided to provide the job packs feature for
17.02 on the Bull/Atos GitHub "as is" under the GPL license. We have
had several requests to do so, and for the sake of transparency we feel
this can help community users experiment with the feature and provide
feedback on the functionality itself.
The code can be cloned from this branch:
https://github.com/RJMS-Bull/slurm/tree/dev-job-pack-17.02
and the documentation can be found here:
https://github.com/RJMS-Bull/slurm/blob/dev-job-pack-17.02/doc/html/job_packs.shtml
For any feedback, you may contact [email protected] and
[email protected].
Here is a selection of some of the most important changes provided
within the new functionality in this branch:
-introduce new pack dependencies, including pack-leader and pack-members
-update srun/salloc/sbatch specification to support different
resource demands separated by a semicolon
-update resource selection algorithms to support job packs functionality
-introduce --pack-group parameter within srun
-update PMI and PMI2 libraries to support MPI-MPMD and the possibility
to have different executables communicating in the same MPI_COMM_WORLD
-introduce new environment variables to reflect job packs
-update sacct, squeue, sinfo to support job packs
-update scancel to support the termination of a pack-member without
terminating the whole job pack.
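
To give a rough feel for the semicolon-separated resource demands
mentioned in the list above, here is a minimal Python sketch that splits
a flat argument list into one option group per pack member. It is an
illustration only, not the parser used in the branch: the function name
split_pack_requests and the sample request are hypothetical, and the
exact syntax accepted by srun/salloc/sbatch is the one described in
job_packs.shtml, which may differ.

```python
from typing import List


def split_pack_requests(argv: List[str], sep: str = ";") -> List[List[str]]:
    """Split a flat argument list into one option group per pack member.

    Hypothetical helper for illustration; not taken from the Slurm sources.
    """
    groups: List[List[str]] = [[]]
    for token in argv:
        if token == sep:
            # A separator token starts the option group of the next member.
            groups.append([])
        else:
            groups[-1].append(token)
    # Drop empty groups caused by leading, trailing, or doubled separators.
    return [group for group in groups if group]


# Assumed sample request: two pack members with different resource demands.
request = ["-N", "1", "--mem=4G", ";", "-N", "2", "--mem=16G"]
members = split_pack_requests(request)
for index, options in enumerate(members):
    print(index, options)
```

Each resulting group can then be treated as an independent resource
request, which is the general idea behind describing heterogeneous
resources in a single submission.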