Hello! It's a great pleasure to announce that I've just merged a large pull request implementing support for:
* requesting GPU accelerators on instances running on Google Cloud (on AWS no special support is necessary -- just choose a GPU-enabled instance type); a configuration sketch follows after this list
* installing CUDA (8.0, 9.0, or 9.1) and the NVIDIA drivers on the GPU-enabled instances
* configuring SLURM to allocate the GPUs as "generic resources", so you can e.g. request that a job uses 2 GPUs with `sbatch --gres=gpu:2 ...` (see the sample job script below)

In addition, the SLURM role now allows configuring many more parameters in `slurm.conf` through setup variables, including full support for the "cgroup" plugins.

You can see example configuration files for the new GPU-enabled clusters here:

* https://github.com/gc3-uzh-ch/elasticluster/blob/master/examples/slurm-with-gpu-on-google.conf
* https://github.com/gc3-uzh-ch/elasticluster/blob/master/examples/slurm-with-gpu-on-aws.conf

Documentation for the new `cuda` role and the new SLURM variables can be found at:

* http://elasticluster.readthedocs.io/en/latest/playbooks.html#cuda
* http://elasticluster.readthedocs.io/en/latest/playbooks.html#id3
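To give a flavour of how this looks, here is an abridged cluster definition. Note this is only a sketch: the `accelerator_count`/`accelerator_type` option names and the `cuda` Ansible group are my reading of the docs linked above, and several required options are omitted -- the example files are the authoritative reference.

    # setup: standard SLURM playbook, plus the new `cuda` role on compute nodes
    [setup/slurm+gpu]
    provider=ansible
    frontend_groups=slurm_master
    compute_groups=slurm_worker,cuda

    # per-node-class options: request one NVIDIA K80 per compute node
    # (option names assumed -- check the example .conf files)
    [cluster/gpu-test/compute]
    flavor=n1-standard-4
    accelerator_count=1
    accelerator_type=nvidia-tesla-k80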
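Once such a cluster is up, requesting GPUs from SLURM uses the standard `--gres` syntax; for instance, save the following as `gpu-test.sh` and submit it with `sbatch gpu-test.sh`:

    #!/bin/bash
    #SBATCH --job-name=gpu-test
    #SBATCH --gres=gpu:2       # ask SLURM for 2 GPUs on one node
    #SBATCH --time=00:05:00

    # `nvidia-smi` lists the GPUs that are visible to the job
    nvidia-smi

You can verify that the GPUs were registered as generic resources with `sinfo -o '%N %G'`, or by looking for the `Gres=gpu:...` line in the output of `scontrol show node`.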
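For the curious: SLURM's generic-resource support ultimately boils down to a few configuration lines, which the playbooks now generate for you. Roughly (this is standard SLURM syntax, not necessarily the playbooks' literal output):

    # slurm.conf: declare the GRES type and advertise 2 GPUs per compute node
    GresTypes=gpu
    NodeName=compute[001-002] Gres=gpu:2 State=UNKNOWN

    # gres.conf (on each compute node): map the GRES to the device files
    Name=gpu File=/dev/nvidia[0-1]

The "cgroup" support mentioned above similarly maps to standard `slurm.conf` entries such as `ProctrackType=proctrack/cgroup` and `TaskPlugin=task/cgroup`, plus a `cgroup.conf` file.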
Very special thanks go to Hatef Monajemi for pushing me through this, for the initial implementation of the CUDA role, and for relentlessly testing over the course of two months. I would also like to thank @benpass for providing the initial implementation of the GPU-requesting code in `gce.py`; David Koes for finding many issues and always suggesting solutions; and finally Karan Bhatia and Keith Binder for granting me the GPU quota on GCE to run the final checks.

Now enjoy your new GPU-enabled clusters ;-)

Ciao,
R

--
Riccardo Murri / Email: [email protected] / Tel.: +41 77 458 98 32