A similar question has been asked before (not by me), without an answer:
https://groups.google.com/forum/?hl=en#!topic/slurm-devel/4xkvs0dgYu8
Specifically: suppose I have a gpu cluster with 2 gpus per node, where
some gpus might not function correctly (due to heat, firmware issues,
malfunction, ...), so some nodes might present only 1 gpu, and others no
gpu at all.
Using a gres.conf file with device nodes allows slurm to bind devices to
jobs.
The question is: does slurm also use these device files to track the
availability of the cards?
I do not wish to drain any nodes with failing cards - just let slurm
know about this dynamically so jobs requesting gpus are properly
scheduled, while other jobs can use the "bad" nodes.
My healthcheck agent on the nodes can add/remove device files for any
gpu based on its thresholds.
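As a rough sketch of what such an agent could report, assuming the
(hypothetical) convention that a working gpu N appears as a /dev/nvidiaN
device file - this helper just counts the files present, it is not part
of slurm:

```shell
# count_gpus: count gpu device files under a given directory
# (defaults to /dev). The nvidia[0-9]* naming is an assumption;
# adjust the pattern to whatever your healthcheck agent creates.
count_gpus() {
    devdir="${1:-/dev}"
    n=0
    for f in "$devdir"/nvidia[0-9]*; do
        # if the glob matched nothing, the literal pattern fails -e
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}
```

The agent could compare this count against the expected 2 and create or
remove device files accordingly.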
Based on the above, I would expect the following 4 configuration
considerations:
1. gres.conf statically holds the "optimal" gpu deployment (assume all
is well)
2. slurm.conf GresTypes=gpu
3. slurm.conf NodeName Gres=gpu:2 <-- This will presumably drain any
node with fewer than 2 gpus?
4. FastSchedule=0 <-- Together with NO Gres= in the NodeName line, to
ensure nodes do not drain needlessly.
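To make items 1-3 concrete, a minimal config sketch (device paths and
node names below are assumptions for illustration, not taken from any
real cluster):

```
# gres.conf - static "optimal" layout, assuming 2 gpus per node
# exposed as /dev/nvidia0 and /dev/nvidia1 (hypothetical paths)
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1

# slurm.conf fragment (node names hypothetical)
GresTypes=gpu
NodeName=node[01-16] Gres=gpu:2
FastSchedule=0
```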
Is that correct?
Are there better solutions to dynamically track availability of resources?
Currently with LSF we are using a custom elim script to let lsf know
about the availability of the resources.