Hi Daniel,

On Wed, Feb 3, 2016 at 3:33 AM, Daniel Letai <[email protected]> wrote:
> The question is - does slurm also use the dev files to track the
> availability of the cards?
>
> I do not wish to drain any nodes with failing cards - just let slurm know
> about this dynamically so jobs requesting gpus are properly scheduled, while
> other jobs can use the "bad" nodes.

I don't have an answer to your question, but running "scontrol -dd
show node <nodename> | grep -i gres" reports a GresDrain property:
  Gres=gpu:8
  GresDrain=N/A
  GresUsed=gpu:4

I don't know how to set it, though. If there is a way to drain a
specific GRES, that could do what you want.
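
If you want to script the check, here is a minimal Python sketch that
reads those fields. It only reports what scontrol prints and does not set
GresDrain. The function names (parse_gres, gpu_count, node_gres) are my
own, and the output format is assumed to match the three lines shown
above, so treat it as a starting point rather than a tested tool.

```python
import re
import subprocess

# Matches Gres, GresDrain and GresUsed as printed above (format assumed).
GRES_FIELD = re.compile(r"\b(Gres\w*)=(\S+)")
# Matches a count in values such as "gpu:8" or "gpu:tesla:4" (assumed forms).
GPU_COUNT = re.compile(r"gpu(?::\w+)?:(\d+)")


def parse_gres(text):
    """Return the Gres* fields found in 'scontrol -dd show node' output."""
    return {m.group(1): m.group(2) for m in GRES_FIELD.finditer(text)}


def gpu_count(value):
    """Extract the GPU count from a value like 'gpu:8'; 0 if none is found."""
    m = GPU_COUNT.search(value)
    return int(m.group(1)) if m else 0


def node_gres(nodename):
    """Run scontrol for one node (needs a working Slurm install) and parse it."""
    out = subprocess.run(
        ["scontrol", "-dd", "show", "node", nodename],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gres(out)


if __name__ == "__main__":
    # Sample text copied from the output above, so no cluster is needed.
    sample = "  Gres=gpu:8\n  GresDrain=N/A\n  GresUsed=gpu:4\n"
    fields = parse_gres(sample)
    free = gpu_count(fields["Gres"]) - gpu_count(fields["GresUsed"])
    print(fields, free)
```

On a real cluster you would call node_gres("<nodename>") instead of
parsing the sample text.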

Cheers,
-- 
Kilian
