On Friday, 10 July 2020 3:34:44 PM PDT Janna Ore Nugent wrote:

> I’ve got an intermittent situation with gpu nodes that sinfo says are
> available and idle, but squeue reports as “ReqNodeNotAvail”. We’ve cycled
> the nodes to restart services but it hasn’t helped. Any suggestions for
> resolving this or digging into it more deeply?

What does "scontrol show job $JOB" say for an affected job, and what does
"scontrol show node $NODE" look like for one of these nodes?

All the best,
Chris
-- 
Chris Samuel : http://www.csamuel.org/ : Berkeley, CA, USA