On 06-12-16 10:49, David van Leeuwen wrote:
"gres/gpu count too low (0 < 1)"
Last time I saw this I had to restart the slurmd on that node (a simple scontrol reconfigure was not enough).
I guess this message indicates a discrepancy between the number of GPU resources detected by slurmd at startup and the number specified in slurm.conf (and used by slurmctld).
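To make that guess concrete, here is a minimal Python sketch of the comparison I have in mind. It is not Slurm's actual code. The function names, the node name "node01" and the simplified parsing are my own assumptions, and reading "(0 < 1)" as "found < configured" is my interpretation of the message. It only understands a literal NodeName, Gres=gpu:N (optionally gpu:type:N) and File= entries with a single device or a simple [a-b] range.

```python
import re


def configured_gpus(slurm_conf_text, node):
    """GPU count requested via Gres=gpu:N on the NodeName line for `node`.

    Simplification: NodeName must match literally (no hostlist expansion).
    """
    for raw in slurm_conf_text.splitlines():
        line = raw.split("#", 1)[0]  # drop trailing comments
        name = re.search(r"\bNodeName=(\S+)", line, re.I)
        if name and name.group(1) == node:
            gres = re.search(r"\bGres=\S*?gpu(?::[A-Za-z]\w*)?:(\d+)", line, re.I)
            return int(gres.group(1)) if gres else 0
    return 0


def detected_gpus(gres_conf_text):
    """Number of GPU devices listed on Name=gpu lines of a gres.conf text.

    Simplification: File= is one device or a single [a-b] range.
    """
    total = 0
    for raw in gres_conf_text.splitlines():
        line = raw.split("#", 1)[0]
        if not re.search(r"\bName=gpu\b", line, re.I):
            continue
        dev = re.search(r"\bFile=(\S+)", line, re.I)
        if not dev:
            continue
        span = re.search(r"\[(\d+)-(\d+)\]", dev.group(1))
        total += (int(span.group(2)) - int(span.group(1)) + 1) if span else 1
    return total


def gpu_count_mismatch(slurm_conf_text, gres_conf_text, node):
    """Return a message shaped like the one quoted above, or None if counts agree."""
    wanted = configured_gpus(slurm_conf_text, node)
    found = detected_gpus(gres_conf_text)
    if found < wanted:
        return f"gres/gpu count too low ({found} < {wanted})"
    return None


if __name__ == "__main__":
    # Hypothetical example: slurm.conf asks for one GPU, gres.conf lists none.
    slurm_conf = "GresTypes=gpu\nNodeName=node01 CPUs=8 Gres=gpu:1 State=UNKNOWN\n"
    print(gpu_count_mismatch(slurm_conf, "", "node01"))
```

Feeding it a gres.conf text that contains a line such as "Name=gpu File=/dev/nvidia0 CPUs=0-7" makes the function return None, which corresponds to the state I only reached after the restarts.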
In the end, I even had to restart both slurmd and slurmctld to get the GPUs registered properly (including the CPU specification in the gres.conf). (I then repeated this, just in case the order was important. ;-))
Best, Robbert