On Mon, Feb 6, 2017 at 10:17 AM, Hans-Nikolai Viessmann <h...@hw.ac.uk> wrote:
>
> I had just added the DebugFlags setting to slurm.conf on the head node
> and did not synchronise it with the nodes. I doubt that this could cause
> the problem I described, as it was occurring before I made the change to
> slurm.conf.
>
> One thing I did notice is this error occurring every once in a while:
>
> [2016-12-30T17:36:50.963] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu07
> [2016-12-30T17:36:50.963] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu04
> [2016-12-30T17:36:50.963] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu01
> [2016-12-30T17:36:50.963] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu05
> [2016-12-30T17:36:50.963] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu02
> [2016-12-30T17:36:50.964] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu06
> [2016-12-30T17:36:50.966] error: gres_plugin_node_config_unpack: gres/gpu
> lacks File parameter for node gpu03
>
> Is it possible that I need to specify the Gres Type for the other nodes as
> well, even though they have only one GPU each?


I'm not an expert, but I believe your gres.conf is incorrect. Ours
looks like this:

name=hostname file=/dev/nvidia0 type=k10

I think the issue is that Slurm tries to match each node's hostname
against the entries in gres.conf to find the GPU device file, and it
can't find a match for your nodes.
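
For reference, here is a rough sketch of the fuller form described in the
gres.conf man page, adapted to the node names in your log. The device
path and the k10 type are just copied from our setup, so treat both as
assumptions and replace them with whatever your hardware actually has:

```
# gres.conf (sketch; File and Type are assumptions, adjust per node)
NodeName=gpu[01-07] Name=gpu Type=k10 File=/dev/nvidia0

# slurm.conf (only the GRES-related parts; keep your existing
# CPU/memory settings on the NodeName line)
GresTypes=gpu
NodeName=gpu[01-07] Gres=gpu:k10:1
```

As far as I know slurmd reads gres.conf locally on each compute node, so
the file has to be present on the nodes themselves and not only on the
head node.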
