Many thanks Matthieu!
Andy
On 02/12/2018 06:42 PM, Matthieu Hautreux wrote:
Hi,
your login node may be under heavy load while starting such a large number of
independent sruns.
This may induce issues not seen under normal load, like partial reads/writes
on sockets, triggering bugs in Slurm in functions not properly protected
against such events.
Quickly looking at the