Hi,
I've got slurm-2.1.15 running and logging to MySQL. We got these errors:
error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
error: slurmdbd: agent queue is full, discarding request
slurmdbd was running and logging job requests. I restarted slurmdbd, then
restarted slurmctld. The connection appears to have been re-established: there
are no errors in the logs, and the MySQL database is being populated. Almost
every field is filled in (JobId, Submit Time, timelimit), but the start and end
times are all zeros. I checked the database for a job that started and
finished with a COMPLETED code after I went through the steps above, and its
times are zero as well. What could be the reason for not recording those times?
Is it advisable to start slurmdbd before slurmctld?
Also, what does this error in slurmctld.log mean?
error: step_partial_comp: JobID=XXXX last=1, nodes=1
Thanks!
-Aron