Dear all,
I am writing to ask you a question.
Is it possible to retrieve the status of cleared jobs (i.e. jobs that have
completed, whether successfully or with a failure) from the Slurm REST
interface?
Once a job (job ID 131 in the example below) has been cleared, some time
after its completion, the REST interface returns this:
{
  "meta": {
    "plugin": {"type": "openapi/v0.0.36", "name": "REST v0.0.36"},
    "Slurm": {
      "version": {"major": 20, "micro": 7, "minor": 11},
      "release": "20.11.7"
    }
  },
  "errors": [
    {"error": "_handle_job_get: unknown job 131", "error_code": 0}
  ],
  "jobs": []
}
I have enabled job accounting storage in MySQL, where the job is still
recorded:
sacct -j 131
       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
131          testjob.sh     cirasa                     2  COMPLETED      0:0
131.batch         batch                                2  COMPLETED      0:0
131.0          hostname                                2  COMPLETED      0:0
131.1             sleep                                2  COMPLETED      0:0
However, the REST service does not seem to pick up the status from the
accounting database.
Do you have any hints?
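
As an illustration only, a client-side fallback could look roughly like the
Python sketch below: it detects the "unknown job" reply from the REST
interface and then reads the state from the accounting database through
sacct. The function names are hypothetical, and the sketch assumes that sacct
is available on the machine running the client; only the JSON fields from the
reply above and the standard sacct options (-j, --parsable2, --noheader,
--format) are taken from Slurm.

```python
import subprocess

# Hypothetical helper names, chosen for this sketch only.


def rest_job_missing(response: dict) -> bool:
    """Return True when the REST reply lists no jobs and reports 'unknown job'."""
    if response.get("jobs"):
        return False
    return any(
        "unknown job" in err.get("error", "")
        for err in response.get("errors", [])
    )


def parse_sacct(output: str) -> dict:
    """Map JobID -> (State, ExitCode) from pipe-separated sacct output."""
    states = {}
    for line in output.splitlines():
        fields = line.strip().split("|")
        if len(fields) != 3:
            continue  # skip blank or unexpected lines
        job_id, state, exit_code = fields
        states[job_id] = (state, exit_code)
    return states


def sacct_states(job_id: int) -> dict:
    """Query the accounting database via the sacct command-line tool."""
    cmd = [
        "sacct", "-j", str(job_id),
        "--parsable2", "--noheader",
        "--format=JobID,State,ExitCode",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_sacct(result.stdout)
```

A client would call sacct_states(131) only when rest_job_missing() is true for
the REST reply, so the accounting database is consulted just for jobs that
have already been cleared.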
Two further questions, to understand this better:
- For how many seconds does a completed job remain available to be queried
from squeue or the REST API methods? Can this "time-to-live before cleanup"
be configured, and possibly increased a bit? This would help avoid polling
the status very frequently.
- Is there a push mechanism for sending the job status to external web
services, rather than polling it with the REST API methods?
Thanks very much for your help,
Cheers,
Simone
PS: I am using Slurm v20.11.7 on CentOS 7.
****************************************************************
Simone Riggi, PhD
INAF, Osservatorio Astrofisico di Catania
Via S. Sofia 78
95123, Catania - Italy
phone: +39 095 7332 extension 282 (or 310)
e-mail: [email protected]
skype: simone.riggi
****************************************************************