I would like to implement Slurm in my current HPC system.
I have many jobs, divided into job arrays, which push me past Slurm's limit of
about 67 million job IDs.
I've looked into the source code, and it looks like the IDs are reused (they
wrap around roughly every 67 million jobs), but Slurm can tell identical IDs
apart with the help of another unique ID in the accounting DB (called db_index).
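
To illustrate what I mean by the wraparound, here is a toy model of my
understanding. It is not Slurm code: MAX_JOB_ID is a tiny stand-in for the real
ceiling, and submit_jobs and its db_index counter are hypothetical names that
only mimic an ID that wraps next to a key that never repeats.

```python
from itertools import count

FIRST_JOB_ID = 1
MAX_JOB_ID = 8  # tiny stand-in for the ~67 million ceiling, so the wrap is visible


def submit_jobs(n):
    """Yield (db_index, job_id) pairs: job_id wraps around, db_index never repeats."""
    db_index = count(1)  # hypothetical ever-increasing accounting key
    span = MAX_JOB_ID - FIRST_JOB_ID  # number of distinct job IDs before wrapping
    for i in range(n):
        job_id = FIRST_JOB_ID + i % span
        yield next(db_index), job_id


records = list(submit_jobs(10))
for db_idx, job_id in records:
    print(f"db_index={db_idx} job_id={job_id}")
```

In this model the eighth submission gets job_id 1 again, while its db_index
is still unique, which is the property I would like to rely on.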
So I understand that I can submit more than 67 million jobs, but is it possible
to use the truly unique ID from the accounting DB for Slurm operations, for
example to check a job's status?
Will it work if I don't use the accounting DB at all?
Also, I have an external application that manages the jobs sent to the
scheduler. Is it OK for it to rely on db_index for managing jobs in Slurm (is
the accounting DB always up to date)?
Thanks in advance,