I do not know what your use case is, but perhaps a job submit plugin would satisfy your needs:
http://www.schedmd.com/slurmdocs/job_submit_plugins.html
That code gets executed directly in the slurmctld daemon. There is also a PrologSlurmctld script that could be used. Another option would be to modify SLURM's code directly, perhaps a minor change to select_p_job_begin in the select plugin:
http://www.schedmd.com/slurmdocs/selectplugins.html
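For reference, the slurmctld-side prolog is enabled with a single slurm.conf parameter; a minimal sketch (the script path is an assumed example, not from the original message):

```shell
# slurm.conf fragment: run this script inside slurmctld whenever a job
# is allocated resources (path is a hypothetical example)
PrologSlurmctld=/etc/slurm/prolog_slurmctld.sh
```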

Processing 100 requests at the same time should not be a problem.

Quoting Arnau Bria <[email protected]>:

Hi all,

Some time ago I asked about the existence of a cpu_factor concept in
slurm. Since time normalization does not seem to be implemented in
slurm, I'm looking for alternatives (this is essential for our scenario).

** Our time limits come from the partition (queue) defaults. So, if a job
goes to the long queue, it gets that queue's hard limit. We use slurmdb.

Another experienced admin suggested using a prolog to modify the job's
time limit. Something like:

squeue -h -o %l -j $SLURM_JOB_ID
[...my operations...]
scontrol update jobid=$SLURM_JOB_ID timelimit=<new time>

This is simple enough, and works great.
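To make the prolog idea concrete, here is a hedged sketch of the conversion step such a script needs. The to_minutes helper, the CPU_FACTOR value, and the final scontrol call are my own illustration (cpu_factor is a site-defined assumption, not a SLURM setting); squeue's %l field prints limits as MM, MM:SS, HH:MM:SS, or D-HH:MM:SS:

```shell
#!/bin/bash
# Sketch of a prolog that rescales a job's time limit by a hypothetical
# per-cluster "cpu_factor". Assumes bash (uses local and 10# arithmetic).

# Convert a squeue %l time string (MM, MM:SS, HH:MM:SS, or D-HH:MM:SS)
# to whole minutes, dropping any seconds.
to_minutes() {
    local t=$1 days=0 have_days=0
    case $t in
        *-*) days=${t%%-*}; t=${t#*-}; have_days=1 ;;
    esac
    local IFS=:
    set -- $t                       # split remaining fields on ':'
    local mins=0
    if [ "$have_days" -eq 1 ]; then
        # With a day field present the remainder is HH[:MM[:SS]].
        mins=$(( 10#$1 * 60 + 10#${2:-0} ))
    else
        case $# in
            1|2) mins=$(( 10#$1 )) ;;              # MM or MM:SS
            *)   mins=$(( 10#$1 * 60 + 10#$2 )) ;; # HH:MM:SS
        esac
    fi
    echo $(( days * 1440 + mins ))
}

# Example use inside the prolog (commented out: needs a live slurmctld):
# limit=$(squeue -h -o %l -j "$SLURM_JOB_ID")
# CPU_FACTOR=2                      # hypothetical site-specific scale
# new=$(( $(to_minutes "$limit") * CPU_FACTOR ))
# scontrol update jobid="$SLURM_JOB_ID" timelimit="$new"
```

scontrol accepts a plain integer for timelimit and interprets it as minutes, so converting everything to minutes keeps the update step simple.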

But now I'm wondering if this method is robust enough. For example, what
could happen if more than 100 (a low bound) jobs start at once? How many
concurrent connections can scontrol handle?

Maybe it's better to implement it inside slurm's code...

Any suggestion is welcome.

TIA,
Arnau



