Hi Josh,

Josh England <[email protected]> writes:
> We're running slurm-2.2.4 on CentOS-5.5 using sched/wiki to interface to
> a custom scheduler, and there seems to be a bug happening anytime a job
> is requeued in slurm (either manually or due to node failure).
This sounds familiar. I think this is the same issue that we found and
fixed for sched/wiki2 back in May (included in 2.2.6).
Please look at this thread:
https://groups.google.com/group/slurm-devel/browse_thread/thread/8c2e072f94873103?pli=1
The fix is a one-line change that resets a job's priority in
/src/plugins/sched/wiki2/job_requeue.c when the job is requeued:
https://github.com/paran1/slurm/commit/8212b71ec7480cf8bf292fefdb5547bc4a79dbc2
You will most likely have to add something very similar to the
sched/wiki plugin code.
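To illustrate the idea only, here is a rough, untested sketch. The struct and function names are made up for this example and are not the real slurmctld types. The reset value of 0 (treated here as "held until the external scheduler starts the job") is also an assumption on my part, so please check the commit above for the actual line.

```c
#include <stdint.h>

/* Hypothetical stand-in for a job record; NOT the real slurmctld struct. */
struct demo_job {
	uint32_t job_id;
	uint32_t priority;	/* in this sketch, 0 is taken to mean "held" */
};

/*
 * Hypothetical requeue hook. After the job has been put back in the
 * queue, reset its priority so that the stale value from the previous
 * run does not carry over and the external scheduler decides when the
 * job starts again.
 */
void demo_requeue(struct demo_job *job)
{
	/* ... the actual requeue work would happen here ... */

	job->priority = 0;	/* the one-line reset; the value is an assumption */
}
```

Calling demo_requeue() on a job that previously ran with a non-zero priority leaves it with priority 0 in this sketch, which is the kind of state the sched/wiki plugin would need to restore as well.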
Regards,
Pär Andersson
NSC
