Howdy all. We're running GE 2011.11p1, and either I've hit a bug or I'm
misreading the documentation for -hold_jid. The qsub man page states,
regarding -hold_jid, that "If any of the referenced jobs exits with exit
code 100, the submitted job will remain ineligible for execution." But a
simple test seems to contradict that.
1) Submit a job that simply runs "sleep 30" but requests "-l h_rt=10", so
that SGE will kill the job when it exceeds the runtime limit.
2) Submit a second job whose -hold_jid references the first job.
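
The two steps above can be sketched roughly as follows. This is only an
illustration under assumptions: the post does not say how the jobs were
submitted, so the use of "-b y", the job names hold_first and hold_second,
and /bin/true as the second job's command are all placeholders.

```shell
#!/bin/sh
# Sketch of the two-job test. Nothing is submitted until
# submit_hold_test is called on a submit host.
# Assumptions (not from the original post): binary submission via -b y,
# the job names, and /bin/true as the dependent job's command.
submit_hold_test() {
    # Job 1: sleeps 30 s under a 10 s hard runtime limit, so SGE kills it.
    # -terse makes qsub print only the job ID.
    first=$(qsub -terse -b y -N hold_first -l h_rt=10 sleep 30) || return 1

    # Job 2: held via -hold_jid until job 1 has left the system.
    second=$(qsub -terse -b y -N hold_second -hold_jid "$first" /bin/true) || return 1

    echo "first=$first second=$second"
}
```

Once both jobs have finished, running "qacct -j" on each printed job ID
shows the failed and exit_status fields for comparison.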
Given the runtime limit, the first job gets killed and qacct shows:
failed 100 : assumedly after job
However, the second job ends up running anyway. Am I correct in thinking
that it shouldn't?
Thanks.
--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF