Howdy all. We're running GE 2011.11p1, and either I've hit a bug or I'm misreading the documentation for -hold_jid. The qsub man page states, regarding -hold_jid, that "If any of the referenced jobs exits with exit code 100, the submitted job will remain ineligible for execution." But a simple test seems to contradict that.

1) Submit a job that simply runs "sleep 30", but include "-l h_rt=10" so
   that SGE will kill the job.

2) Submit a second job where -hold_jid references the first job.

Given the runtime limit, the first job gets killed and qacct shows:

failed       100 : assumedly after job

However, the second job ends up running anyway. Am I correct in thinking that it shouldn't do so?
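
For anyone wanting to reproduce this, the test can be sketched roughly as below. This is only an approximation of what I ran, not the exact commands: the job names (holdtest_first, holdtest_second) and the use of "hostname" as the second job's payload are made up for illustration, and it assumes a qsub that supports -terse and -b y.

```shell
#!/bin/sh
# Rough sketch of the two-job -hold_jid test. Job names and the
# "hostname" payload are hypothetical placeholders.

# Job 1: sleeps 30 s under a 10 s hard runtime limit, so SGE kills it.
# -terse prints only the job ID; -b y submits "sleep" as a binary.
submit_first() {
    qsub -terse -b y -N holdtest_first -l h_rt=10 sleep 30
}

# Job 2: held until the job whose ID is passed as $1 has finished.
submit_second() {
    qsub -terse -b y -N holdtest_second -hold_jid "$1" hostname
}

# Print only the "failed" and "exit_status" accounting lines for job $1.
show_status() {
    qacct -j "$1" | grep -E '^(failed|exit_status)'
}
```

Running first=$(submit_first) and then submit_second "$first" queues the pair; once both have left the queue, show_status "$first" shows the "failed" line for the killed job.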

Thanks.

--
Joshua Baker-LePain
QB3 Shared Cluster Sysadmin
UCSF