Thanks for the details, this is very helpful. Is there any way to run prolog scripts only for selected jobs, other than having a separate queue? We would like all these jobs to be in the same queue.

On Thu, May 2, 2013 at 11:35 AM, Reuti <[email protected]> wrote:

> Hi,
>
> On 02.05.2013 at 19:31, Happy Monk wrote:
>
> > Thanks for the quick reply Reuti.
> >
> > How can I restrict the prolog to only certain jobs? Here is the LSF
> > recipe that we are trying to implement in SGE:
> >
> > #!/bin/bash
> >
> > bsub < preproc.sh
> >
> > echo $LSB_JOBID
>
> `bsub` can change the value of an environment variable in the actual shell
> process - interesting. Hence it's more like a sourced script than a started
> child process with its own environment.
>
> > bsub -w 'done($LSB_JOBID)' < hostjob.sh
> >
> > echo $LSB_JOBID
> >
> > bsub -w 'started($LSB_JOBID)' < computejob.sh
> >
> > echo $LSB_JOBID
>
> On the one hand, one could add an additional environment variable to this
> new job, where the real condition for each "-hold_jid" is stated. But this
> way it would be necessary for the original main job to parse all outputs of
> all jobs for the existence of this variable.
>
> Maybe a shorter way could be to add the next job id to the context of the
> main job. In SGE the context of a job is metadata unrelated to SGE's
> handling and also unrelated to the job's environment, it's like a comment.
> I mean:
>
> #!/bin/sh
> PREP_JOB=$(qsub -terse preproc.sh)
> MAIN_JOB=$(qsub -terse -hold_jid $PREP_JOB hostjob.sh)
> NEXT_JOB=$(qsub -terse -hold_jid $MAIN_JOB computejob.sh)
> qalter -ac NEXT_JOB=$NEXT_JOB $MAIN_JOB
>
> Then the prolog has to scan the output of `qstat -j $JOB_ID`, i.e. for its
> own job number, and check whether there is an entry like:
>
> context: NEXT_JOB=1234
>
> and if yes, use `qalter` to apply the removal only to this job id.
>
> NB: In principle there is a race condition with this setup: if the prolog
> of the main job runs before `qalter` for the follow-up job was applied, it
> will not find the entry and the hold is never removed. But this would mean
> that `preproc.sh` has almost no runtime and the scheduled hostjob.sh starts
> more or less instantly. If this could happen, it needs to be adjusted:
>
> #!/bin/sh
> PREP_JOB=$(qsub -terse preproc.sh)
> MAIN_JOB=$(qsub -terse -hold_jid $PREP_JOB -ac NEXT_JOB=PENDING hostjob.sh)
> NEXT_JOB=$(qsub -terse -hold_jid $MAIN_JOB computejob.sh)
> qalter -sc NEXT_JOB=$NEXT_JOB $MAIN_JOB
>
> Then the prolog could wait or raise an error if it sees NEXT_JOB=PENDING
> instead of a job id there.
>
> -- Reuti
>
> > On Thu, May 2, 2013 at 9:00 AM, Reuti <[email protected]> wrote:
> > Hi,
> >
> > On 02.05.2013 at 17:10, Happy Monk wrote:
> >
> > > Is there any way to release the hold on a job immediately after the
> > > job it depends on has started? Usually this hold is released only
> > > after that job has finished executing.
> > >
> > > This function is available in LSF, and we are checking whether it's
> > > also available in SGE.
> >
> > Not directly. You could use a queue prolog to remove the job that is
> > just starting from the hold list of all jobs which depend on it. This
> > makes it necessary that all exechosts are also submission hosts.
> >
> > To remove a complete -hold_jid list, you can give the job id 0 there to
> > `qalter`. As this job id will never be a real job, it always satisfies
> > the condition as being completed already.
> >
> > -- Reuti
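
For readers of the archive, the prolog logic Reuti outlines in the quoted thread could be sketched roughly as below. This is only a sketch under assumptions, not a tested prolog: the helper name `next_job_from_context` and the function `release_next_job` are made up for illustration, the parsing assumes that `qstat -j` prints a single `context:` line with comma-separated `NAME=value` pairs, and exit status 100 (job error state, as far as sge_conf(5) describes it) should be verified for your installation. Because the script does nothing when a job carries no `NEXT_JOB` context entry, it can stay attached to the common queue and still act only on the selected jobs.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat -j <id>` output on stdin and print the
# value of the NEXT_JOB context entry, or nothing if there is none.
# Assumes a line of the form "context:   A=1,NEXT_JOB=1234".
next_job_from_context() {
    sed -n 's/^context:[[:space:]]*//p' \
        | tr ',' '\n' \
        | sed -n 's/^NEXT_JOB=//p' \
        | head -n 1
}

# Hypothetical prolog body: release the follow-up job named in our context.
release_next_job() {
    next=$(qstat -j "$JOB_ID" | next_job_from_context)
    case "$next" in
        "")
            # No context entry: an ordinary job, nothing to do.
            return 0 ;;
        PENDING)
            # The submit script has not yet stored the real job id.
            echo "NEXT_JOB still PENDING for job $JOB_ID" >&2
            return 100 ;;
        *)
            # Job id 0 is never a real job, so this clears the hold list.
            qalter -hold_jid 0 "$next" ;;
    esac
}

# Only act when SGE started us as a prolog (JOB_ID set) and qstat exists.
if [ -n "${JOB_ID:-}" ] && command -v qstat >/dev/null 2>&1; then
    release_next_job || exit $?
fi
```

Instead of returning an error on `PENDING`, the prolog could also sleep and re-check a few times, which is the "wait" variant mentioned in the thread.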

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
