On 02.05.2013 at 23:05, Happy Monk wrote:

> thanks for the details, this is very helpful. 
> 
> So is there any way to run prolog scripts by selective jobs other than having 
> a separate queue ? Would like all these jobs to be in same queue.

If it's set up like I outlined, the prolog won't affect the behavior of other 
jobs unless someone happens to attach a context variable named NEXT_JOB to 
their own job.

In case you want an additional layer of safety:

MAIN_JOB=$(qsub -terse -hold_jid $PREP_JOB -ac NEXT_JOB=PENDING -v prolog=TRUE hostjob.sh)

The prolog will still be executed for every job, but inside it you could check 
for the "prolog" environment variable and exit early if it is absent; or use a 
second context variable for the same purpose:

MAIN_JOB=$(qsub -terse -hold_jid $PREP_JOB -ac NEXT_JOB=PENDING,RUN_PROLOG=ADJUST_HOLD_JID hostjob.sh)
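
For illustration, a minimal sketch of such a prolog in POSIX sh. The function 
names context_value and release_next_job are made up for this example, and it 
assumes that `qstat -j` prints all context variables comma-separated on a 
single "context:" line. It acts only on jobs carrying 
RUN_PROLOG=ADJUST_HOLD_JID and clears the hold list of the job recorded in 
NEXT_JOB by passing job id 0 to `qalter`:

```shell
#!/bin/sh
# Sketch of a queue prolog. Helper names are hypothetical, not part of SGE.

# Print the value of context variable $1, reading `qstat -j` output on stdin.
# Assumption: one line of the form "context:   KEY=VALUE,KEY2=VALUE2".
context_value() {
    sed -n 's/^context:[[:space:]]*//p' | tr ',' '\n' |
        sed -n "s/^$1=//p" | head -n 1
}

# Clear the hold list of the follow-up job named in this job's context.
release_next_job() {
    # If qstat fails, leave the job alone rather than block it.
    info=$(qstat -j "$JOB_ID") || return 0
    # Exit early for jobs which did not ask for this treatment.
    guard=$(printf '%s\n' "$info" | context_value RUN_PROLOG)
    [ "$guard" = "ADJUST_HOLD_JID" ] || return 0
    next=$(printf '%s\n' "$info" | context_value NEXT_JOB)
    case "$next" in
        '') return 0 ;;                      # no follow-up job recorded
        *[!0-9]*)                            # e.g. still PENDING
            echo "NEXT_JOB is '$next', not a job id" >&2
            return 1 ;;
    esac
    # Job id 0 never exists, so this empties the -hold_jid list.
    qalter -hold_jid 0 "$next"
}

# SGE sets JOB_ID for the prolog; do nothing when run outside a job.
if [ -n "${JOB_ID:-}" ]; then
    release_next_job
fi
```

For NEXT_JOB=PENDING the sketch only reports the condition with a non-zero 
return value; whether the prolog should then wait or fail is left open here.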

-- Reuti


> On Thu, May 2, 2013 at 11:35 AM, Reuti wrote:
> Hi,
> 
> On 02.05.2013 at 19:31, Happy Monk wrote:
> 
> > Thanks for the quick reply Reuti.
> >
> > How can I restrict prolog to only certain jobs ? Here is the LSF recipe 
> > that we are trying to implement in SGE
> >
> >
> > #!/bin/bash
> >
> > bsub < preproc.sh
> >
> > echo $LSB_JOBID
> 
> `bsub` can change the value of an environment variable in the actual shell 
> process - interesting. Hence it's more like a sourced script than a started 
> child process with its own environment.
> 
> 
> > bsub -w 'done($LSB_JOBID)' < hostjob.sh
> >
> > echo $LSB_JOBID
> >
> > bsub -w 'started($LSB_JOBID)' < computejob.sh
> >
> > echo $LSB_JOBID
> 
> On the one hand, one could add an additional environment variable to this 
> new job, stating the real condition for each "-hold_jid". But this way the 
> original main job would have to parse the output of all jobs for the 
> existence of this variable.
> 
> Maybe a shorter way could be to add the next job id to the context of the 
> main job. In SGE the context of a job is metadata unrelated to SGE's 
> handling and also unrelated to the job's environment; it's like a comment. I 
> mean:
> 
> #!/bin/sh
> PREP_JOB=$(qsub -terse preproc.sh)
> MAIN_JOB=$(qsub -terse -hold_jid $PREP_JOB hostjob.sh)
> NEXT_JOB=$(qsub -terse -hold_jid $MAIN_JOB computejob.sh)
> qalter -ac NEXT_JOB=$NEXT_JOB $MAIN_JOB
> 
> Then the prolog has to scan the output of `qstat -j $JOB_ID`, i.e. for its 
> own job number, to see whether there is an entry like:
> 
> context:                    NEXT_JOB=1234
> 
> and if so, use `qalter` to remove the hold only from this job id.
> 
> NB: In principle there is a race condition with this setup: if the prolog of 
> the main job runs before the `qalter` for the follow-up job has been applied, 
> it will not find the NEXT_JOB entry. But this would mean that `preproc.sh` 
> has almost no runtime and the scheduled hostjob.sh starts more or less 
> instantly. If this could happen, the script needs to be adjusted:
> 
> #!/bin/sh
> PREP_JOB=$(qsub -terse preproc.sh)
> MAIN_JOB=$(qsub -terse -hold_jid $PREP_JOB -ac NEXT_JOB=PENDING hostjob.sh)
> NEXT_JOB=$(qsub -terse -hold_jid $MAIN_JOB computejob.sh)
> qalter -sc NEXT_JOB=$NEXT_JOB $MAIN_JOB
> 
> Then the prolog could wait or raise an error if it sees NEXT_JOB=PENDING 
> instead of a job id there.
> 
> -- Reuti
> 
> 
> > On Thu, May 2, 2013 at 9:00 AM, Reuti wrote:
> > Hi,
> >
> > On 02.05.2013 at 17:10, Happy Monk wrote:
> >
> > > Is there any way to release the hold of a job immediately after the job 
> > > it depends on has started? Usually this hold is released only after that 
> > > job has finished.
> > >
> > > This function is available in LSF; I am checking whether it's also 
> > > available in SGE.
> >
> > Not directly. You could use a queue prolog to remove the job that is just 
> > starting from the hold list of all jobs which depend on it. This requires 
> > that all exechosts are also submission hosts.
> >
> > To remove a complete -hold_jid list, you can pass the job id 0 to 
> > `qalter`. As this job id will never be a real job, it always satisfies the 
> > condition of being completed already.
> >
> > -- Reuti
> >
> 
> 

