There are at least two important things: 1) Task processing time - how long does it take to process one file? If that time is short, the overhead of job submission and scheduling can be large compared to the task processing time itself.
2) Task wait handling - in Grid Engine, when the job script exits, Grid Engine cleans up all the processes spawned by the job. So if you use the second approach, you need to make sure that all child processes have finished before the script exits.

Rayson

On Thu, Feb 2, 2012 at 10:33 AM, Robert Hutton <[email protected]> wrote:
> Hi Everyone,
>
> I've just set up a small Grid Engine cluster, but I'm new to using Grid
> Engine, and would like some advice on the best way to submit jobs that
> in turn submit jobs. What I'd like to do is:
>
> Run a regular shell script that loops over all files in a directory, and
> submits a job for each file. This job will create a derived file. Once
> this file is created, I'd like to submit /two/ jobs for each derived
> file, which can run concurrently with each other and produce two
> separate derived files from the first file.
>
> Is the right approach to have:
>
> - a regular shell script, say run.sh that does a loop over all files and
> submits a job for each with "qsub process.sh $filename"
>
> - the job script called process.sh that runs the command to create the
> derived file, then runs qsub on the two subcommands that rely on the
> output file from the first command, with something like:
> qsub subcommand1 $derived_file > ${derived_file}1
> qsub subcommand2 $derived_file > ${derived_file}2
>
>
> Or would it be better to just have a single job script that requests two
> slots, and does all of the commands, calling subshells for the two
> subcommands? Something like:
>
> for f in directory/*
> do
>     command1 "$f" > "$f.out"
>     (command2 "$f.out" > "$f.out2") &
>     (command3 "$f.out" > "$f.out3") &
> done
>
> Or are these just completely wrong approaches? Is there a better way?
>
> Thanks in advance,
>
> Rob
>
> --
> Robert Hutton
> Senior Systems and Database Administrator
> Centre for Genomics and Global Health <http://cggh.org>
> The Wellcome Trust Centre for Human Genetics
> Roosevelt Drive
> Oxford
> OX3 7BN
> United Kingdom
> Tel: +44 (0)1865 287721
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
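For the single-script approach in the quoted message, the wait handling from point 2 could be sketched roughly as below. This is only a minimal sketch under stated assumptions, not a tested Grid Engine setup: command1, command2 and command3 are the placeholder names from the question, defined here as stand-in shell functions so the script runs on its own; the process_dir name and the *.txt input pattern are hypothetical; and the request for two slots is left out because parallel environment names are site-specific.

```sh
#!/bin/sh
# Stand-in commands so the sketch runs on its own (hypothetical; replace
# them with the real command1, command2 and command3 from the question).
command1() { tr 'a-z' 'A-Z' < "$1"; }
command2() { wc -l < "$1"; }
command3() { wc -c < "$1"; }

# process_dir is a hypothetical helper name. The *.txt pattern is an
# assumption that keeps derived files from being picked up as input.
process_dir() {
    dir=$1
    status=0
    for f in "$dir"/*.txt; do
        [ -f "$f" ] || continue          # skip an unmatched glob
        # First stage must succeed before the dependent steps start.
        command1 "$f" > "$f.out" || { status=1; continue; }

        # Start the two dependent steps in the background and keep their PIDs.
        command2 "$f.out" > "$f.out2" &
        pid2=$!
        command3 "$f.out" > "$f.out3" &
        pid3=$!

        # Block until both children are done, so nothing is still running
        # when the job script exits; record any failure.
        wait "$pid2" || status=1
        wait "$pid3" || status=1
    done
    return "$status"
}

# Run only when a directory is given as the first argument.
[ "$#" -eq 0 ] || process_dir "$1"
```

Waiting inside the loop keeps at most two children running at a time, which matches a job that requests two slots. A single wait after the loop would also prevent the early exit, but it would start two background processes per file all at once.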
