On 6 August 2012 11:33, Hung-sheng Tsao <[email protected]> wrote:
> Keep in mind
> that you can always copy the result back to the central server,
> then copy it to the new execd host,
> so you do not need to run the next job on the same host.
> Regards

I would assume there is sufficient data involved that moving it around
would be expensive.

William

> Sent from my iPhone
>
> On Aug 6, 2012, at 4:33 AM, Gaya Nadarajan <[email protected]> wrote:
>
>> Quoting William Hay <[email protected]> on Mon, 6 Aug 2012 08:51:14 +0100:
>>
>>> On 6 August 2012 08:26, Gaya Nadarajan <[email protected]> wrote:
>>>> Quoting Reuti <[email protected]> on Fri, 3 Aug 2012 12:16:02 +0200:
>>>>
>>>>> Hi,
>>>>>
>>>>> On 03.08.2012 at 11:55, Gaya Nadarajan wrote:
>>>>>
>>>>>> I'm relatively new to SGE; however, I have the task of making
>>>>>> optimal use of 7 multi-core VM nodes (5 in a cluster and 2
>>>>>> individual) for a set of tasks. The grid engine is installed on one
>>>>>> of the individual VMs, which is serving as the master node.
>>>>>>
>>>>>> I have a set of independent jobs (hundreds or thousands), each of
>>>>>> which has 3 subtasks that have to be run in sequence. I would like
>>>>>> a node to be assigned to the chunk of 3 jobs so that they can share
>>>>>> the data between them.
>>>>>>
>>>>>> Reading the SGE manual suggests that a resource is really a queue.
>>>>>
>>>>> Yes, you can say so. You submit a job and SGE selects a queue
>>>>> instance (i.e. a queue on a particular exechost) for you which
>>>>> fulfills the resource requests you specified.
>>>>>
>>>>>> Should I create many queues, each with the 3 subtasks so that the
>>>>>> data and task dependencies could be dealt with?
>>>>>
>>>>> Usually it's best to have as few queues as possible.
>>>>>
>>>>> Running the jobs in sequence is no problem; you can use -hold_jid
>>>>> for it. But the "problem" is the temporary data. I assume you want
>>>>> the temporary data on the local disk and not in a shared file space,
>>>>> where it can be accessed by all nodes. The best would be to assemble
>>>>> one job from your 3 subtasks instead. This way you can also use
>>>>> $TMPDIR, which is created and removed by SGE, preferably on a local
>>>>> /scratch file space.
>>>>>
>>>>
>>>> What you are saying is merge the 3 subtasks into one job?
>>>> Is there any way this can be expressed in SGE job submission?
>>>
>>> We're suggesting you avoid using grid engine features for this and
>>> just write a fancy job script, something like:
>>>
>>> #!/bin/bash
>>> #$ -l h_rt=72:0:0
>>> #$ -l h_vmem=2G
>>> scp user@host:/path/to/data.tar.gz "${TMPDIR}"
>>> cd "${TMPDIR}" || exit 1
>>> tar -xvzf data.tar.gz
>>> task1
>>> task2
>>> $(generate command for task 3)
>>>
>>> If this doesn't work for you, then by making your worker nodes submit
>>> nodes as well (or ssh'ing back to a submit node)
>>> you could have each job qsub its successor using -l h=$(hostname) to
>>> ensure it ends up on the same node.
>>
>> I didn't give enough background on my work. My program itself generates the
>> tasks automatically (using planning) from other programs invoking it with
>> options passed in, schedules the tasks for execution and monitors their
>> progress (and updates a DB accordingly). I'm using the DRMAA_API to dispatch
>> jobs to SGE. I'm trying to control this dependency from within my program
>> itself, which is proving a bit tricky.
>>
>> This sounds like a good option: instead of generating and dispatching tasks
>> for execution one by one using SGE setting options, I should generate a
>> script for each chunk of 3 subtasks and send the script for execution.
>> Generating task 3 would require a callback to my program, which is not that
>> straightforward right now, but I should figure out what would be best to get
>> things to work.
>>
>> Thanks and do post any more comments or suggestions.
>>
>> Gaya
>>
>>>
>>>>
>>>> You are right regarding the data dependency: all 3 subtasks depend on
>>>> the same data, which is downloaded locally. While I can use -hold_jid to
>>>> state the dependency between subtasks 1 and 2, the command for subtask
>>>> 3 can only be generated after subtask 2 has finished executing. The
>>>> parameters to subtask 3 rely on the result produced by subtask 2.
>>>> So I can't even run it with the -hold_jid option because the command
>>>> itself is not complete due to missing parameters.
>>>>
>>>> Any thoughts on these? Thanks.
>>>>
>>>>> Otherwise you would need to tell the other two jobs to run on the
>>>>> same node the first task was scheduled to (or route all by hand
>>>>> during the submission to individual particular exechosts).
>>>>>
>>>>>> Also I hear that array task dependencies could be good for such
>>>>>> tasks, but I'm not sure if I should use them.
>>>>>
>>>>> I think this is not suitable for your setup. E.g. you have two array
>>>>> jobs A and B, each running from [1..10]. Then B[1] should be allowed
>>>>> to start as soon as A[1] finished. Similar for the other 9 job
>>>>> indices.
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>> Any advice or suggestions would be much appreciated.
>>>>>>
>>>>>> Many Thanks,
>>>>>> Gaya
>>>>>>
>>>>>> --
>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>> Scotland, with registration number SC005336.
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> [email protected]
>>>>>> https://gridengine.org/mailman/listinfo/users
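
For reference, the single-job approach discussed in this thread could be sketched roughly as below: all three subtasks run in one scratch directory, and the arguments for task 3 are read from whatever task 2 wrote, so nothing has to be resubmitted. This is only an illustration under assumptions. The names task1, task2, task3, run_chunk and the stage*.out files are hypothetical placeholders, not part of SGE or of the script quoted above, and the task bodies are trivial stand-ins for the real programs.

```shell
#!/bin/bash
# Sketch only: task1, task2 and task3 are hypothetical stand-ins for the
# real subtasks. Replace their bodies with the actual commands.

task1() { tr 'a-z' 'A-Z' < "$1" > "$2"; }       # input file -> intermediate file
task2() { wc -l < "$1" | tr -d ' ' > "$2"; }    # intermediate -> parameter file
task3() { printf 'lines=%s\n' "$1" > "$2"; }    # needs a parameter produced by task2

run_chunk() {
    # $1: input file
    # $2: scratch directory (inside an SGE job this would be $TMPDIR)
    local input="$1" work="$2" params
    task1 "$input" "$work/stage1.out" || return 1
    task2 "$work/stage1.out" "$work/stage2.out" || return 1
    # The arguments for task 3 are only known at this point,
    # so they are read from task 2's output before task 3 is started.
    params=$(cat "$work/stage2.out") || return 1
    task3 "$params" "$work/stage3.out"
}
```

Inside a job script one would call something like run_chunk data.txt "$TMPDIR" after copying the input into $TMPDIR. Because task 3's arguments are computed inside the job, no second submission and no -hold_jid are needed for the chunk.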
