Re: [gridengine users] Queue concept and resource management for large jobs

Hung-sheng Tsao Mon, 06 Aug 2012 00:37:06 -0700

Reuti is just suggesting that you put all 3 subtasks into one job script.
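
A rough sketch of what such a combined job script could look like (not tested on a real cluster). The functions subtask1, subtask2 and subtask3 are hypothetical placeholders for the real commands, and the file names are made up for illustration. The assumption is that subtask 2 writes the parameters for subtask 3 to a file, so the last command can be built only after subtask 2 has run:

```shell
#!/bin/sh
#$ -N chunk
#$ -cwd
# Sketch only: subtask1..3 are hypothetical placeholders for the real commands.
set -e

# Directory the job was submitted from (SGE sets SGE_O_WORKDIR); results go here.
OUT="${SGE_O_WORKDIR:-$PWD}"

# Private scratch directory inside the per-job $TMPDIR that SGE provides
# (falls back to /tmp so the sketch also runs outside the scheduler).
WORK=$(mktemp -d "${TMPDIR:-/tmp}/chunk.XXXXXX")
cd "$WORK"

subtask1() { echo "input data" > data.txt; }                 # stands in for the download step
subtask2() { wc -c < data.txt | tr -d ' ' > params.txt; }    # its result is what subtask 3 needs
subtask3() { echo "processed with param $1" > result.txt; }  # takes the generated parameter

subtask1
subtask2
PARAMS=$(cat params.txt)    # build subtask 3's arguments only now, after subtask 2 has run
subtask3 "$PARAMS"

# Keep the final result, then clean up the scratch directory.
cp result.txt "$OUT/result.txt"
cd "$OUT"
rm -rf "$WORK"
```

Submitted once per chunk (for example as "qsub chunk.sh", where the script name is again only an example), the three steps run on the same node and share one scratch directory, so no -hold_jid is needed between them.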
Regards


On Aug 6, 2012, at 3:26 AM, Gaya Nadarajan <[email protected]> wrote:

> Quoting Reuti <[email protected]> on Fri, 3 Aug 2012 12:16:02 +0200:
> 
>> Hi,
>> 
>> Am 03.08.2012 um 11:55 schrieb Gaya Nadarajan:
>> 
>>> I'm relatively new to SGE, but I have the task of making optimal use of
>>> 7 multi-core VM nodes (5 in a cluster and 2 individual) for a set of tasks.
>>> Grid Engine is installed on one of the individual VMs, which serves as the
>>> master node.
>>> 
>>> I have a set of independent jobs (hundreds or thousands), each of which has
>>> 3 subtasks that have to be run in sequence. I would like a node to be
>>> assigned to each chunk of 3 subtasks so that they can share the data
>>> between them.
>>> 
>>> Reading the SGE manual, it seems that a resource is really a queue.
>> 
>> Yes, you can say so. You submit a job and SGE selects a queue instance (i.e. 
>> a queue on a particular exechost) for you which fulfills the resource 
>> requests you specified.
>> 
>> 
>>> Should I create many queues, each with the 3 subtasks so that the data and 
>>> task dependencies could be dealt with?
>> 
>> Usually it's best to have as few queues as possible.
>> 
>> Running the jobs in sequence is no problem; you can use -hold_jid for it.
>> But the "problem" is the temporary data. I assume you want the temporary
>> data on the local disk and not in a shared file space where it can be
>> accessed by all nodes. The best approach would be to assemble one job from
>> your 3 subtasks instead. This way you can also use $TMPDIR, which is created
>> and removed by SGE, preferably on a local /scratch file space.
>> 
> 
> Are you saying I should merge the 3 subtasks into one job? Is there any way
> this can be expressed in SGE job submission?
>
> You are right regarding the data dependency: all 3 subtasks depend on the
> same data, which is downloaded locally. While I can use -hold_jid to state
> the dependency between subtasks 1 and 2, the command for subtask 3 can only
> be generated after subtask 2 has finished executing, because the parameters
> to subtask 3 rely on the result produced by subtask 2. So I can't even
> submit it with the -hold_jid option, because the command itself is
> incomplete due to the missing parameters.
> 
> 
> Any thoughts on these? Thanks.
> 
> 
>> Otherwise you would need to tell the other two jobs to run on the same node
>> that the first task was scheduled to (or route them all by hand to
>> particular exechosts during submission).
>> 
>> 
>>> I also hear that array task dependencies could be good for such tasks, but
>>> I'm not sure whether I should use them.
>> 
>> I think this is not suitable for your setup. Array task dependencies are for
>> cases like this: you have two array jobs A and B, each running from [1..10].
>> Then B[1] is allowed to start as soon as A[1] has finished, and similarly
>> for the other 9 job indices.
>> 
>> -- Reuti
>> 
>> 
>>> Any advice or suggestions would be much appreciated.
>>> 
>>> 
>>> Many Thanks,
>>> Gaya
>>> 
>>> 
>>> --
>>> The University of Edinburgh is a charitable body, registered in
>>> Scotland, with registration number SC005336.
>>> 
>>> 
>>> _______________________________________________
>>> users mailing list
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users

