Quoting Reuti <[email protected]> on Fri, 3 Aug 2012 12:16:02 +0200:
Hi,
On 03.08.2012 at 11:55, Gaya Nadarajan wrote:
I'm relatively new to sge, however I have the task of making
optimal use of 7 multi-core VM nodes (5 in a cluster and 2
individual) for a set of tasks. The grid engine is installed in one
of the individual VMs which is serving as the master node.
I have a set of independent jobs (hundreds or thousands), each of
which has 3 subtasks that have to be run in sequence. I would like
a node to be assigned to each chunk of 3 subtasks so that they can
share the data between them.
Reading the sge manual suggests that a resource is really a queue.
Yes, you can say so. You submit a job and SGE selects a queue
instance (i.e. a queue on a particular exechost) for you which
fulfills the resource requests you specified.
Should I create many queues, each with the 3 subtasks so that the
data and task dependencies could be dealt with?
Usually it's best to have as few queues as possible.
Running the jobs in sequence is no problem, you can use -hold_jid
for it. But the "problem" is the temporary data. I assume you want
the temporary data on the local disk and not a shared file space,
where it can be accessed by all nodes. The best would be to assemble
one job from your 3 subtasks instead. This way you can also use the
$TMPDIR, which is created and removed by SGE - preferably on a local
/scratch file space.
So you are saying I should merge the 3 subtasks into one job? Is there
any way this can be expressed in an SGE job submission?
You are right regarding the data dependency: all 3 subtasks depend on
the same data, which is downloaded locally. While I can use -hold_jid to
state the dependency between subtasks 1 and 2, the command for subtask
3 can only be generated after subtask 2 has finished executing. The
parameters to subtask 3 rely on the result produced by subtask 2. So I
can't even submit it with the -hold_jid option, because the command
itself is incomplete until those parameters are known.
Any thoughts on these? Thanks.
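Merging the three subtasks into one job, as suggested above, would also
cover the case where the parameters for subtask 3 come from subtask 2,
because the command line is only built once subtask 2 has finished. A
minimal sketch of such a job script follows. The three subtask functions,
the file names and the --lines option are hypothetical stand-ins, not the
real commands; outside SGE the script falls back to /tmp so it can be
tried locally.

```sh
#!/bin/sh
# chunk.sh: hypothetical job script running all three subtasks as ONE SGE job.
#$ -S /bin/sh
#$ -cwd
set -eu

# Stand-ins for the real subtasks (hypothetical; replace with the actual commands).
subtask1() { printf '3\n4\n5\n' > "$1/data.txt"; }                    # e.g. fetch the data
subtask2() { wc -l < "$1/data.txt" | tr -d ' ' > "$1/result.txt"; }   # produce a result
subtask3() { echo "subtask3 called with: $*"; }                       # consume that result

run_chunk() {
    # Under SGE, $TMPDIR is a per-job scratch directory created and removed by SGE.
    # Outside SGE this falls back to /tmp.
    work=$(mktemp -d "${TMPDIR:-/tmp}/chunk.XXXXXX")
    subtask1 "$work"
    subtask2 "$work"
    # The command line for subtask 3 is built only now, from subtask 2's result.
    params=$(cat "$work/result.txt")
    subtask3 --lines "$params"
    rm -rf "$work"
}

run_chunk
```

Such a script would be submitted once per chunk, e.g. with "qsub chunk.sh",
so SGE schedules each chunk as a single job on a single node.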
Otherwise you would need to tell the other two jobs to run on the
same node the first task was scheduled to (or route them all by hand
at submission time to particular exechosts).
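Routing by hand could look roughly like the sketch below: each chain of
three jobs is pinned to one exechost with "-l hostname=..." and ordered
with -hold_jid. The host name, chunk id and script names (step1.sh etc.)
are made up for illustration. The script is a dry run by default and only
prints the qsub commands; setting QSUB=qsub would submit them.

```sh
#!/bin/sh
# Dry run by default: prints the qsub commands instead of submitting them.
# Set QSUB=qsub to really submit. Host and script names are hypothetical.
QSUB="${QSUB:-echo qsub}"

submit_chain() {
    host=$1   # exechost that should run all three subtasks
    id=$2     # chunk identifier, used to build unique job names
    $QSUB -N "s1_$id" -l hostname="$host" step1.sh
    $QSUB -N "s2_$id" -l hostname="$host" -hold_jid "s1_$id" step2.sh
    $QSUB -N "s3_$id" -l hostname="$host" -hold_jid "s2_$id" step3.sh
}

submit_chain node01 42
```

Note that this only fixes the placement and the order. It does not help
when the command for the third job is not yet known at submission time.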
Also I hear that array task dependencies could be good for such
tasks but not sure if I should use them.
I think this is not suitable for your setup. E.g. you have two array
jobs A and B, each running from [1..10]. With array task dependencies,
B[1] is allowed to start as soon as A[1] has finished, and likewise
for the other 9 task indices.
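For reference, the A/B case described here would be submitted roughly as
follows, assuming a Grid Engine version whose qsub supports -hold_jid_ad.
The script names stepA.sh and stepB.sh are hypothetical, and the sketch
only prints the commands unless QSUB=qsub is set.

```sh
#!/bin/sh
# Dry run by default: prints the qsub commands instead of submitting them.
# Set QSUB=qsub to really submit. Script names are hypothetical.
QSUB="${QSUB:-echo qsub}"

submit_arrays() {
    n=$1   # number of tasks in each array job
    # Array job A with task indices 1..n.
    $QSUB -N A -t "1-$n" stepA.sh
    # Array job B: task i is held until task i of A has finished.
    $QSUB -N B -t "1-$n" -hold_jid_ad A stepB.sh
}

submit_arrays 10
```

The dependency is per task index only. It says nothing about which node
B[i] runs on, so locally stored data from A[i] may not be available there.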
-- Reuti
Any advice or suggestions would be much appreciated.
Many Thanks,
Gaya
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users