Hi,
On 29.09.2011 at 15:46, [email protected] wrote:
Check the "job_is_first_task" variable in your PE definition. That
changes the behavior of the master job with respect to SGE.
Correct. But to clarify: the output shows only what has been granted
to run where, not what is actually running there. In fact, if it
reflected the allocation actually in use, it could get confusing when
a parallel jobscript has several serial steps inside: you would see
slave allocations come and go.
Which setting of "job_is_first_task" your job needs depends on the
parallel library actually used. In short: if set to "no" (FALSE), it
allows an additional local `qrsh -inherit ...` to be issued by the
jobscript on the master node of the parallel job (i.e. one
`qrsh -inherit ...` for each granted slot).
Some parallel libraries nowadays use threads instead of a local
`qrsh -inherit ...`. Others are happy with just one
`qrsh -inherit ...` per node, although SGE would allow more if they
got several slots on a slave node. Unfortunately there is no setting
right now to reflect this.
https://arc.liv.ac.uk/trac/SGE/ticket/197
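For reference, a PE definition with "job_is_first_task" switched off
might look roughly like the sketch below. The PE name, slot count,
start/stop procedures and allocation rule are made-up placeholders,
not taken from the original poster's cluster; only the field names
follow sge_pe(5). You can display your own definition with
`qconf -sp <pe_name>` and edit it with `qconf -mp <pe_name>`.

```text
pe_name            mpi_example
slots              64
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
```

With "job_is_first_task FALSE" the jobscript itself is not counted as
one of the parallel tasks, which is why the master entry appears in
addition to the granted slots. "control_slaves TRUE" is assumed here
because `qrsh -inherit ...` only works in a tightly integrated PE.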
-- Reuti
--g
From: Mohamed Adel <[email protected]>
To: "[email protected]" <[email protected]>
Date: 09/29/2011 05:40 AM
Subject: [gridengine users] submitting paralle jobs
Sent by: [email protected]
Dear all,
Does anyone know why, when submitting a parallel job using a PE, the
master job runs as an extra job and is not counted?
For example, when submitting a job which asks for 8 cores, I see 9
jobs running using “qstat -g t”: a master job and 8 slaves!
Thanks in advance,
--ma
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users