Thanks William, Reuti and Dave.
I will try the suggestions made here.
Joseph
On 09/20/2012 02:13 AM, Reuti wrote:
On 20.09.2012 at 02:08, Joseph Farran wrote:
What is the recommended way to clean up once a job completes/dies/crashes on a
node, and do scripts exist for doing this?
I would prefer to do this via the epilog script.
So, the job crashes but continues running?
I am having a situation where programs go south and are left running on nodes
when Grid Engine thinks they are no longer running and they no longer exist in
the GE queue.
I can't be the first asking for such a thing, so I don't want to re-invent the
wheel if a working script or method already exists for doing this.
Are the processes jumping out of the process tree so that they are no longer
bound to the sge_shepherd? One thing you can try is:
$ qconf -sconf
...
execd_params ENABLE_ADDGRP_KILL=TRUE
-- Reuti
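
For the epilog route asked about above, a minimal sketch (not an official Grid Engine script) of finding leftover processes by a job's additional group ID could look like the following. It assumes a Linux /proc layout, that execd tags each job's processes with a supplementary group ID (the mechanism ENABLE_ADDGRP_KILL relies on), and that this ID can be read from an addgrpid file under $SGE_JOB_SPOOL_DIR. The function names has_gid and pids_with_gid are hypothetical; verify the paths on your own installation before relying on any of this.

```shell
#!/bin/sh
# Sketch only: helpers for locating processes that carry a given
# supplementary group ID. Assumes Linux, where /proc/<pid>/status
# contains a "Groups:" line listing supplementary GIDs.

# has_gid STATUS_FILE GID
# Exit status 0 if GID appears on the "Groups:" line of STATUS_FILE.
has_gid() {
    awk -v gid="$2" '
        $1 == "Groups:" { for (i = 2; i <= NF; i++) if ($i == gid) found = 1 }
        END { exit(found ? 0 : 1) }
    ' "$1" 2>/dev/null
}

# pids_with_gid GID [PROC_ROOT]
# Print the PID of every process under PROC_ROOT (default /proc)
# whose supplementary groups include GID.
pids_with_gid() {
    root="${2:-/proc}"
    for status in "$root"/[0-9]*/status; do
        # Processes may exit between the glob and the read; skip those.
        [ -r "$status" ] || continue
        if has_gid "$status" "$1"; then
            pid="${status%/status}"
            echo "${pid##*/}"
        fi
    done
}

# Possible use inside an epilog (assumption: the addgrpid file exists
# in the job spool directory on your installation):
#
#   gid=$(cat "$SGE_JOB_SPOOL_DIR/addgrpid")
#   for p in $(pids_with_gid "$gid"); do kill -TERM "$p"; done
```

Matching on the group ID rather than on the process tree is what lets this catch processes that have detached from sge_shepherd. It is worth sending TERM first and only escalating to KILL after a grace period, and logging the PIDs found so that misbehaving applications can be identified.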