On 20.09.2012 at 02:08, Joseph Farran wrote:

> What is the recommended way, and/or do scripts exist, for cleaning up once a 
> job completes/dies/crashes on a node?
> 
> I would prefer to do this via the epilog script.

So, the job crashes but continues running?


> I am having a situation where programs go south and are left running on nodes 
> when Grid Engine thinks they are no longer running and they no longer exist 
> in the GE queue.
> 
> I can't be the first asking for such a thing, so I don't want to re-invent 
> the wheel if some script or way already exists for doing this that works.

Are the processes jumping out of the process tree and are no longer bound to 
the sge_shepherd? One thing you can try is:

$ qconf -sconf
...
execd_params                 ENABLE_ADDGRP_KILL=TRUE
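
With this parameter set, Grid Engine uses the additional group ID it attaches to each job (see gid_range) to find and kill the job's processes, even ones that have left the shepherd's process tree. The setting can be edited with `qconf -mconf`. As a rough sketch for checking whether it is already active, something like the following might do; the function name addgrp_kill_enabled is made up for illustration, and the case-insensitive match on TRUE is an assumption on my part:

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): reads `qconf -sconf`
# output on stdin and succeeds if the execd_params line contains
# ENABLE_ADDGRP_KILL=TRUE. Matching case-insensitively is an assumption.
addgrp_kill_enabled() {
    grep -E '^execd_params[[:space:]]' | grep -qi 'ENABLE_ADDGRP_KILL=TRUE'
}

# Only query the cluster if qconf is actually available on this host.
if command -v qconf >/dev/null 2>&1; then
    if qconf -sconf | addgrp_kill_enabled; then
        echo "ENABLE_ADDGRP_KILL is set"
    else
        echo "ENABLE_ADDGRP_KILL is not set; add it with: qconf -mconf"
    fi
fi
```

This only inspects the global configuration as printed by `qconf -sconf`; a host-specific configuration, or a long execd_params value wrapped across several lines, would need extra handling.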

-- Reuti