On 01.03.2013 at 11:07, Dave Love wrote:

> Kshitiz B <[email protected]> writes:
> 
>> How can we distinguish between the following scenarios that lead to job
>> deletion:
>> 
>> 1. Slave Node Failure
>> 2. Master/Shepherd Node Failure

You mean a crash of the node?


>> 3. Job deleted by User
>> 4. Job deleted by Admin
> 
> Look in the qmaster messages file, or possibly via the reporting file --

A prerequisite might be to set in the SGE configuration:

$ qconf -sconf
...
loglevel                     log_info

to see `qdel` events.
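
A quick way to check this setting from a script is to parse the output of `qconf -sconf`. The following is only a minimal sketch: the function names `parse_sconf` and `logs_info_messages` are made up for illustration, and it assumes each setting is printed as a key followed by whitespace and a value on a single line, as in the excerpt above (values continued over several lines are not handled).

```python
def parse_sconf(text):
    """Turn 'key   value' lines, as printed by `qconf -sconf`, into a dict."""
    settings = {}
    for line in text.splitlines():
        # Split into the key and the rest of the line; lines with a single
        # token (for example a header line) are ignored.
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


def logs_info_messages(text):
    """Return True if the configuration text sets loglevel to log_info."""
    return parse_sconf(text).get("loglevel") == "log_info"
```

On a host with the SGE client tools installed, the text could be obtained with `subprocess.run(["qconf", "-sconf"], capture_output=True, text=True, check=True).stdout` and passed to `logs_info_messages`.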

To get the most information, including any violated limit, the messages files
of the master and of all slave nodes must be checked for the specific job
number. Maybe you can trigger such a summary when you want to issue the bill
to the customer.
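
Such a summary could be sketched roughly as below. This is an illustration, not a tested billing tool: the function name `lines_for_job` is hypothetical, the list of messages files has to be supplied by the caller (their location depends on the local spool setup), and the code assumes that fields in a messages line are separated by `|` with the free-text message last. If a line contains no `|`, the whole line is searched.

```python
import re
from pathlib import Path


def lines_for_job(messages_files, job_id):
    """Collect (file, line) pairs whose message text mentions job_id."""
    # Match the job number only as a whole number, so 42 does not hit 142 or 420.
    pattern = re.compile(rf"(?<!\d){int(job_id)}(?!\d)")
    hits = []
    for path in messages_files:
        path = Path(path)
        if not path.is_file():
            continue  # a missing messages file is skipped silently
        with path.open(errors="replace") as handle:
            for line in handle:
                line = line.rstrip("\n")
                # Assumption: '|'-separated fields with the message text last,
                # so numbers inside the timestamp are not searched.
                message = line.rsplit("|", 1)[-1]
                if pattern.search(message):
                    hits.append((str(path), line))
    return hits
```

Calling it with the qmaster messages file plus one messages file per execution host gives all entries for one job in a single list, which can then be read to see whether a `qdel` by a user or admin, a limit violation, or a node problem was logged.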

-- Reuti


> I don't remember details, but see the man page.
> 
>> Only after figuring out which of the above scenarios led to the job
>> deletion will we be able to bill the customer correctly.
>> 
>> qacct -j <jobid> gives:
>> 1. failed: but it does not cover error codes for the above scenarios
>> 2. exit_status: how can it be used in this context?
> 
> Exit status depends on the job script/binary.
> _______________________________________________
> SGE-discuss mailing list
> [email protected]
> https://arc.liv.ac.uk/mailman/listinfo/sge-discuss


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
