On 01.03.2013 at 11:07, Dave Love wrote:

> Kshitiz B <[email protected]> writes:
>
>> How to distinguish between the following scenarios which lead to job
>> deletion:
>>
>> 1. Slave Node Failure
>> 2. Master/Shepherd Node Failure

You mean a crash of the node?

>> 3. Job deleted by User
>> 4. Job deleted by Admin
>
> Look in the qmaster messages file, or possibly via the reporting file --

A prerequisite might be to have the following set in the SGE configuration, so that `qdel` events show up:

$ qconf -sconf
...
loglevel                     log_info

To also get the most information about any violated limit, the messages files from the master and from the slave nodes must all be checked for the specific job number. Maybe you can initiate such a summarization when you want to issue the bill to the customer.

-- Reuti

> I don't remember details, but see the man page.
>
>> Only after figuring out which of the above scenarios led to the job
>> deletion will we be able to do the correct billing of the customer.
>>
>> qacct -j <jobid> gives:
>> 1. failed : but it does not cover error codes for above scenarios
>> 2. exit_status : how to use it in this relevance
>
> Exit status depends on the job script/binary.
> _______________________________________________
> SGE-discuss mailing list
> [email protected]
> https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
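
PS: The per-job check of the messages files mentioned above could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the function name `find_job_lines` is made up for illustration, the list of messages files (qmaster plus each execd spool) has to be supplied by you since the spool locations depend on the installation, and it simply matches the job number as a whole number without interpreting the message text.

```python
import re
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def find_job_lines(job_id: int, message_files: Iterable[Path]) -> Iterator[Tuple[Path, str]]:
    """Yield (file, line) for every log line mentioning job_id as a whole number.

    Hypothetical helper: message_files is whatever set of qmaster/execd
    messages files applies to your cluster; no paths are assumed here.
    """
    # Lookarounds keep job 123 from matching inside 1234 or 5123.
    pattern = re.compile(rf"(?<!\d){job_id}(?!\d)")
    for path in message_files:
        try:
            with open(path, errors="replace") as handle:
                for line in handle:
                    if pattern.search(line):
                        yield path, line.rstrip("\n")
        except OSError:
            # Missing or unreadable spool file: skip it instead of aborting.
            continue
```

Calling `list(find_job_lines(123, files))` with the collected files gives all candidate lines for job 123 in one place; deciding whether they point to a `qdel`, a violated limit, or a node crash is still a manual step.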
