On 01.03.2013 at 08:16, S Barve wrote:

> We are facing the same issue. Apparently, the signal sent by 'qdel' is
> SIGKILL regardless of whether jobs are terminated by users or by
> administrators.
>
> We tried a couple of things to distinguish between a job killed by a user
> using 'qdel' and a job killed by the administrator using 'qdel':
>
> 1) Change 'terminate_method' for the user's queue to "SIGTERM" and have the
> user submit a job with the '-notify' flag. However, the same signal is
> recorded in the job output file.
>
> 2) Change 'terminate_method' for the user's queue to point to a custom script
> for killing jobs. We try to catch the user id of the calling process (qdel)
> in the script. However, that user ID is reported as '0' whether the qdel
> command is invoked by the user or by the administrator.
>
> Is there a way to know which user has invoked the qdel command? That might
> help us figure out who killed the job.

The user who initiated the `qdel` is recorded in the qmaster's messages file at info level (set "loglevel log_info" in SGE's configuration so that it is logged):

03/01/2013 11:38:53|worker|pc15370|I|reuti has registered the job 5658 for deletion

-- Reuti

> Thanks and regards,
> Saurabh Barve
> Pune, Maharashtra
> India
> Mailto: [email protected]
> Website: http://www.tcs.com
>
> From: Kshitiz B <[email protected]>
> To: [email protected]
> Date: 03/01/2013 11:22 AM
> Subject: [gridengine users] Different Error Codes for Job Failure
> Sent by: [email protected]
>
> How to distinguish between the following scenarios which lead to job
> deletion:
>
> 1. Slave Node Failure
> 2. Master/Shepherd Node Failure
> 3. Job deleted by User
> 4. Job deleted by Admin
>
> Only after figuring out which of the above scenarios led to the job
> deletion will we be able to bill the customer correctly.
>
> qacct -j <jobid> gives:
> 1. failed : but it does not cover error codes for the above scenarios
> 2. exit_status : how to use it in this context

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
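
For billing purposes, the qmaster log entry quoted in Reuti's reply can be picked apart with a small script. The following is only a sketch: it assumes every line has the same pipe-separated layout as that one quoted entry (timestamp, thread, host, level, message) and that the message wording is identical on your installation; other SGE versions, or deletions of array tasks, may be worded differently. The function names are made up for illustration, and the location of the messages file depends on your spooling setup.

```python
import re
from datetime import datetime

# Message wording taken from the single log entry quoted in this thread.
# Assumption: other deletions on the same installation are worded the same way.
DELETION_RE = re.compile(
    r"^(?P<user>\S+) has registered the job (?P<job_id>\d+) for deletion"
)


def parse_deletion(line):
    """Return (timestamp, host, user, job_id) for a deletion entry, else None."""
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    stamp, _thread, host, level, message = parts
    if level != "I":  # the quoted entry is logged at info level
        return None
    match = DELETION_RE.match(message.strip())
    if match is None:
        return None
    # The quoted entry uses month/day/year.
    when = datetime.strptime(stamp.strip(), "%m/%d/%Y %H:%M:%S")
    return when, host, match.group("user"), int(match.group("job_id"))


def deletions_by_job(lines):
    """Map job id -> user who requested the deletion (the last entry wins)."""
    result = {}
    for line in lines:
        entry = parse_deletion(line)
        if entry is not None:
            _when, _host, user, job_id = entry
            result[job_id] = user
    return result


if __name__ == "__main__":
    sample = ("03/01/2013 11:38:53|worker|pc15370|I|"
              "reuti has registered the job 5658 for deletion")
    print(parse_deletion(sample))
```

Comparing the user name found this way with the job's owner (for example the `owner` field shown by `qacct -j <jobid>`) would then indicate whether the job was deleted by its owner or by someone else, such as an administrator. It does not help with the node-failure cases from the original question.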
