On 01.03.2013 at 08:16, S Barve wrote:

> We are facing the same issue. Apparently, the signal sent by 'qdel' is 
> SIGKILL regardless of whether jobs are terminated by users or by 
> administrators. 
> 
> We tried a couple of things to distinguish between a job killed by a user 
> using 'qdel' and a job killed by the administrator using 'qdel' : 
> 
> 1) Change 'terminate_method' for the user's queue to "SIGTERM" and have the 
> user submit a job with the '-notify' flag. However, the same signal is 
> recorded in the job output file. 
> 
> 2) Change 'terminate_method' for the user's queue to point to a custom script 
> for killing jobs. We try to catch the user id of the calling process (qdel) 
> in the script. However, that user ID is reported as '0' whether the qdel 
> command is invoked by the user or by the administrator. 
> 
> Is there a way to know which user has invoked the qdel command? That might 
> help us figure out who killed the job. 

The user who initiated the `qdel` is recorded in the qmaster's messages file 
at info level (set "loglevel log_info" in SGE's configuration to get these 
entries):

03/01/2013 11:38:53|worker|pc15370|I|reuti has registered the job 5658 for deletion

-- Reuti
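
To attribute deletions automatically, the messages file can be scanned for entries like the one above. The following is only a minimal sketch: the field layout (timestamp|thread|host|level|message) is inferred from that single sample line, and the function name `find_deletions` is hypothetical, not part of SGE.

```python
import re

# Matches info-level entries of the form seen in the sample log line.
# Assumption: fields are separated by "|" and the message wording is
# "<user> has registered the job <id> for deletion".
DELETION_RE = re.compile(
    r"^(?P<timestamp>[^|]+)\|(?P<thread>[^|]+)\|(?P<host>[^|]+)\|I\|"
    r"(?P<user>\S+)\s+has\s+registered\s+the\s+job\s+(?P<job_id>\d+)"
    r"\s+for\s+deletion"
)


def find_deletions(lines):
    """Return one dict per log line that records a job deletion request."""
    found = []
    for line in lines:
        match = DELETION_RE.match(line.strip())
        if match:
            found.append(
                {
                    "timestamp": match.group("timestamp"),
                    "host": match.group("host"),
                    "user": match.group("user"),
                    "job_id": int(match.group("job_id")),
                }
            )
    return found
```

It could be called with the open messages file, for example `find_deletions(open(path))`, where `path` points to the qmaster's messages file (its location depends on the installation). Comparing the reported user with the job owner then tells a user's own `qdel` apart from one issued by somebody else, such as an administrator.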


> Thanks and regards,
> Saurabh Barve
> Pune,Maharashtra
> India
> Mailto: [email protected]
> Website: http://www.tcs.com
> 
> 
> From: Kshitiz B <[email protected]>
> To:   [email protected]
> Date: 03/01/2013 11:22 AM
> Subject:      [gridengine users] Different Error Codes for Job Failure
> Sent by:      [email protected]
> 
> 
> 
> 
> How can we distinguish between the following scenarios, each of which leads 
> to job deletion: 
> 
> 1. Slave Node Failure
> 2. Master/Shepherd Node Failure
> 3. Job deleted by User
> 4. Job deleted by Admin
> 
> Only after figuring out which of the above scenarios led to the job deletion 
> will we be able to bill the customer correctly.
> 
> qacct -j <jobid> reports:
> 1. failed: but it does not provide error codes that cover the above scenarios
> 2. exit_status: how can it be used for this purpose?
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
> 

