Hi Joe:

As requested, I can confirm that a regular ‘qdel job-ID’ uses KILL (9). This 
is from the e-mail that is sent when ‘-m e’ is specified:


Job 5493527 (sleeper) Aborted
Exit Status      = 137
Signal           = KILL
User             = hughmac
Queue            = all.q@mysgehost00001
Host             = mysgehost00001
Start Time       = 01/23/2014 12:15:37
End Time         = 01/23/2014 12:15:45
CPU              = 00:00:00
Max vmem         = 204.609M
failed assumedly after job because:
job 5493527.1 died through signal KILL (9)
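
As a side note, the exit status of 137 in the report above is consistent with KILL: by the usual shell convention, a status above 128 means "terminated by signal (status - 128)", and 137 - 128 = 9. Here is a minimal Python sketch of that decoding. The helper name describe_exit_status is purely illustrative and not part of Grid Engine; signal names are resolved on whatever host runs the snippet.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Decode a shell-style exit status (hypothetical helper, not an SGE API)."""
    if status > 128:
        # Shell convention: 128 + N means the process died from signal N.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


print(describe_exit_status(137))
```

On a Linux host this should report signal 9, matching the "Signal = KILL" line of the report.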

From the qdel man page:

       -f     Force deletion of job(s). The job(s) are deleted from the list of
              jobs registered at sge_qmaster(8) even if the sge_execd(8)
              controlling the job(s) does not respond to the delete request
              sent by sge_qmaster(8).

So without ‘-f’, the master sends a request for a KILL to the execd, and the 
job isn’t removed from the master DB if that request fails (though maybe it is 
once the execd comes back up on the node?). With ‘-f’, the job IS removed from 
the master DB even if the kill request sent to the execd by the master fails. 
I find ‘-f’ useful if an exec node is down for a long period of time and I 
want to clean up the output of qstat (after notifying users of the demise of 
their job(s)).

Cheers, Hugh

From: [email protected] [mailto:[email protected]] On Behalf Of Joe Borg
Sent: Thursday, January 23, 2014 12:10 PM
To: [email protected]
Subject: [gridengine users] What signal does qdel send to the job?

I thought it would be 15 for qdel and 9 for qdel -f.
But it seems that it's always 9, as I can't catch whichever signal is being 
sent.

Can someone confirm, please?


Regards,
Joseph David Borġ
http://josephb.org
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
