I'm running this on a development cluster, where I'm testing h_rt limits
and job status email notifications.

Job 187 (mpirun) Aborted
Exit Status      = 0
Signal           = KILL
User             = lanew
Queue            = short.q@cscld1-0-2
Host             = cscld1-0-2.local
Start Time       = 09/22/2015 23:18:58
End Time         = 09/22/2015 23:19:00
CPU              = 00:00:00
Max vmem         = 10.229M
failed assumedly after job because:
job 187.1 died through signal KILL (9)

I had thought an exit status of 0 indicated normal termination. Why is it
reported for a job that was killed by SIGKILL?

From the accounting file man page:

          "For example: If a job dies through signal  9  (SIGKILL)
          then the exit status becomes 128 + 9 = 137."
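
To make the arithmetic in that quote concrete, here is a minimal sketch that decodes an exit status using the 128 + signal number convention. The function name describe_exit_status is hypothetical and not part of Grid Engine; it only illustrates what the man page describes and does not explain why the mail above reports 0.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the 128 + signal-number convention.

    Hypothetical helper for illustration only; not a Grid Engine API.
    """
    if status == 0:
        return "exit status 0 (normal termination)"
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name, e.g. SIGKILL for 9 on Linux.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited with error code {status}"


if __name__ == "__main__":
    # The man page example: SIGKILL (9) gives 128 + 9 = 137.
    print(describe_exit_status(128 + 9))
    # The status reported in the job mail above.
    print(describe_exit_status(0))
```

By this convention, a job killed with SIGKILL would be expected to show 137 rather than 0, which is the discrepancy I'm asking about.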
