I'm running this on a development cluster, testing h_rt limits and the job status email functionality.

Job 187 (mpirun) Aborted
 Exit Status      = 0
 Signal           = KILL
 User             = lanew
 Queue            = short.q@cscld1-0-2
 Host             = cscld1-0-2.local
 Start Time       = 09/22/2015 23:18:58
 End Time         = 09/22/2015 23:19:00
 CPU              = 00:00:00
 Max vmem         = 10.229M
failed assumedly after job because:
job 187.1 died through signal KILL (9)

I had thought an exit status of 0 indicates normal termination?

From the accounting file man page:

"For example: If a job dies through signal 9 (SIGKILL) then the exit status becomes 128 + 9 = 137."
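
To make the expectation concrete, here is a minimal Python sketch of the 128 + signal convention quoted from the man page. The function names are hypothetical and not part of Grid Engine; the sketch only illustrates the arithmetic, and it does not explain why the mail above reports 0.

```python
import signal

SIGKILL_NUMBER = 9  # signal number reported in the job mail above


def exit_status_for_signal(signum: int) -> int:
    """Exit status expected under the 128 + signal convention."""
    return 128 + signum


def describe_exit_status(status: int) -> str:
    """Interpret an exit status under the same convention (hypothetical helper)."""
    if status > 128:
        signum = status - 128
        try:
            # Look up the symbolic name where the platform defines one.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name} ({signum})"
    return f"exited with status {status}"


print(exit_status_for_signal(SIGKILL_NUMBER))  # 137
print(describe_exit_status(137))
print(describe_exit_status(0))  # exited with status 0
```

By that convention I would have expected the mail to show Exit Status = 137 rather than 0.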