Regarding exit status 140: I read that somewhere on the internet, so excuse
me if it is wrong. May I get more details about this exit status and why the
job was killed with signal 12? Actually there is nothing relevant in
/default/spool/`hostname`/messages; I found the messages only in
qmaster/messages.

I found only one message in default/spool/`hostname`/messages:

starting up SGE 6.2u5 (lx24-amd64)

Regards
PVK
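
The arithmetic quoted further down in this thread (128 + 12 = SIGUSR2)
follows the common shell convention of reporting death by signal as 128 plus
the signal number. Below is a minimal Python sketch of that decoding. It
assumes the exit_status shown by qacct follows this convention and that
signal numbers are those of the machine running the script (12 is SIGUSR2 on
Linux x86; other platforms may number it differently). The function name
decode_exit_status is made up for illustration.

```python
import signal


def decode_exit_status(status):
    """Split a shell-style exit status into (kind, value, signal_name).

    Assumption: values above 128 mean "killed by signal (status - 128)".
    """
    if status > 128:
        signum = status - 128
        try:
            # Look up the local name for this signal number, if there is one.
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return ("signal", signum, name)
    # Anything else is treated as an ordinary exit code.
    return ("exit", status, None)


# Exit status 140 from the qacct output decodes to signal number 12.
print(decode_exit_status(140))
```

The signal name in the result depends on the platform, so only the number
should be treated as portable.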

On Thu, Sep 27, 2012 at 11:50 PM, Reuti <[email protected]> wrote:

> On 27.09.2012 at 19:41, Vamsi Krishna wrote:
>
> Those were inputs for debugging.
>
> job 1058200.1 failed on host  assumedly after job because: job 1058200.1
> died through signal USR2 (12)
>
> 09/26/2012 17:47:02|worker|E|denied: job "1058200" does not exist
>
>
> 50 out of 80 batch jobs were killed in a similar way, and one of the jobs
> in the queue was also killed. Does qmaster need a reboot?
>
>
> On Thu, Sep 27, 2012 at 9:39 PM, Reuti <[email protected]> wrote:
>
>> On 26.09.2012 at 13:48, Vamsi Krishna wrote:
>>
>> *Exit code 140:* The job exceeded the "wall clock" time limit. h_rt is
>> set to infinity.
>>
>>
> Who stated that exit code 140 is "wall clock" exceeded and nothing else?
> Did you verify it in the messages file of the shepherd on the node's
> spooling directory?
>
> -- Reuti
>
>
> submit with -notify by default.
>>
>>
>> Is this a statement or a question? With -notify there can be several
>> reasons for SIGUSR2, such as an exceeded memory limit, or it can simply be
>> the warning sent because someone killed the job with `qdel`.
>>
>> How can it run into h_rt when it's set to infinity?
>>
>> -- Reuti
>>
>>
>>
>> --PVK
>>
>> On Wed, Sep 26, 2012 at 12:46 PM, Reuti <[email protected]> wrote:
>>
>>> On 26.09.2012 at 08:53, Vamsi Krishna wrote:
>>>
>>> > Some of the batch jobs are killed, and qacct -j for the job ID shows:
>>> >
>>> > failed       100 : assumedly after job
>>> > exit_status  140
>>>
>>> It's 128 + 12 = SIGUSR2. So what can cause this signal to be generated?
>>>
>>> Something in your job?
>>>
>>> You submit with -notify?
>>>
>>> -- Reuti
>>>
>>>
>>> >
>>> >
>>> > What could be the reason?
>>> >
>>> > Regards
>>> > PVK
>>> >
>>> > _______________________________________________
>>> > users mailing list
>>> > [email protected]
>>> > https://gridengine.org/mailman/listinfo/users
>>>
>>>
>>
>>
>
>