On 28.08.2013 at 11:00, Guillermo Marco Puche wrote:

> On 08/28/2013 10:57 AM, Reuti wrote:
>> Hi,
>> 
>> On 28.08.2013 at 10:40, Guillermo Marco Puche wrote:
>> 
>> 
>>> I've been experiencing some weird behavior with Picard tools (a Java-based
>>> bioinformatics tool).
>>> 
>>>     • Job starts running
>>>     • Job gets to T status (threshold)
>>> 
>> Do you mean the process state "T" or the SGE job state "T"?
> T in SGE.

OK, so the suspend threshold was triggered because the load was too high?


>> 
>> 
>> 
>>>     • Job comes back to R status.
>>> 
>> Do you use any checkpointing interface to restart the job? If so, `qstat`
>> should show "Rr" instead of a plain "R" for the SGE job state.
>> 
>> 
> No, I don't use any checkpointing interface.

Then the state should be "r".

In summary: the processes are suspended correctly (and also show state "T" 
in `ps -e f`), but after the `kill -CONT ...` that should wake them up they 
end up sleeping?
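
For reference, a minimal sketch of that check outside of SGE. It assumes Linux with /proc, and it assumes the queue's suspend_method/resume_method are left at their defaults, so that suspend and resume amount to SIGSTOP and SIGCONT. The helper names `process_state`, `wait_for_state` and `suspend_resume_states` are made up for illustration, and a sleeping child stands in for the real job:

```python
import os
import signal
import subprocess
import sys
import time


def process_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as fh:
        stat = fh.read()
    # The command name is wrapped in parentheses and may contain spaces,
    # so take the first field that follows the last ')'.
    return stat.rsplit(")", 1)[1].split()[0]


def wait_for_state(pid, wanted, timeout=5.0):
    """Poll until the state is one of the letters in `wanted`, or give up."""
    deadline = time.monotonic() + timeout
    state = process_state(pid)
    while state not in wanted and time.monotonic() < deadline:
        time.sleep(0.05)
        state = process_state(pid)
    return state


def suspend_resume_states():
    """Stop and continue a dummy child; return its state after each signal."""
    # Hypothetical stand-in for the job: a child that only sleeps.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)          # suspend
        stopped = wait_for_state(child.pid, "T")
        os.kill(child.pid, signal.SIGCONT)          # resume
        resumed = wait_for_state(child.pid, "SR")
        return stopped, resumed
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(suspend_resume_states())
```

A child that is merely waiting reports "S" again after SIGCONT, so "S" on its own does not show that the resume failed. It would have to be combined with a check of whether CPU time or output still advances.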

-- Reuti


>>>     • Job stays in R status forever. The processes stay on the compute
>>> node without using any resources.
>>> 
>> In the list I see only "S" states.
>> 
>> -- Reuti
>> 
>> NB: It might help to run these suspend-sensitive jobs with a nice value 
>> of 19 ("priority 19" in the queue configuration), and normal jobs as 
>> usual at 0.
>> 
>> 
>> 
>>> You can see the actual processes on the compute node in the following 
>>> picture. As I said, they seem to do nothing; they just stay there.
>>> 
>>> 
>>> http://imm.io/1gmXT
>>> 
>>> 
>>> Thank you.
>>> 
>>> Best regards,
>>> Guillermo.
>>> On 07/29/2013 01:35 PM, Reuti wrote:
>>> 
>>>> Hi,
>>>> 
>>>> On 29.07.2013 at 13:07, Guillermo Marco Puche wrote:
>>>> 
>>>> 
>>>> 
>>>>> I have set up some subscription rules so that the load on the cluster's 
>>>>> compute nodes is balanced. As a result, Grid Engine puts some jobs into 
>>>>> the T state when a compute node exceeds the load rule.
>>>>> 
>>>>> The problem is that I have some Perl scripts that use a MySQL connection; 
>>>>> after resuming from the T state they die because they have lost the 
>>>>> connection to MySQL.
>>>>> 
>>>>> The question is: is there any way to exclude a job by name from this 
>>>>> rule, so that it never enters the T state and dies after resuming?
>>>>> 
>>>>> 
>>>> Unfortunately, no.
>>>> 
>>>> Nevertheless I saw the need for some kind of "suspensible y/n" flag for a 
>>>> submitted job too:
>>>> 
>>>> 
>>>> 
>>>> https://arc.liv.ac.uk/trac/SGE/ticket/735
>>>> 
>>>> 
>>>> 
>>>> For your situation it could help to have a dedicated queue for these Perl 
>>>> scripts only, which will never get suspended.
>>>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> Thank you.
>>>>> 
>>>>> Best regards,
>>>>> Guillermo.
>>>>> -- 
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> 
>>>>> 
>>>>> [email protected]
>>>>> https://gridengine.org/mailman/listinfo/users
> 
> 
> -- 
> Guillermo Marco Puche
> 
> Bioinformatician, Computer Science Engineer.
> Sistemas Genómicos S.L.
> Phone: +34 902 364 669
> Fax: +34 902 364 670
> www.sistemasgenomicos.com
> 


