Hi,

On 28.08.2013 at 10:40, Guillermo Marco Puche wrote:

> I've been experiencing some weird behavior with Picard tools (a Java-based 
> bioinformatics tool). 
> 
>       • Job starts running
>       • Job gets to T status (threshold)

Do you mean the process state "T" or the SGE job state "T"?
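
To tell the two apart, the process state can be checked directly on the compute node. The following is only a minimal sketch, assuming a Linux node with /proc mounted; the helper names `parse_state` and `process_state` are made up for illustration and are not part of SGE. It reads the state letter that `ps` and `top` also display:

```python
from pathlib import Path

# A few common Linux process state letters, as documented in proc(5).
STATE_NAMES = {
    "R": "running",
    "S": "sleeping (interruptible)",
    "D": "disk sleep (uninterruptible)",
    "T": "stopped by a signal (e.g. SIGSTOP)",
    "Z": "zombie",
}


def parse_state(stat_line: str) -> str:
    """Return the state letter from the contents of /proc/<pid>/stat."""
    # The second field (the command name) is wrapped in parentheses and may
    # itself contain spaces or ')', so split after the LAST ')'.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the state letter of a live process (hypothetical helper)."""
    return parse_state(Path(f"/proc/{pid}/stat").read_text())
```

A process stopped with SIGSTOP reports "T" here, independent of whatever job state `qstat` prints for the job.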


>       • Job comes back to R status.

Do you use any checkpointing interface to restart the job? If so, `qstat` should 
show "Rr" instead of a plain "R" as the SGE job state.


>       • Job stays in R status forever. The processes stay on compute node 
> without using resources.

In the list I see only "S" states.

-- Reuti

NB: It might help to run these "suspend-sensitive" jobs with a nice value of 19 
("priority 19" in the queue configuration), and normal jobs as usual at 0.
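
To illustrate what a nice value of 19 means at the operating-system level: the sketch below starts a command at the lowest scheduling priority by hand. It is only an illustration under the assumption of a Unix-like node and Python 3; `run_niced` is a hypothetical helper, and with SGE the queue's "priority" setting applies the nice value for you, so nothing like this is needed inside the job script.

```python
import os
import subprocess


def run_niced(cmd, niceness=19):
    """Run cmd with the given nice value and return the CompletedProcess.

    Hypothetical helper for illustration only; raising the nice value
    (lowering the priority) needs no special privileges.
    """
    def lower_priority():
        # Executed in the child between fork() and exec(): set the nice
        # value of the child process itself (who=0 means "this process").
        os.setpriority(os.PRIO_PROCESS, 0, niceness)

    return subprocess.run(
        cmd,
        preexec_fn=lower_priority,
        capture_output=True,
        text=True,
        check=True,
    )
```

A process running at nice 19 only gets CPU time that processes at nice 0 leave unused, so it should contribute less to pushing the other jobs on the node aside.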


> You can see the actual processes on the compute node in the following 
> picture. As I said, they seem to do nothing; they just stay there.
> 
> http://imm.io/1gmXT
> 
> Thank you.
> 
> Best regards,
> Guillermo.
> On 07/29/2013 01:35 PM, Reuti wrote:
>> Hi,
>> 
>> On 29.07.2013 at 13:07, Guillermo Marco Puche wrote:
>> 
>> 
>>> I have set some subscribing rules so that the cluster compute nodes are 
>>> load balanced. This way, Grid Engine puts some jobs into a T state when a 
>>> compute node exceeds the load rule.
>>> 
>>> The problem is that I have some Perl scripts that use a MySQL connection. 
>>> After resuming from a T state they die because they lose the connection 
>>> to MySQL. 
>>> 
>>> The question is: is there any way to exclude a job by name from this rule, 
>>> so that it will never enter T status and die after resuming?
>>> 
>> Unfortunately no.
>> 
>> Nevertheless I saw the need for some kind of "suspensible y/n" flag for a 
>> submitted job too:
>> 
>> 
>> https://arc.liv.ac.uk/trac/SGE/ticket/735
>> 
>> 
>> For your situation it could help to have a dedicated queue for these Perl 
>> scripts only, which will never get suspended.
>> 
>> -- Reuti
>> 
>> 
>> 
>>> Thank you.
>>> 
>>> Best regards,
>>> Guillermo.
>>> -- 
>>> _______________________________________________
>>> users mailing list
>>> 
>>> [email protected]
>>> https://gridengine.org/mailman/listinfo/users


