On 02.11.2012 at 15:56, William Hay wrote:

> I submitted an array job with -r y.  One of the tasks was transferring to a 
> node (state t) when that node went down but despite 
> max_unheard+reschedule_unknown being exceeded neither that task nor another 
> task on the same node was rescheduled.  A manual qmod -rq seems to work but 
> just working would be better.

But if the node crashes while all jobs are in state "r", is it working for you?
And there was no checkpointing environment in the way?

The array task was still shown in state "t" all the time?


> Is this a known problem?

It's hard to provoke.

- Reuti
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
