On 26 February 2013 13:23, Arnau Bria <[email protected]> wrote:
> On Tue, 26 Feb 2013 11:52:34 +0100
> Reuti Reuti wrote:
>
> Hi,
>
> [...]
>
>> Well, it's there but there is no reservation which can be backfilled.
>> As soon as one of the waiting jobs can start it will start. This
>> leads to the observed behavior that some parallel jobs (or a job with
>> a huge memory request) will die of starvation.
>>
>> And to allow backfilling to work correctly, it's best to submit all
>> jobs with an estimated runtime. If the default runtime is used and
>> overdrawn, all computed reservations might become worthless.
>
> Ok, I think I understand some more of the concepts, but I'm still
> seeing the original problem.
>
> I've added a default queue time:
>
> s_rt                  168:00:00
> h_rt                  172:00:00
>
> so now backfilling could work
>
> I submit a parallel job with "-R y" and many single jobs without
> -R (default is n), but the job never starts...
>
> I cannot see whether any reservation is there, because the schedule
> file is empty (yesterday I set MONITOR=1 in the sched conf), so maybe
> it's not working?
>
>
> ** so, as you can see, I'm still missing some conf (or not understanding
> the problem/solution pair).
>
Did you add a default_duration in sched_conf, as Reuti suggested
earlier (i.e. using qconf -msconf)?  I believe the s_rt/h_rt settings in
the queue configuration are enforced but not used for prediction.
Alternatively, setting s_rt or h_rt in the global sge_request file should work.
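
As a rough sketch of the two options above (the values are placeholders
I made up, not recommendations; 168:00:00 simply mirrors the s_rt you
quoted). I am also assuming max_reservation is non-zero, since with the
default of 0 no reservations are made at all:

```
# Excerpt of the scheduler configuration (edit with: qconf -msconf)
default_duration     168:00:00
max_reservation      32
params               MONITOR=1

# Alternative: a site-wide default request, one line in
# $SGE_ROOT/$SGE_CELL/common/sge_request
-l h_rt=168:00:00
```

Jobs that pass their own -l h_rt on the command line should override
the sge_request default, so shorter jobs can still be backfilled.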

William

>
>> -- Reuti
> Cheers,
> Arnau
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
>
>