On 26 February 2013 13:23, Arnau Bria <[email protected]> wrote:
> On Tue, 26 Feb 2013 11:52:34 +0100
> Reuti Reuti wrote:
>
> Hi,
>
> [...]
>
>> Well, it's there but there is no reservation which can be backfilled.
>> As soon as one of the waiting jobs can start it will start. This
>> leads to the observed behavior that some parallel jobs (or a job with
>> a huge memory request) will die of starvation.
>>
>> And to allow backfilling to work correctly, it's best to submit all
>> jobs with an estimated runtime. If the default runtime is used and
>> overdrawn all computed reservation might become worthless.
>
> Ok, I think I'm understanding some more concepts but still seeing the
> original problem.
>
> I've added a default queue time:
>
> s_rt 168:00:00
> h_rt 172:00:00
>
> so know backfilling could work
>
> I submit a parallel job with "-R y" and many single jobs without
> -R (default is n), but the job never starts...
>
> I cannot see if any reservation if there cause schedule file is empty
> (yesterday I changed MONITOR=1 in sched conf), so, maybe it's not
> working?
>
>
> ** so, as you can see, I'm still missing some conf (or not understanding
> the problem/solution pair).
>

Did you add a default_duration in sched_conf as Reuti suggested earlier
(i.e. using qconf -msconf)? I believe the settings in the queue
configuration are enforced but not used for prediction. Alternatively,
setting s_rt or h_rt in the global sge_request file should work.
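
As a rough sketch of the two places meant here (the values are only placeholders, not recommendations, and the file location assumes a standard installation):

```
# Scheduler configuration (view with "qconf -ssconf", edit with "qconf -msconf").
# default_duration is the runtime assumed for jobs that request no h_rt/s_rt;
# 24:00:00 is a placeholder value.
default_duration    24:00:00
# Reservations are only made when max_reservation is greater than 0;
# 32 is likewise a placeholder.
max_reservation     32
params              MONITOR=1

# Alternative: a default request in the cluster-wide sge_request file
# (normally $SGE_ROOT/$SGE_CELL/common/sge_request), placeholder value again:
-l h_rt=24:00:00
```

With MONITOR=1 set, reservations should then appear as RESERVING entries in $SGE_ROOT/$SGE_CELL/common/schedule, if the scheduler is actually making any.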

William

> >> -- Reuti
> Cheers,
> Arnau

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
