On 22.02.2011 at 15:51, Andreas Haupt wrote:

> Hi Reuti and Richard,
>
> hmm, actually the scheduler configuration should be correct for such a
> setup. But maybe I just can't see the wood for the trees ...
>
> [oreade38] ~ % qconf -ssconf
> algorithm                         default
> schedule_interval                 0:0:1

Doesn't this put a high load on the qmaster? Especially when you also have a low value for ...

> maxujobs                          0
> queue_sort_method                 load
> job_load_adjustments              np_load_avg=1.0
> load_adjustment_decay_time        0:7:30
> load_formula                      np_load_avg
> schedd_job_info                   true
> flush_submit_sec                  1
> flush_finish_sec                  1

... the flush settings. I think the defaults for max scheduling are 0:20, with the flush settings set to 4.

> params                            none
> reprioritize_interval             0:0:0
> halftime                          24
> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
> compensation_factor               5.000000
> weight_user                       1.000000
> weight_project                    0.000000
> weight_department                 0.000000
> weight_job                        1.000000
> weight_tickets_functional         1000
> weight_tickets_share              10000
> share_override_tickets            FALSE
> share_functional_shares           TRUE
> max_functional_jobs_to_schedule   1000
> report_pjob_tickets               TRUE
> max_pending_tasks_per_job         50
> halflife_decay_list               none
> policy_hierarchy                  FS
> weight_ticket                     0.500000
> weight_waiting_time               0.000000
> weight_deadline                   3600000.000000
> weight_urgency                    0.000000
> weight_priority                   1.000000
> max_reservation                   250
> default_duration                  9999:00:00

These settings look fine.

> Do you see a common mistake here? There are < 100 waiting jobs in the
> queue most of the time.

If you know that there are always waiting jobs, flush_submit_sec could even be higher, as there are most likely no free slots anyway. But this shouldn't influence the odd behavior you observe. Does it work with smaller parallel jobs that are waiting to get their slots?

To investigate, you could also try to submit an advance reservation for some point in the future (unfortunately there is no option to `qrsub` to request it without a given start time [and I don't mean "now" here], but the output tells you the earliest time at which it could be granted). Is such a reservation granted in your case?
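
In case it helps to compare the timing-related settings at a glance, here is a rough Python sketch that parses `qconf -ssconf`-style output and converts the interval to seconds. The sample text is copied from the configuration quoted in this thread; the function names (`parse_ssconf`, `hms_to_seconds`, `timing_summary`) are only illustrative and not part of Grid Engine.

```python
# Sketch only: summarize the timing parameters of a scheduler configuration.
# SAMPLE holds a few lines from the `qconf -ssconf` output quoted in this thread.

SAMPLE = """\
algorithm                         default
schedule_interval                 0:0:1
load_adjustment_decay_time        0:7:30
flush_submit_sec                  1
flush_finish_sec                  1
max_reservation                   250
default_duration                  9999:00:00
"""


def parse_ssconf(text):
    """Return {parameter: value} from 'name value' lines."""
    conf = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def hms_to_seconds(value):
    """Convert an 'h:m:s' string such as '0:7:30' to seconds."""
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def timing_summary(conf):
    """Collect the settings that determine how often a scheduling run is triggered."""
    return {
        "schedule_interval_s": hms_to_seconds(conf["schedule_interval"]),
        "flush_submit_sec": int(conf["flush_submit_sec"]),
        "flush_finish_sec": int(conf["flush_finish_sec"]),
    }


if __name__ == "__main__":
    print(timing_summary(parse_ssconf(SAMPLE)))
```

On a live cluster the same functions could be fed the real output of `qconf -ssconf` instead of the sample text; which values count as "too low" depends on the cluster size and is left to the reader.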

-- Reuti

> Thanks,
> Andreas
>
> On Tue, 2011-02-22 at 15:38 +0100, Richard Ems wrote:
>> On 02/22/2011 03:07 PM, Andreas Haupt wrote:
>>> Do you see a similar behaviour? Is it a misconfiguration? Anything I
>>> could do (apart from watching the queue regularly and schedule "by
>>> hand" ...)?
>>
>> We use the same GE version and have a "similar" configuration, but we
>> don't start parallel jobs on that many slots.
>>
>> Could it be that max_reservation is set too low?
>>
>> Regards, Richard
>>
> --
> | Andreas Haupt | E-Mail: [email protected]
> | DESY Zeuthen | WWW: http://www-zeuthen.desy.de/~ahaupt
> | Platanenallee 6 | Phone: +49/33762/7-7359
> | D-15738 Zeuthen | Fax: +49/33762/7-7216
>
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
