On Thu, 16 Aug 2012 at 12:07 -0000, Brian Smith wrote:
> {
> name host_slotcap
> description make sure only the right number of slots get used
> enabled TRUE
> limit queues * hosts {*} to slots=$num_proc
> }
I used to have a rule similar to this (without the 'queues *'
clause). I found that disabling the rule improved my scheduling
performance by a huge amount (several minutes became a few seconds).
You might try disabling this rule briefly and see if your scheduling
performance changes.
I'm still using 6.2u5; it is possible that this has been fixed in
other versions.
I have queues defined to provide small, medium and large jobs (based
upon run time). Limits on the queues determine which jobs will run on
which queue. The significant definitions are:
% qconf -sq small
qname small
hostlist @small
seq_no 20
s_rt 4:00:00
h_rt 4:00:00
%
% qconf -sq medium
qname medium
hostlist @medium
seq_no 30
s_rt 48:00:00
h_rt 48:00:00
%
% qconf -ssconf
queue_sort_method load
load_formula seq_no*100+m_core-slots
default_duration 48:00:00
%
seq_no controls the order in which the queues are searched. qmaster
will search until it finds a queue which can run the job, so the most
restrictive queues should come first.
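To make the search order concrete, here is a minimal Python sketch of
the behaviour described above. It is only a model, not Grid Engine
code: the queue names, seq_no and h_rt values are copied from the
qconf output above, while to_seconds() and pick_queue() are
hypothetical helper names. The "large" queue is left out because its
settings are not shown here.

```python
from typing import Optional

def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string such as '4:00:00' to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

# (qname, seq_no, h_rt) as listed in the qconf -sq output.
# Deliberately stored out of order to show that seq_no decides the search.
QUEUES = [
    ("medium", 30, "48:00:00"),
    ("small", 20, "4:00:00"),
]

def pick_queue(requested_h_rt: str) -> Optional[str]:
    """Return the first queue, in seq_no order, whose h_rt covers the request."""
    need = to_seconds(requested_h_rt)
    for qname, _seq_no, h_rt in sorted(QUEUES, key=lambda q: q[1]):
        if need <= to_seconds(h_rt):
            return qname
    return None  # no queue can run a job this long

print(pick_queue("2:00:00"))   # small
print(pick_queue("12:00:00"))  # medium
```

If the queue with the longer limit were searched first, the 2 hour job
would land there too, which is why the most restrictive queue gets the
lowest seq_no.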
The size of the different queues is controlled by putting hosts in
different host groups. Hosts become dedicated to jobs of a particular
size (or smaller). This does allow "small" jobs to run on hosts for
"large" jobs. In our case this is acceptable since the smaller jobs
will finish in a reasonable amount of time if large jobs are queued.
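The "or smaller" part follows from the run time limits: a short job
also fits under the limit of every longer queue. A rough Python
sketch of that, assuming the host groups and h_rt values from the
qconf output above (eligible_hostgroups() is a hypothetical name, and
the "large" queue is omitted because its settings are not shown):

```python
from typing import List

def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string such as '48:00:00' to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

# (hostlist, seq_no, h_rt) from the qconf -sq output.
QUEUES = [
    ("@small", 20, "4:00:00"),
    ("@medium", 30, "48:00:00"),
]

def eligible_hostgroups(requested_h_rt: str) -> List[str]:
    """List every host group whose queue limit covers the request, in seq_no order."""
    need = to_seconds(requested_h_rt)
    ordered = sorted(QUEUES, key=lambda q: q[1])
    return [hostlist for hostlist, _seq_no, h_rt in ordered
            if need <= to_seconds(h_rt)]

print(eligible_hostgroups("2:00:00"))   # ['@small', '@medium']
print(eligible_hostgroups("12:00:00"))  # ['@medium']
```

A 2 hour job can spill over onto the @medium hosts, while a 12 hour
job can never use the @small hosts.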
No JSV is required and users should not specify the queue, just the
run time limit.
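As an illustration of what such a submission might look like, this
small Python sketch builds a qsub command line that requests only
h_rt and passes no -q option. The script name job.sh and the helper
qsub_argv() are made up for the example; whether users request h_rt,
s_rt or both is a local choice.

```python
from typing import List

def qsub_argv(script: str, h_rt: str) -> List[str]:
    """Build a qsub command that requests a run time limit and nothing else."""
    # No -q option is added, so the scheduler is left to choose the queue.
    return ["qsub", "-l", f"h_rt={h_rt}", script]

print(" ".join(qsub_argv("job.sh", "2:00:00")))  # qsub -l h_rt=2:00:00 job.sh
```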
We can adjust the run time limits and number of hosts in each host
group over time to best match the workload.
Stuart
--
I've never been lost; I was once bewildered for three days, but never lost!
-- Daniel Boone