On Thu, 16 Aug 2012 at 12:07 -0000, Brian Smith wrote:

> {
>    name         host_slotcap
>    description  make sure only the right number of slots get used
>    enabled      TRUE
>    limit        queues * hosts {*} to slots=$num_proc
> }

I used to have a rule similar to this (without the 'queues *'
clause).  I found that disabling the rule improved my scheduling
performance enormously (several minutes became a few seconds).
You might try disabling this rule briefly and see if your scheduling
performance changes.

I'm still using 6.2u5; it is possible that bugs have been fixed in
other versions.

I have queues defined to provide small, medium and large jobs (based
upon run time).  Limits on the queues determine which jobs will run on
which queue.  The significant definitions are:

% qconf -sq small
qname                 small
hostlist              @small
seq_no                20
s_rt                  4:00:00
h_rt                  4:00:00
%

% qconf -sq medium
qname                 medium
hostlist              @medium
seq_no                30
s_rt                  48:00:00
h_rt                  48:00:00
%

% qconf -ssconf
queue_sort_method                 load
load_formula                      seq_no*100+m_core-slots
default_duration                  48:00:00
%

seq_no controls the order in which the queues are searched.  qmaster
will search until it finds a queue which can run the job, so the most
limiting queues should be first.

The size of the different queues is controlled by putting hosts in
different host groups.  Hosts become dedicated to jobs of a particular
size (or smaller).  This does allow "small" jobs to run on hosts for
"large" jobs.  In our case this is acceptable, since the smaller jobs
will finish in a reasonable amount of time if large jobs are queued.
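
As a rough illustration of the search order and fall-through described
above, here is a minimal Python sketch.  It is a simplified model, not
Grid Engine's scheduler: it ignores slots, load values and everything
else, and only checks the requested run time against each queue's h_rt
in seq_no order.  The function names (parse_hms, eligible_queues) are
made up for this example; the queue values are the ones shown above.

```python
def parse_hms(text):
    """Convert an 'H:MM:SS' string such as '48:00:00' to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Values copied from the 'qconf -sq' output above.
QUEUES = [
    {"qname": "medium", "seq_no": 30, "h_rt": "48:00:00"},
    {"qname": "small", "seq_no": 20, "h_rt": "4:00:00"},
]


def eligible_queues(requested_h_rt, queues=QUEUES):
    """Return the names of queues that could run the job, in search order.

    Simplified model: a queue qualifies when the requested run time
    does not exceed the queue's h_rt limit.  Queues are visited in
    ascending seq_no, so the most limiting queue comes first.
    """
    wanted = parse_hms(requested_h_rt)
    ordered = sorted(queues, key=lambda queue: queue["seq_no"])
    return [q["qname"] for q in ordered if wanted <= parse_hms(q["h_rt"])]


if __name__ == "__main__":
    for request in ("2:00:00", "10:00:00", "72:00:00"):
        print(request, eligible_queues(request))
```

In this model a two-hour job qualifies for "small" first and "medium"
second, which is the fall-through onto hosts meant for longer jobs; a
ten-hour job qualifies only for "medium".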

No JSV is required, and users should not specify the queue, just the
run time limit.
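
For example, a submission under this scheme might request only h_rt
and leave out any -q option.  The small Python helper below just
builds such a qsub argument list; the helper name and the script name
job.sh are hypothetical, and nothing is actually submitted.

```python
import re


def build_qsub_command(script, h_rt):
    """Build a qsub argument list that requests only a run time limit.

    No '-q' option is added: the scheduler is left to choose the queue
    from the h_rt request.  'script' is a placeholder job script path.
    """
    if not re.fullmatch(r"\d+:[0-5]\d:[0-5]\d", h_rt):
        raise ValueError("h_rt must look like H:MM:SS, got %r" % h_rt)
    return ["qsub", "-l", "h_rt=" + h_rt, script]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(build_qsub_command("job.sh", "2:00:00")))
```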

We can adjust the run time limits and number of hosts in each host
group over time to best match the workload.

Stuart
-- 
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone