On 04.06.2014 at 17:11, William Hay wrote:

> On Wed, 4 Jun 2014 07:56:14 +0000
> "Thomas Achmann (PDF)" <[email protected]> wrote:
> 
>> Hi,
>> 
>> on our SGE-8.1.6 cluster we have implemented fair-share usage with a
>> functional policy configuration.
>> The only limitation is that we allow a maximum of four jobs to be
>> dispatched concurrently, due to license restrictions of our tool.
>> 
>> Fair-share works perfectly fine as long as there are only 4 users
>> submitting jobs.
>> 
>> It looks like this policy fails as soon as there are more than 4
>> users submitting jobs.
>> Any new user's job (e.g. the 5th user's job) is kept waiting until
>> all previously submitted jobs of one of the first four users have
>> completely finished.
> From the config you quote it looks like you are using functional share
> rather than fair share.  Functional share doesn't take past usage into
> account but sorts jobs from users of equal current usage into
> submission order.  A job from a user whose sole job has just finished
> will therefore have higher priority than any job submitted after it.

Maybe it's a matter of definition: to me, fair share means that all users get
the same amount of CPU time in the cluster at any given point in time. Even
when the available cores are oversubscribed, SGE can still achieve this goal
by adjusting the nice values of the running jobs (by setting
"reprioritize_interval" in the scheduler configuration).
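
As a quick sketch of what I mean (untested here, and the details may depend on
the GE version): besides "reprioritize_interval" in the scheduler
configuration, there is a "reprioritize" switch in the global cluster
configuration which has to be enabled before the nice values are touched at
all, e.g.:

qconf -mconf                            (global configuration)
reprioritize                      true

qconf -msconf                           (scheduler configuration)
reprioritize_interval             0:2:0

The two minute interval is just an example value.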

And as you pointed out: disregarding any past usage.


> I suggest you give some small weight to fair share
> to help light users.  This would be done by setting weight_tickets_share
> to some positive value.

This will then cause the share tree policy, i.e. past usage, to be taken into
account. But I spot something else too:

weight_job

shouldn't be zero here. For me, the four weight_* entries for the functional
policy are all set to 0.25 by default.
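
To illustrate (just a sketch, the concrete numbers are only examples and have
to be tuned for your site), the relevant part of "qconf -msconf" could then
look like:

weight_tickets_functional         1000000
weight_tickets_share              10000
weight_user                       0.250000
weight_project                    0.250000
weight_department                 0.250000
weight_job                        0.250000

Note that the share tree tickets will only have an effect once a share tree
was actually defined, e.g. with "qconf -astree" or "qconf -mstree".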

-- Reuti


> William
> 
>> 
>> Any help to make fair share work for more than 4 users is greatly
>> appreciated.
>> 
>> I'm attaching the scheduler config settings. Please let me know if you
>> need more details.
>> 
>> Kind regards,
>> 
>> Thomas Achmann
>> 
>> qconf -ssconf
>> algorithm                         default
>> schedule_interval                 0:0:10
>> maxujobs                          0
>> queue_sort_method                 load
>> job_load_adjustments              NONE
>> load_adjustment_decay_time        0:0:0
>> load_formula                      -slots
>> schedd_job_info                   true
>> flush_submit_sec                  0
>> flush_finish_sec                  0
>> params                            none
>> reprioritize_interval             0:0:0
>> halftime                          168
>> usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
>> compensation_factor               5.000000
>> weight_user                       1.000000
>> weight_project                    0.000000
>> weight_department                 0.000000
>> weight_job                        0.000000
>> weight_tickets_functional         1000000
>> weight_tickets_share              0
>> share_override_tickets            TRUE
>> share_functional_shares           TRUE
>> max_functional_jobs_to_schedule   2000
>> report_pjob_tickets               TRUE
>> max_pending_tasks_per_job         50
>> halflife_decay_list               none
>> policy_hierarchy                  OFS
>> weight_ticket                     10.000000
>> weight_waiting_time               0.000000
>> weight_deadline                   3600000.000000
>> weight_urgency                    0.000000
>> weight_priority                   0.000000
>> max_reservation                   0
>> default_duration                  INFINITY
>> 
> 


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
