Hi,

There is a man page for this: "man sge_priority". Which policy do you intend to 
use: share-tree (honors past usage), functional (current usage), or both?
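
As a rough orientation (the values here are placeholders, not recommendations), 
the two policies are switched on and weighted against each other via these 
scheduler parameters:

   weight_tickets_share        1000000   # share-tree policy: decayed past usage
   weight_tickets_functional   10000     # functional policy: currently active jobs

Setting either weight to 0 disables that policy; with your configuration below 
only the share-tree policy hands out tickets.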

-- Reuti


> On 25.02.2019 at 15:03, Kandalaft, Iyad (AAFC/AAC) 
> <iyad.kandal...@canada.ca> wrote:
> 
> Hi all,
>  
> I recently implemented a fair share policy using share tickets.  I’ve been 
> monitoring the cluster for a couple of days using qstat -pri -ext -u "*" in 
> order to see how the functional tickets are working, and it seems to have the 
> intended effect.  There are some anomalies where some running jobs have 0 
> tickets but still get scheduled since there are free resources; I assume this 
> is normal.
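> 
> For reference, the exact command and the columns I have been watching (column 
> names as I read them in qstat(1)):
> 
> $ qstat -pri -ext -u "*"
> # tckts = total tickets, otckt/ftckt/stckt = override/functional/share-tree
> # tickets, ntckts = normalized ticket value entering the final priority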
>  
> I’ll admit that I don’t fully understand the scheduling as it’s somewhat 
> complex.  So, I’m hoping someone can review the configuration to see if they 
> can find any glaring issues such as conflicting options.
>  
> I created a share-tree and gave all users an equal value of 10:
> $ qconf -sstree
> id=0
> name=Root
> type=0
> shares=1
> childnodes=1
> id=1
> name=default
> type=0
> shares=10
> childnodes=NONE
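> 
> For completeness, this is roughly how the tree was loaded (either from a file 
> or interactively in an editor; the file name is just an example):
> 
> $ qconf -Astree /tmp/sharetree.txt   # file containing the node list above
> $ qconf -mstree                      # or edit the tree in $EDITOR
> 
> As I read share_tree(5), a single leaf named "default" acts as a template, so 
> every automatically created user ends up with the same 10 shares under Root.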
>  
> I modified the scheduler configuration by setting weight_tickets_share to 
> 1000000. I reduced weight_waiting_time, weight_priority, and weight_urgency to 
> well below weight_ticket (what are good values?).
> $ qconf -ssconf
> algorithm                         default
> schedule_interval                 0:0:15
> maxujobs                          0
> queue_sort_method                 seqno
> job_load_adjustments              np_load_avg=0.50
> load_adjustment_decay_time        0:7:30
> load_formula                      np_load_avg
> schedd_job_info                   false
> flush_submit_sec                  0
> flush_finish_sec                  0
> params                            none
> reprioritize_interval             0:0:0
> halftime                          168
> usage_weight_list                 cpu=0.700000,mem=0.200000,io=0.100000
> compensation_factor               5.000000
> weight_user                       0.250000
> weight_project                    0.250000
> weight_department                 0.250000
> weight_job                        0.250000
> weight_tickets_functional         0
> weight_tickets_share              1000000
> share_override_tickets            TRUE
> share_functional_shares           TRUE
> max_functional_jobs_to_schedule   200
> report_pjob_tickets               TRUE
> max_pending_tasks_per_job         50
> halflife_decay_list               none
> policy_hierarchy                  OFS
> weight_ticket                     0.500000
> weight_waiting_time               0.000010
> weight_deadline                   3600000.000000
> weight_urgency                    0.010000
> weight_priority                   0.010000
> max_reservation                   0
> default_duration                  INFINITY
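> 
> From sge_priority(5), my understanding is that the final priority of a pending 
> job is
> 
>   prio = weight_priority * npprior + weight_urgency * nurg + weight_ticket * ntckts
> 
> so with weight_ticket at 0.5 and the other two weights at 0.01, the normalized 
> tickets should dominate over the POSIX priority and urgency contributions.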
>  
> I modified all the users to set their fshare to 1000:
> $ qconf -muser XXX
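> 
> For example, spot-checking one of them (output trimmed to the relevant line):
> 
> $ qconf -suser XXX
> fshare 1000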
>  
> I modified the global configuration, setting auto_user_fshare to 1000 and 
> auto_user_delete_time to 7776000 (90 days).  Halftime is still at the default 
> of 168 hours, i.e. 7 days (I assume I should increase this; a possible change 
> is sketched after the output below).  I don’t know whether auto_user_delete_time 
> even matters.
> $ qconf -sconf
> #global:
> execd_spool_dir              /opt/gridengine/default/spool
> mailer                       /opt/gridengine/default/commond/mail_wrapper.py
> xterm                        /usr/bin/xterm
> load_sensor                  none
> prolog                       none
> epilog                       none
> shell_start_mode             posix_compliant
> login_shells                 sh,bash
> min_uid                      100
> min_gid                      100
> user_lists                   none
> xuser_lists                  none
> projects                     none
> xprojects                    none
> enforce_project              false
> enforce_user                 auto
> load_report_time             00:00:40
> max_unheard                  00:05:00
> reschedule_unknown           00:00:00
> loglevel                     log_info
> administrator_mail           none
> set_token_cmd                none
> pag_cmd                      none
> token_extend_time            none
> shepherd_cmd                 none
> qmaster_params               none
> execd_params                 ENABLE_BINDING=true ENABLE_ADDGRP_KILL=true \
>                              H_DESCRIPTORS=16K
> reporting_params             accounting=true reporting=true \
>                              flush_time=00:00:15 joblog=true sharelog=00:00:00
> finished_jobs                100
> gid_range                    20000-20100
> qlogin_command               /opt/gridengine/bin/rocks-qlogin.sh
> qlogin_daemon                /usr/sbin/sshd -i
> rlogin_command               builtin
> rlogin_daemon                builtin
> rsh_command                  builtin
> rsh_daemon                   builtin
> max_aj_instances             2000
> max_aj_tasks                 75000
> max_u_jobs                   0
> max_jobs                     0
> max_advance_reservations     0
> auto_user_oticket            0
> auto_user_fshare             1000
> auto_user_default_project    none
> auto_user_delete_time        7776000
> delegated_file_staging       false
> reprioritize                 0
> jsv_url                      none
> jsv_allowed_mod              ac,h,i,e,o,j,M,N,p,w
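> 
> Regarding the halftime mentioned above: if increasing it is advisable, I assume 
> the change would look something like this (e.g. 30 days expressed in hours, in 
> the scheduler configuration):
> 
> $ qconf -msconf
>   halftime                          720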
>  
> Thanks for your assistance.
>  
> Cheers
>  
> Iyad Kandalaft
> 
>  


_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
