Hi,

> On 15.04.2016 at 23:12, Happy Monk <gascan...@gmail.com> wrote:
> 
> Is it possible to make advance reservation the default for all jobs in a queue?

Well, the global "sge_request" file could do it, but then it would apply to all jobs cluster-wide, not just to one queue. If there are dedicated users of the high-priority queue, they could instead have a personal ".sge_request" in their home directories, or even in specific subdirectories.
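As an illustration (not part of the original mail), such a default could look like the following; the cell name "default" and the idea of also pinning the queue in the per-user file are assumptions:

    # cluster-wide: $SGE_ROOT/default/common/sge_request
    # each line holds default qsub options, so this enables
    # reservation for every job submitted in the cell
    -R y

    # per user instead: ~/.sge_request of the high.q users only
    -R y -q high.q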
Another way would be to implement a JSV (job submission verifier) and attach "-R y" only to certain jobs, depending on some criteria.
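A minimal sketch of such a JSV in plain sh, assuming the helper functions shipped in $SGE_ROOT/util/resources/jsv/jsv_include.sh and using "job requests h_vmem" as an example criterion:

    #!/bin/sh
    # JSV sketch: add "-R y" to any job that requests h_vmem.
    # Path and criterion are assumptions; adjust to taste.
    . "$SGE_ROOT/util/resources/jsv/jsv_include.sh"

    jsv_on_start()
    {
       return
    }

    jsv_on_verify()
    {
       # l_hard is the job's hard resource list ("-l" requests);
       # jsv_sub_get_param returns an empty string if h_vmem is unset
       vmem=`jsv_sub_get_param l_hard h_vmem`
       if [ -n "$vmem" ]; then
          jsv_set_param R y
          # jsv_correct (not jsv_accept) is needed for modifications to stick
          jsv_correct "reservation enabled for job requesting h_vmem=$vmem"
       else
          jsv_accept "no h_vmem request, job left unchanged"
       fi
       return
    }

    jsv_main

Attached cluster-wide through the "jsv_url" entry in qconf -mconf, it would run for every submission; attached with "qsub -jsv" it is opt-in per job.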
-- Reuti

> Any disadvantages to it?
> 
> Yes, mem is already a consumable.
> 
> Thanks,
> 
> On Fri, Apr 15, 2016 at 11:47 AM, Christopher Black <cbl...@nygenome.org> wrote:
> 
> If your jobs or queues have an h_rt specified, you can look into advance
> reservation and submit large-memory jobs with -R y. You will likely want to
> tweak the max_reservation and default_duration parameters via qconf -msconf.
> Using reservation puts more load on the qmaster/scheduler, but it allows the
> scheduler to prevent smaller jobs from flooding out large jobs when only
> small portions of nodes become available.
> 
> Other options are using qhold, or disabling the all.q queue instances on
> many nodes when there is a backlog of high.q jobs.
> 
> Also, if you haven't already, you may want to look into making mem a
> consumable resource (based on your qconf -se output you may already have
> done this).
> 
> Best,
> Chris
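To make the knobs above concrete, a hedged sketch of the commands involved (the job script name, the job id, the host name, and the 50G/24h figures are illustrative only, not from the thread):

    # submit the large-memory job with reservation turned on
    qsub -R y -l h_vmem=50G -l h_rt=24:00:00 big_job.sh

    # raise max_reservation (number of jobs the scheduler will reserve
    # slots for) and lower default_duration (assumed runtime of jobs
    # without h_rt) in the editor session this opens
    qconf -msconf

    # stop-gap measures: hold a pending job (12345 is a made-up id),
    # or disable all.q instances so only high.q starts work on a host
    qhold 12345
    qmod -d all.q@compute-2-1
    qmod -e all.q@compute-2-1   # re-enable afterwards

    # make memory consumable (if not already done): in "qconf -mc",
    # the h_vmem line would look roughly like
    #   h_vmem   h_vmem   MEMORY   <=   YES   YES   0   0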
> > On 4/14/16, 8:10 PM, Happy Monk <gascan...@gmail.com> wrote:
> > 
> > Hi,
> > 
> > We are using Open Grid Scheduler/Grid Engine version 2011.11p1.
> > 
> > We currently have two queues with identical settings except priority:
> > 
> > all.q  -- default queue
> > high.q -- higher priority
> > 
> > The scheduler is set to a least-nodes-used policy. All our nodes have
> > identical resources: 30 cores, 120GB RAM. The scheduler works as expected
> > per the queue priorities when jobs with low resource requests are
> > submitted. But when a high-mem job (50+GB) is submitted to high.q, it gets
> > stuck waiting in the queue forever: low-mem jobs from all.q are executed
> > whenever a resource becomes available, so the scheduler is never able to
> > fulfill the high-mem job's requirements, even though it has higher
> > priority. How can I make all jobs in all.q wait until the higher-priority
> > jobs finish?
> > 
> > Thanks,
> > 
> > Here are the details of our GE config:
> > 
> > root@master1: gridengine#qconf -ssconf
> > algorithm                         default
> > schedule_interval                 0:0:05
> > maxujobs                          0
> > queue_sort_method                 load
> > job_load_adjustments              np_load_avg=1.75
> > load_adjustment_decay_time        0:7:30
> > load_formula                      np_load_avg
> > schedd_job_info                   true
> > flush_submit_sec                  0
> > flush_finish_sec                  0
> > params                            none
> > reprioritize_interval             0:0:0
> > halftime                          168
> > usage_weight_list                 cpu=1.000000,mem=0.000000,io=0.000000
> > compensation_factor               5.000000
> > weight_user                       0.250000
> > weight_project                    0.250000
> > weight_department                 0.250000
> > weight_job                        0.250000
> > weight_tickets_functional         0
> > weight_tickets_share              0
> > share_override_tickets            TRUE
> > share_functional_shares           TRUE
> > max_functional_jobs_to_schedule   200
> > report_pjob_tickets               TRUE
> > max_pending_tasks_per_job         50
> > halflife_decay_list               none
> > policy_hierarchy                  OFS
> > weight_ticket                     0.010000
> > weight_waiting_time               0.000000
> > weight_deadline                   3600000.000000
> > weight_urgency                    0.100000
> > weight_priority                   1.000000
> > max_reservation                   64
> > default_duration                  360:00:00
> > 
> > root@master1: gridengine#qconf -sq high.q
> > qname                 high.q
> > hostlist              @allhosts
> > seq_no                0
> > load_thresholds       np_load_avg=3.0
> > suspend_thresholds    NONE
> > nsuspend              1
> > suspend_interval      00:05:00
> > priority              -10
> > min_cpu_interval      00:05:00
> > processors            UNDEFINED
> > qtype                 BATCH INTERACTIVE
> > ckpt_list             NONE
> > pe_list               make mpich mpi orte smp threaded
> > rerun                 FALSE
> > slots                 1,[]
> > tmpdir                /tmp
> > shell                 /bin/bash
> > prolog                NONE
> > epilog                NONE
> > shell_start_mode      posix_compliant
> > starter_method        NONE
> > suspend_method        NONE
> > resume_method         NONE
> > terminate_method      NONE
> > notify                00:00:60
> > owner_list            NONE
> > user_lists            NONE
> > xuser_lists           NONE
> > subordinate_list      NONE
> > complex_values        NONE
> > projects              NONE
> > xprojects             NONE
> > calendar              NONE
> > initial_state         default
> > s_rt                  INFINITY
> > h_rt                  INFINITY
> > s_cpu                 INFINITY
> > h_cpu                 INFINITY
> > s_fsize               INFINITY
> > h_fsize               INFINITY
> > s_data                INFINITY
> > h_data                INFINITY
> > s_stack               20971520
> > h_stack               104857600
> > s_core                INFINITY
> > h_core                0
> > s_rss                 INFINITY
> > h_rss                 INFINITY
> > s_vmem                INFINITY
> > h_vmem                INFINITY
> > 
> > root@master1: gridengine#qconf -sq all.q
> > qname                 all.q
> > hostlist              @allhosts
> > seq_no                0
> > load_thresholds       np_load_avg=3.0
> > suspend_thresholds    NONE
> > nsuspend              1
> > suspend_interval      00:05:00
> > priority              0
> > min_cpu_interval      00:05:00
> > processors            UNDEFINED
> > qtype                 BATCH INTERACTIVE
> > ckpt_list             NONE
> > pe_list               make mpich mpi orte smp threaded
> > rerun                 FALSE
> > slots                 1,[]
> > tmpdir                /tmp
> > shell                 /bin/bash
> > prolog                NONE
> > epilog                NONE
> > shell_start_mode      posix_compliant
> > starter_method        NONE
> > suspend_method        NONE
> > resume_method         NONE
> > terminate_method      NONE
> > notify                00:00:60
> > owner_list            NONE
> > user_lists            NONE
> > xuser_lists           NONE
> > subordinate_list      NONE
> > complex_values        NONE
> > projects              NONE
> > xprojects             NONE
> > calendar              NONE
> > initial_state         default
> > s_rt                  INFINITY
> > h_rt                  INFINITY
> > s_cpu                 INFINITY
> > h_cpu                 INFINITY
> > s_fsize               INFINITY
> > h_fsize               INFINITY
> > s_data                INFINITY
> > h_data                INFINITY
> > s_stack               20971520
> > h_stack               104857600
> > s_core                INFINITY
> > h_core                0
> > s_rss                 INFINITY
> > h_rss                 INFINITY
> > s_vmem                INFINITY
> > h_vmem                INFINITY
> > 
> > root@master1: gridengine#qconf -se compute-2-1
> > hostname              compute-2-1.local
> > load_scaling          NONE
> > complex_values        slots=30,h_vmem=120G,io_slots=30
> > load_values           arch=linux-x64,num_proc=32,mem_total=129169.750000M, \
> >                       swap_total=31983.871094M,virtual_total=161153.621094M, \
> >                       load_avg=21.680000,load_short=21.950000, \
> >                       load_medium=21.680000,load_long=21.480000, \
> >                       mem_free=102849.832031M,swap_free=31983.871094M, \
> >                       virtual_free=134833.703125M,mem_used=26319.917969M, \
> >                       swap_used=0.000000M,virtual_used=26319.917969M, \
> >                       cpu=65.300000, \
> >                       m_topology=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT, \
> >                       m_topology_inuse=SCTTCTTCTTCTTCTTCTTCTTCTTSCTTCTTCTTCTTCTTCTTCTTCTT, \
> >                       m_socket=2,m_core=16,np_load_avg=0.677500, \
> >                       np_load_short=0.685937,np_load_medium=0.677500, \
> >                       np_load_long=0.671250
> > processors            32
> > user_lists            NONE
> > xuser_lists           NONE
> > projects              NONE
> > xprojects             NONE
> > usage_scaling         NONE
> > report_variables      NONE
> > 
> > root@squid: master1#qconf -sp threaded
> > pe_name               threaded
> > slots                 9999
> > user_lists            NONE
> > xuser_lists           NONE
> > start_proc_args       /bin/true
> > stop_proc_args        /bin/true
> > allocation_rule       $pe_slots
> > control_slaves        FALSE
> > job_is_first_task     TRUE
> > urgency_slots         min
> > accounting_summary    FALSE

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users