As a side remark: since you configure three queues for each host, in order not to oversubscribe the CPUs you will need to define h_slots as a consumable complex. See this paper: http://www.sun.com/blueprints/0607/820-1695.pdf

Table 2-1 there gives the h_slots settings (the paper's example host uses complex_values h_slots=16,use_mem_size=32):

    Name     Shortcut  Type  Relop  Requestable  Consumable  Default  Urgency
    h_slots  h_slots   INT   <=     Forced       Yes         0        0

Then, for each execution host, add:

    complex_values h_slots=8,use_mem_size=<mem>
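A sketch of those two steps, assuming 8-core hosts; the host name and the memory figure are example values, not taken from your cluster:

```shell
# 1. Add h_slots to the complex list. `qconf -mc` opens the complex
#    table in an editor; append a line matching the Table 2-1 entry:
#
#      h_slots  h_slots  INT  <=  FORCED  YES  0  0
qconf -mc

# 2. Attach the consumable to each execution host. `qconf -me <host>`
#    opens that host's configuration; set, for example:
#
#      complex_values  h_slots=8,use_mem_size=24G
qconf -me compute-1-1
```

Because Requestable is FORCED, jobs that do not request h_slots will not be dispatched at all, which is what enforces the slot accounting across the overlapping queues.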
Jobs will then need qsub -l h_slots=n (n can be 1) at submission.

On 5/17/2011 3:25 PM, James Gladden wrote:
Dave,

Thank you for your reply. I have set up the scheduler as you suggested, but I still do not get the desired result.

Here's my scheduler configuration:

    [gladden@stuart ~]$ qconf -ssconf | grep load
    queue_sort_method                 load
    job_load_adjustments              NONE
    load_adjustment_decay_time        0:7:30
    load_formula                      -slots

And here is the state of the hosts prior to the test job submission:

    [gladden@stuart ~]$ qhost -q
    HOSTNAME             ARCH       NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
    -------------------------------------------------------------------------------
    global               -             -     -       -       -       -       -
    compute-1-1          lx26-amd64    8  0.00   23.5G   77.9M    7.8G   33.4M
       serial.q          BIP   0/8
       stf.q             BIP   0/8
       all.q             BIP   0/8
    compute-1-10         lx26-amd64    8  8.02   23.5G   15.5G    7.8G   32.9M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-11         lx26-amd64    8  8.52   23.5G   19.7G    7.8G    5.7G
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-12         lx26-amd64    8  4.00   23.5G    1.1G    7.8G   24.0M
       serial.q          BIP   0/8
       stf.q             BIP   4/8
       all.q             BIP   0/8
    compute-1-13         lx26-amd64    8  7.00   23.5G   22.2G    7.8G   30.8M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-14         lx26-amd64    8  7.11   23.5G   22.2G    7.8G  216.7M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-15         lx26-amd64    8  8.35   23.5G   21.1G    7.8G    5.7G
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-16         lx26-amd64    8  7.38   23.5G   23.2G    7.8G   32.4M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-17         lx26-amd64    8  7.38   23.5G   23.2G    7.8G   32.3M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-18         lx26-amd64    8  0.00   23.5G  144.0M    7.8G   18.2M
       all.q             BIP   0/8
       turecek.q         BIP   0/8
    compute-1-19         lx26-amd64    8  0.00   23.5G  359.8M    7.8G   11.5M
       all.q             BIP   0/8
       turecek.q         BIP   0/8
    compute-1-2          lx26-amd64    8  8.03   23.5G  561.4M    7.8G   29.3M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-20         lx26-amd64    8  0.00   23.5G  383.3M    7.8G   14.0M
       all.q             BIP   0/8
       turecek.q         BIP   0/8
    compute-1-21         lx26-amd64    8  8.00   23.5G   13.0G    7.8G   22.9M
       robinson.q        BIP   8/8
       all.q             BIP   0/8
    compute-1-22         lx26-amd64    8  1.00   23.5G  288.9M    7.8G   30.4M
       robinson.q        BIP   1/8
       all.q             BIP   0/8
    compute-1-23         lx26-amd64    8  8.00   23.5G   13.2G    7.8G   35.2M
       robinson.q        BIP   8/8
       all.q             BIP   0/8
    compute-1-24         lx26-amd64    8  0.00   23.5G  146.5M    7.8G   25.0M
       khalil.q          BIP   0/8
       all.q             BIP   0/8
    compute-1-3          lx26-amd64    8  8.05   23.5G  742.0M    7.8G   32.9M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-4          lx26-amd64    8  0.00   23.5G  104.2M    7.8G   31.6M
       serial.q          BIP   0/8
       stf.q             BIP   0/8
       all.q             BIP   0/8
    compute-1-5          lx26-amd64    8  8.03   23.5G    8.4G    7.8G   23.4M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-6          lx26-amd64    8  5.63   23.5G    1.3G    7.8G   25.0M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8
    compute-1-7          lx26-amd64    8  0.00   23.5G  155.7M    7.8G   33.3M
       serial.q          BIP   0/8
       stf.q             BIP   0/8
       all.q             BIP   0/8
    compute-1-8          lx26-amd64    8  0.00   23.5G  126.0M    7.8G   29.5M
       serial.q          BIP   0/8
       stf.q             BIP   0/8
       all.q             BIP   0/8
    compute-1-9          lx26-amd64    8  8.02   23.5G    1.3G    7.8G   32.7M
       serial.q          BIP   0/8
       stf.q             BIP   8/8
       all.q             BIP   0/8

Please note in particular that node compute-1-1 is idle: none of its eight slots are in use by any queue. However, node compute-1-12 has four slots in use, with the other four available in stf.q, which is the queue the test job will be submitted to.

Here is the submittal:

    [gladden@stuart ~]$ qsub -q stf.q submit_test
    Your job 523949 ("test") has been submitted

And here is the result:

    [gladden@stuart ~]$ qstat -u gladden
    job-ID  prior    name  user     state  submit/start at      queue              slots  ja-task-ID
    ------------------------------------------------------------------------------------------------
    523949  0.55500  test  gladden  r      05/17/2011 11:58:18  [email protected]   1

The scheduler picked stf.q@compute-1-1, the unloaded node, instead of "packing" the job into one of the four available slots on compute-1-12 as desired and expected. I should add that stf.q@compute-1-1 has the lowest sequence number of any instance in stf.q, so it looks like the job was assigned by sequence number rather than by our "-slots" load formula.

Any suggestions? I have poked around in the archive without finding the error of my ways.
BTW, why the (-) inversion in the load formula?

Jim

Dave Love wrote:

    That's an atypical requirement:

        $ qconf -ssconf | grep load
        job_load_adjustments              NONE
        load_adjustment_decay_time        0:7:30
        load_formula                      -slots

    See past discussion here and on the old list. Note that there is a bug which may prevent it working properly with parallel jobs.

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
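For what it's worth, the reason for the minus sign: with queue_sort_method set to load, hosts are ordered by *ascending* load-formula value, so negating the used-slot count makes busier hosts sort first and jobs pack onto them. A minimal Python illustration of that ordering (not SGE code; the slot counts are read off the qhost output above):

```python
# Used slots per host, taken from the qhost output quoted above.
hosts = {"compute-1-1": 0, "compute-1-12": 4, "compute-1-22": 1}

# queue_sort_method load dispatches to the host with the lowest
# load-formula value. With load_formula = -slots, a busier host
# has a more negative value, so it sorts first (packing).
order = sorted(hosts, key=lambda h: -hosts[h])
print(order)  # compute-1-12 (4 used) first, idle compute-1-1 last
```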
