On 18.05.2011 at 01:34, Hung-Sheng Tsao (Lao Tsao), Ph.D. wrote:

As a side remark: since you configure three queues for each host, then in order not to oversubscribe the CPUs you will need to define h_slots as a consumable complex.
See this paper: http://www.sun.com/blueprints/0607/820-1695.pdf

Table 2-1. h_slots Settings

Name     Shortcut  Type  Relop  Requestable  Consumable  Default  Urgency
h_slots  h_slots   INT   <=     Forced       Yes         0        0

Then for each execution host add a line such as

complex_values        h_slots=8,use_mem_size=<mem>

(the paper's example uses complex_values h_slots=16,use_mem_size=32).
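A sketch of those configuration steps (not from the original mail; it assumes an SGE qmaster is reachable and uses compute-1-1 as an example host):

```shell
# Register h_slots as a FORCED, consumable INT complex.
# 'qconf -mc' opens the complex table in $EDITOR; append this line:
#
#   h_slots  h_slots  INT  <=  FORCED  YES  0  0
qconf -mc

# Give each execution host a capacity, e.g. 8 slots:
# 'qconf -me compute-1-1' opens the host entry; set:
#
#   complex_values  h_slots=8,use_mem_size=<mem>
qconf -me compute-1-1
```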

Why do you want to define a new complex for this purpose? You can achieve the same with the normal "slots" complex and avoid forgetting to request this additional one during your submissions.

I also checked the PDF you mentioned - I still see no need to use it.

-- Reuti


You then need to request it when submitting the job:

qsub -l h_slots=n ...    (n could be 1)

On 5/17/2011 3:25 PM, James Gladden wrote:

Dave,

Thank you for your reply. I have set up the scheduler as you suggested, but I still do not get the desired result.

Here's my scheduler configuration:

[gladden@stuart ~]$ qconf -ssconf|grep load
queue_sort_method                 load
job_load_adjustments              NONE
load_adjustment_decay_time        0:7:30
load_formula                      -slots
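For anyone reproducing this, the two relevant values above are set by editing the scheduler configuration (a sketch; qconf opens $EDITOR):

```shell
# Edit the scheduler configuration and set:
#
#   queue_sort_method  load
#   load_formula       -slots
qconf -msconf
```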

And here is the state of the hosts prior to doing a test job submission:

[gladden@stuart ~]$ qhost -q
HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
-------------------------------------------------------------------------------
global - - - - - - -
compute-1-1 lx26-amd64 8 0.00 23.5G 77.9M 7.8G 33.4M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-10 lx26-amd64 8 8.02 23.5G 15.5G 7.8G 32.9M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-11 lx26-amd64 8 8.52 23.5G 19.7G 7.8G 5.7G
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-12 lx26-amd64 8 4.00 23.5G 1.1G 7.8G 24.0M
   serial.q             BIP   0/8
   stf.q                BIP   4/8
   all.q                BIP   0/8
compute-1-13 lx26-amd64 8 7.00 23.5G 22.2G 7.8G 30.8M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-14 lx26-amd64 8 7.11 23.5G 22.2G 7.8G 216.7M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-15 lx26-amd64 8 8.35 23.5G 21.1G 7.8G 5.7G
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-16 lx26-amd64 8 7.38 23.5G 23.2G 7.8G 32.4M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-17 lx26-amd64 8 7.38 23.5G 23.2G 7.8G 32.3M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-18 lx26-amd64 8 0.00 23.5G 144.0M 7.8G 18.2M
   all.q                BIP   0/8
   turecek.q            BIP   0/8
compute-1-19 lx26-amd64 8 0.00 23.5G 359.8M 7.8G 11.5M
   all.q                BIP   0/8
   turecek.q            BIP   0/8
compute-1-2 lx26-amd64 8 8.03 23.5G 561.4M 7.8G 29.3M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-20 lx26-amd64 8 0.00 23.5G 383.3M 7.8G 14.0M
   all.q                BIP   0/8
   turecek.q            BIP   0/8
compute-1-21 lx26-amd64 8 8.00 23.5G 13.0G 7.8G 22.9M
   robinson.q           BIP   8/8
   all.q                BIP   0/8
compute-1-22 lx26-amd64 8 1.00 23.5G 288.9M 7.8G 30.4M
   robinson.q           BIP   1/8
   all.q                BIP   0/8
compute-1-23 lx26-amd64 8 8.00 23.5G 13.2G 7.8G 35.2M
   robinson.q           BIP   8/8
   all.q                BIP   0/8
compute-1-24 lx26-amd64 8 0.00 23.5G 146.5M 7.8G 25.0M
   khalil.q             BIP   0/8
   all.q                BIP   0/8
compute-1-3 lx26-amd64 8 8.05 23.5G 742.0M 7.8G 32.9M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-4 lx26-amd64 8 0.00 23.5G 104.2M 7.8G 31.6M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-5 lx26-amd64 8 8.03 23.5G 8.4G 7.8G 23.4M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-6 lx26-amd64 8 5.63 23.5G 1.3G 7.8G 25.0M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8
compute-1-7 lx26-amd64 8 0.00 23.5G 155.7M 7.8G 33.3M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-8 lx26-amd64 8 0.00 23.5G 126.0M 7.8G 29.5M
   serial.q             BIP   0/8
   stf.q                BIP   0/8
   all.q                BIP   0/8
compute-1-9 lx26-amd64 8 8.02 23.5G 1.3G 7.8G 32.7M
   serial.q             BIP   0/8
   stf.q                BIP   8/8
   all.q                BIP   0/8

Please note in particular that node compute-1-1 is idle - none of the eight slots are in use by any of the queues. However, node compute-1-12 has four slots in use, with the other four slots available on the stf.q queue, which is the queue to which the test job will be submitted.

Here is the submission:

[gladden@stuart ~]$ qsub  -q stf.q submit_test
Your job 523949 ("test") has been submitted

And here is the result:

[gladden@stuart ~]$ qstat -u gladden
job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------------------------------------------------------
523949 0.55500 test gladden r 05/17/2011 11:58:18 stf.q@compute-1-1 1

The scheduler picked stf.q@compute-1-1 which was the unloaded node, instead of "packing" the job into one of the four available slots on compute-1-12 as was desired and expected. I should add that stf.q@compute-1-1 is the lowest sequence number instance in stf.q, so this looks like the job was assigned by sequence number rather than by our "-slots" load formula.

Any suggestions? I have poked around in the archive without finding the error of my ways. BTW, why the (-) inversion in the load formula?
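For what it's worth, the inversion makes sense if the scheduler sorts hosts by ascending load_formula value: with -slots, busier hosts get smaller (more negative) values and sort first, which yields packing. A toy illustration of that ordering (hypothetical used-slot counts, not SGE itself):

```shell
# Hosts with used-slot counts; compute load_formula = -slots and
# sort ascending, as a load-based queue sort would. The busiest
# host sorts first, i.e. it gets "packed" first.
printf '%s\n' 'compute-1-1 0' 'compute-1-12 4' 'compute-1-10 8' |
while read host used; do
    echo "$((-used)) $host"
done | sort -n
# -8 compute-1-10
# -4 compute-1-12
# 0 compute-1-1
```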

Jim


Dave Love wrote:

That's a typical requirement:

$ qconf -ssconf|grep load
job_load_adjustments              NONE
load_adjustment_decay_time        0:7:30
load_formula                      -slots

See past discussion here and on the old list. Note that there is a bug
which may prevent it working properly with parallel jobs.



_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users
