Dave,
Thank you for your reply. I have set up the scheduler as you
suggested, but I still do not get the desired result.
Here's my scheduler configuration:
[gladden@stuart ~]$ qconf -ssconf|grep load
queue_sort_method load
job_load_adjustments NONE
load_adjustment_decay_time 0:7:30
load_formula -slots
And here is the state of the hosts prior to doing a test job
submission:
[gladden@stuart ~]$ qhost -q
HOSTNAME      ARCH        NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global        -              -     -       -       -       -       -
compute-1-1   lx26-amd64     8  0.00   23.5G   77.9M    7.8G   33.4M
   serial.q   BIP   0/8
   stf.q      BIP   0/8
   all.q      BIP   0/8
compute-1-10  lx26-amd64     8  8.02   23.5G   15.5G    7.8G   32.9M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-11  lx26-amd64     8  8.52   23.5G   19.7G    7.8G    5.7G
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-12  lx26-amd64     8  4.00   23.5G    1.1G    7.8G   24.0M
   serial.q   BIP   0/8
   stf.q      BIP   4/8
   all.q      BIP   0/8
compute-1-13  lx26-amd64     8  7.00   23.5G   22.2G    7.8G   30.8M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-14  lx26-amd64     8  7.11   23.5G   22.2G    7.8G  216.7M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-15  lx26-amd64     8  8.35   23.5G   21.1G    7.8G    5.7G
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-16  lx26-amd64     8  7.38   23.5G   23.2G    7.8G   32.4M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-17  lx26-amd64     8  7.38   23.5G   23.2G    7.8G   32.3M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-18  lx26-amd64     8  0.00   23.5G  144.0M    7.8G   18.2M
   all.q      BIP   0/8
   turecek.q  BIP   0/8
compute-1-19  lx26-amd64     8  0.00   23.5G  359.8M    7.8G   11.5M
   all.q      BIP   0/8
   turecek.q  BIP   0/8
compute-1-2   lx26-amd64     8  8.03   23.5G  561.4M    7.8G   29.3M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-20  lx26-amd64     8  0.00   23.5G  383.3M    7.8G   14.0M
   all.q      BIP   0/8
   turecek.q  BIP   0/8
compute-1-21  lx26-amd64     8  8.00   23.5G   13.0G    7.8G   22.9M
   robinson.q BIP   8/8
   all.q      BIP   0/8
compute-1-22  lx26-amd64     8  1.00   23.5G  288.9M    7.8G   30.4M
   robinson.q BIP   1/8
   all.q      BIP   0/8
compute-1-23  lx26-amd64     8  8.00   23.5G   13.2G    7.8G   35.2M
   robinson.q BIP   8/8
   all.q      BIP   0/8
compute-1-24  lx26-amd64     8  0.00   23.5G  146.5M    7.8G   25.0M
   khalil.q   BIP   0/8
   all.q      BIP   0/8
compute-1-3   lx26-amd64     8  8.05   23.5G  742.0M    7.8G   32.9M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-4   lx26-amd64     8  0.00   23.5G  104.2M    7.8G   31.6M
   serial.q   BIP   0/8
   stf.q      BIP   0/8
   all.q      BIP   0/8
compute-1-5   lx26-amd64     8  8.03   23.5G    8.4G    7.8G   23.4M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-6   lx26-amd64     8  5.63   23.5G    1.3G    7.8G   25.0M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
compute-1-7   lx26-amd64     8  0.00   23.5G  155.7M    7.8G   33.3M
   serial.q   BIP   0/8
   stf.q      BIP   0/8
   all.q      BIP   0/8
compute-1-8   lx26-amd64     8  0.00   23.5G  126.0M    7.8G   29.5M
   serial.q   BIP   0/8
   stf.q      BIP   0/8
   all.q      BIP   0/8
compute-1-9   lx26-amd64     8  8.02   23.5G    1.3G    7.8G   32.7M
   serial.q   BIP   0/8
   stf.q      BIP   8/8
   all.q      BIP   0/8
Please note in particular that node compute-1-1 is idle - none of
the eight slots are in use by any of the queues. However, node
compute-1-12 has four slots in use, with the other four slots
available on the stf.q queue, which is the queue to which the test
job will be submitted.
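For anyone who wants to repeat this check without reading the whole listing, here is a rough sketch that pulls out the instances of one queue that still have free slots. The function name free_slots is made up for illustration, and the parsing assumes the layout shown above: a host line with at least six fields, followed by queue lines of the form "name type used/total". The sample data is copied from three of the hosts in the listing.

```shell
# Hypothetical helper: read qhost -q style text on stdin and print
# "<host> <free slots>" for every instance of queue $1 that is not full.
free_slots() {
    awk -v q="$1" '
        NF >= 6 { host = $1 }    # host line: remember the current host name
        $1 == q && $3 ~ /^[0-9]+\/[0-9]+$/ {
            split($3, u, "/")    # u[1] = used slots, u[2] = total slots
            if (u[2] - u[1] > 0) print host, u[2] - u[1]
        }'
}

# Sample input: three hosts taken from the listing above.
sample='compute-1-1 lx26-amd64 8 0.00 23.5G 77.9M 7.8G 33.4M
serial.q BIP 0/8
stf.q BIP 0/8
all.q BIP 0/8
compute-1-10 lx26-amd64 8 8.02 23.5G 15.5G 7.8G 32.9M
serial.q BIP 0/8
stf.q BIP 8/8
all.q BIP 0/8
compute-1-12 lx26-amd64 8 4.00 23.5G 1.1G 7.8G 24.0M
serial.q BIP 0/8
stf.q BIP 4/8
all.q BIP 0/8'

result=$(printf '%s\n' "$sample" | free_slots stf.q)
printf '%s\n' "$result"
```

On a live system the same function could presumably be fed directly, as in "qhost -q | free_slots stf.q", provided the output keeps the layout assumed here.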
Here is the submittal:
[gladden@stuart ~]$ qsub -q stf.q submit_test
Your job 523949 ("test") has been submitted
And here is the result:
[gladden@stuart ~]$ qstat -u gladden
job-ID  prior   name  user     state  submit/start at      queue              slots  ja-task-ID
-----------------------------------------------------------------------------------------------------------------
523949  0.55500 test  gladden  r      05/17/2011 11:58:18  stf.q@compute-1-1  1
The scheduler picked stf.q@compute-1-1, the unloaded node, instead
of "packing" the job into one of the four available slots on
compute-1-12 as was desired and expected. I should add that
stf.q@compute-1-1 is the instance with the lowest sequence number in
stf.q, so it looks as though the job was assigned by sequence number
rather than by our "-slots" load formula.
Any suggestions? I have poked around in the archive without
finding the error of my ways. BTW, why the (-) inversion in the
load formula?
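To make the sign question concrete, here is a small sketch of how the ordering would come out under two assumptions that I have not confirmed: that the scheduler prefers the instance with the smallest load_formula value, and that "slots" evaluates to the number of free slots on the host. The function name rank_hosts is made up, and the free-slot counts for stf.q are taken from the qhost -q listing above.

```shell
# Hypothetical helper: read "<host> <free slots>" lines on stdin, multiply
# the slot count by the sign in $1 (1 or -1), and print the host names
# ordered by that value, smallest first (ties broken by host name).
rank_hosts() {
    awk -v s="$1" '{ print s * $2, $1 }' | sort -k1,1n -k2,2 | awk '{ print $2 }'
}

# Free stf.q slots for three hosts from the listing above.
hosts='compute-1-1 8
compute-1-12 4
compute-1-4 8'

# Assumed formula "slots": the value is the free-slot count itself.
first_plus=$(printf '%s\n' "$hosts" | rank_hosts 1 | head -n 1)
# Assumed formula "-slots": the value is the negated free-slot count.
first_minus=$(printf '%s\n' "$hosts" | rank_hosts -1 | head -n 1)

echo "slots  -> $first_plus"
echo "-slots -> $first_minus"
```

Under these assumptions "slots" would rank compute-1-12 first and "-slots" would rank compute-1-1 first. Whether the scheduler really evaluates "slots" as the free-slot count is exactly the part I am unsure of.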
Jim
Dave Love wrote:
That's a typical requirement:
$ qconf -ssconf|grep load
job_load_adjustments NONE
load_adjustment_decay_time 0:7:30
load_formula -slots
See past discussion here and on the old list. Note that there is
a bug which may prevent it working properly with parallel jobs.