I am trying to set up a PE and am struggling to understand how Grid Engine
determines how many slots are available for the PE. I have set up 3 test
machines in a queue and set the default slots to 10. Each system is
actually a virtual machine that has one CPU and ~2G of memory. The PE
definition is:
pe_name dp
slots 999
user_lists NONE
xuser_lists NONE
start_proc_args /bin/true
stop_proc_args /bin/true
allocation_rule $fill_up
control_slaves FALSE
job_is_first_task TRUE
urgency_slots min
accounting_summary FALSE
Since I have 10 slots per host, I assumed I would have 30 slots. And when
testing I get:
$ qrsh -w v -q all.q -now no -pe dp 30
verification: found possible assignment with 30 slots
$ qrsh -w p -q all.q -now no -pe dp 30
verification: found possible assignment with 30 slots
But when I actually try to run the job, I get the following from qstat:
cannot run in PE "dp" because it only offers 12 slots
I get that other resources can impact the availability of slots, but I'm
having a hard time figuring out why I'm only getting 12 slots and which
resources are impacting this...
When I request -pe dp 12, it works fine and distributes the job's tasks
across all three systems:
9717 0.65000 QRLOGIN user r 04/10/2014 14:40:14 all.q@gridtst1 SLAVE
all.q@gridtst1 SLAVE
all.q@gridtst1 SLAVE
all.q@gridtst1 SLAVE
9717 0.65000 QRLOGIN user r 04/10/2014 14:40:14 all.q@gridtst2 SLAVE
all.q@gridtst2 SLAVE
all.q@gridtst2 SLAVE
all.q@gridtst2 SLAVE
9717 0.65000 QRLOGIN user r 04/10/2014 14:40:14 all.q@gridtst3 MASTER
all.q@gridtst3 SLAVE
all.q@gridtst3 SLAVE
all.q@gridtst3 SLAVE
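To make the arithmetic concrete, here is a minimal Python sketch of how I
am reasoning about this. It is only a toy model, not how the Grid Engine
scheduler is implemented: pe_capacity and fill_up are hypothetical helper
names, and the per-host limit of 4 is an assumption inferred from the
qstat output above, not a setting I have found anywhere.

```python
# Toy model of slot accounting; NOT the actual Grid Engine scheduler logic.

def pe_capacity(pe_slots, per_host_slots):
    """Upper bound on slots for a PE job: the PE limit or the host total, whichever is smaller."""
    return min(pe_slots, sum(per_host_slots))


def fill_up(per_host_slots, requested):
    """Allocate in the spirit of $fill_up: use every free slot on a host before moving on.

    Returns the per-host allocation, or None if the request cannot be satisfied.
    """
    allocation = []
    remaining = requested
    for free in per_host_slots:
        take = min(free, remaining)
        allocation.append(take)
        remaining -= take
    return allocation if remaining == 0 else None


configured = [10, 10, 10]  # queue slots configured on the three test hosts
observed_cap = [4, 4, 4]   # ASSUMPTION: per-host limit guessed from the qstat output

print(pe_capacity(999, configured))    # 30
print(fill_up(configured, 12))         # [10, 2, 0]
print(pe_capacity(999, observed_cap))  # 12
print(fill_up(observed_cap, 12))       # [4, 4, 4]
print(fill_up(observed_cap, 30))       # None
```

If this model is right, $fill_up with 10 usable slots per host would have
packed a 12-slot job as 10 + 2 + 0, so the 4 + 4 + 4 spread suggests that
something is capping each host at 4 slots.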
I'm assuming I am missing something simple :( What should I be looking
at to help me better understand what's going on? I do notice that hl:cpu
jumps significantly between idle, dp 12 and dp 24, but I didn't find
anything in the docs describing what cpu represents...
Any help or pointers would be greatly appreciated...
I'm running a very old version of Grid Engine (SGE 6.2u5), but I assume
that shouldn't matter.
--
-MichaelC