Thanks for the response!
$ qstat -f
queuename                      qtype resv/used/tot. load_avg arch          states
---------------------------------------------------------------------------------
fast.q@gridtst1                BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
fast.q@gridtst2                BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
fast.q@gridtst3                BIP   0/0/8          0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@gridtst1                 BIP   0/0/10         0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@gridtst2                 BIP   0/0/10         0.00     lx24-amd64
---------------------------------------------------------------------------------
all.q@gridtst3                 BIP   0/0/10         0.00     lx24-amd64
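
As a quick cross-check of the totals, the tot. part of each resv/used/tot. field can be summed per cluster queue. This is only a sketch: qstat_sample and slots_per_queue are hypothetical helper names, and the sample rows are the ones shown above with whitespace collapsed.

```shell
# Sample rows copied from the qstat -f output above (hypothetical helper).
qstat_sample() {
cat <<'EOF'
queuename qtype resv/used/tot. load_avg arch states
---------------------------------------------------------------------------------
fast.q@gridtst1 BIP 0/0/8 0.00 lx24-amd64
fast.q@gridtst2 BIP 0/0/8 0.00 lx24-amd64
fast.q@gridtst3 BIP 0/0/8 0.00 lx24-amd64
all.q@gridtst1 BIP 0/0/10 0.00 lx24-amd64
all.q@gridtst2 BIP 0/0/10 0.00 lx24-amd64
all.q@gridtst3 BIP 0/0/10 0.00 lx24-amd64
EOF
}

# Sum the total slots of every queue instance, grouped by cluster queue.
slots_per_queue() {
    awk '$1 ~ /@/ {
        split($1, q, "@")     # q[1] is the cluster queue name
        split($3, s, "/")     # s[3] is the "tot." part of resv/used/tot.
        total[q[1]] += s[3]
    }
    END { for (name in total) print name, total[name] }' | sort
}

qstat_sample | slots_per_queue
# prints:
# all.q 30
# fast.q 24
```

On a live cluster the function could presumably be fed directly with `qstat -f | slots_per_queue`. The 30 slots in all.q agree with what -w v reports, so the limit of 12 seems to come from somewhere other than the queue slot counts.
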
I have some custom complexes, but I set their defaults to 0 for
testing. This command seemed like the best way to see what is being
requested by default:
$ qconf -sc | awk '$7~/1/ {print}'
slots               s          INT         <=    YES         YES        1        1000
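
Note that the pattern $7~/1/ matches any default that contains the digit 1, so it would also match 10 or 0.1 and would miss a default of 2. A stricter variant is sketched below, assuming the qconf -sc columns are name, shortcut, type, relop, requestable, consumable, default, urgency. The helper names are hypothetical, and every sample row except slots is made up for illustration.

```shell
# Hypothetical sample of qconf -sc output; only the slots row is real.
complex_sample() {
cat <<'EOF'
#name      shortcut  type      relop requestable consumable default urgency
arch       a         RESTRING  ==    YES         NO         NONE    0
h_vmem     h_vmem    MEMORY    <=    YES         YES        0       0
mem_tok    mt        INT       <=    YES         YES        2       0
slots      s         INT       <=    YES         YES        1       1000
EOF
}

# Print name and default of every consumable whose default is not zero.
# awk reads the leading number of the field, so 2G counts as 2 and NONE
# as 0. A TIME default such as 0:30:0 would read as 0 and be missed.
nonzero_defaults() {
    awk '$1 !~ /^#/ && $6 != "NO" && $7 + 0 != 0 { print $1, $7 }'
}

complex_sample | nonzero_defaults
# prints:
# mem_tok 2
# slots 1
```

Against the real configuration this would be `qconf -sc | nonzero_defaults`.
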
On Thu, Apr 10, 2014 at 4:08 PM, Reuti <[email protected]> wrote:
> On 10.04.2014 at 23:51, Michael Coffman wrote:
>
> > I am trying to set up a PE and am struggling to understand how grid
> > determines how many slots are available for the PE. I have set up 3
> > test machines in a queue. I set the default slots to 10. Each system
> > is actually a virtual machine that has one CPU and ~2G of memory. The
> > PE definition is:
> >
> > pe_name dp
> > slots 999
> > user_lists NONE
> > xuser_lists NONE
> > start_proc_args /bin/true
> > stop_proc_args /bin/true
> > allocation_rule $fill_up
> > control_slaves FALSE
> > job_is_first_task TRUE
> > urgency_slots min
> > accounting_summary FALSE
> >
> > Since I have 10 slots per host, I assumed I would have 30 slots. And
> > when testing I get:
> >
> > $ qrsh -w v -q all.q -now no -pe dp 30
> > verification: found possible assignment with 30 slots
> >
> > $ qrsh -w p -q all.q -now no -pe dp 30
> > verification: found possible assignment with 30 slots
> >
> > But when I actually try to run the job, I get the following from qstat:
> >
> > cannot run in PE "dp" because it only offers 12 slots
> >
> > I get that other resources can impact the availability of slots, but
> > I'm having a hard time figuring out why I'm only getting 12 slots and
> > what resources are impacting this...
> >
> > When I request -pe dp 12, it works fine and distributes the tasks
> > across all three systems...
> >
> > 9717 0.65000 QRLOGIN user r 04/10/2014 14:40:14 all.q@gridtst1 SLAVE
> >                                                 all.q@gridtst1 SLAVE
> >                                                 all.q@gridtst1 SLAVE
> >                                                 all.q@gridtst1 SLAVE
> > 9717 0.65000 QRLOGIN user r 04/10/2014 14:40:14 all.q@gridtst2 SLAVE
> >                                                 all.q@gridtst2 SLAVE
> >                                                 all.q@gridtst2 SLAVE
> >                                                 all.q@gridtst2 SLAVE
> > 9717 0.65000 QRLOGIN user r 04/10/2014 14:40:14 all.q@gridtst3 MASTER
> >                                                 all.q@gridtst3 SLAVE
> >                                                 all.q@gridtst3 SLAVE
> >                                                 all.q@gridtst3 SLAVE
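
One observation on the listing above: the 12 slots are spread 4/4/4 across the hosts. Assuming $fill_up works as sge_pe(5) describes (take every available slot on one host before moving on to the next), 10 free slots per host would give 10/2/0 instead, so the listing looks more like each host offering only 4 slots to this PE. A minimal sketch of that reasoning follows. fill_up is a hypothetical helper written for this mail, not something SGE provides, and it ignores how the scheduler orders hosts.

```shell
# fill_up REQUESTED AVAIL_HOST1 AVAIL_HOST2 ...
# Prints how many slots are taken from each host, in the order given,
# taking as many as possible from one host before moving to the next.
fill_up() {
    need=$1
    shift
    out=""
    for avail in "$@"; do
        if [ "$need" -gt "$avail" ]; then
            take=$avail
        else
            take=$need
        fi
        need=$((need - take))
        out="$out${out:+ }$take"
    done
    echo "$out"
}

fill_up 12 10 10 10   # prints: 10 2 0
fill_up 12 4 4 4      # prints: 4 4 4
```

If that reading is right, whatever limits each host to 4 slots would also explain the total of 12.
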
>
> What's the output of: qstat -f
>
> Did you set up any consumable like memory on the nodes with a default
> consumption?
>
> - Reuti
>
>
> > I'm assuming I am missing something simple :( What should I be
> > looking at to help me better understand what's going on? I do notice
> > that hl:cpu jumps significantly between idle, dp 12 and dp 24, but I
> > did not find anything in the docs describing what cpu represents...
> >
> > Any help or pointers would be greatly appreciated...
> >
> > I'm running a very old version of grid (SGE 6.2u5), but assume that
> > shouldn't matter.
> > --
> > -MichaelC
> > _______________________________________________
> > users mailing list
> > [email protected]
> > https://gridengine.org/mailman/listinfo/users
>
>
--
-MichaelC