On 10.04.2014 at 23:51, Michael Coffman wrote:

> I am trying to set up a PE and am struggling to understand how grid determines
> how many slots are available for the PE. I have set up 3 test machines in a
> queue and set the default slots to 10. Each system is actually a virtual
> machine that has one CPU and ~2G of memory. The PE definition is:
> 
> pe_name            dp
> slots              999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    /bin/true
> stop_proc_args     /bin/true
> allocation_rule    $fill_up
> control_slaves     FALSE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary FALSE
> 
> Since I have 10 slots per host, I assumed I would have 30 slots.   And when 
> testing I get:
> 
> $qrsh -w v -q all.q  -now no -pe dp 30
> verification: found possible assignment with 30 slots
> 
> $qrsh -w p -q all.q  -now no -pe dp 30
> verification: found possible assignment with 30 slots
> 
> But when I actually try to run the job, I get the following from qstat:
>  
> cannot run in PE "dp" because it only offers 12 slots
> 
> I get that other resources can impact the availability of slots, but I'm
> having a hard time figuring out why I'm only getting 12 slots and which
> resources are limiting this...
> 
> When I request -pe dp 12, it works fine and distributes the tasks across all
> three systems:
> 
> 717 0.65000 QRLOGIN    user      r    04/10/2014 14:40:14 all.q@gridtst1 SLAVE
>                                                            all.q@gridtst1 SLAVE
>                                                            all.q@gridtst1 SLAVE
>                                                            all.q@gridtst1 SLAVE
> 9717 0.65000 QRLOGIN    user      r    04/10/2014 14:40:14 all.q@gridtst2 SLAVE
>                                                            all.q@gridtst2 SLAVE
>                                                            all.q@gridtst2 SLAVE
>                                                            all.q@gridtst2 SLAVE
> 9717 0.65000 QRLOGIN    user      r    04/10/2014 14:40:14 all.q@gridtst3 MASTER
>                                                            all.q@gridtst3 SLAVE
>                                                            all.q@gridtst3 SLAVE
>                                                            all.q@gridtst3 SLAVE

What's the output of: qstat -f

Did you set up any consumable, such as memory, on the nodes with a default
consumption?
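
To illustrate why a consumable could matter here, the following is a minimal
sketch of the arithmetic only, not of the actual scheduler. It assumes a
hypothetical per-slot memory consumable with a default consumption of 512M and
a capacity of 2G per host; neither value is confirmed anywhere in this thread,
and the function name slots_offered is made up for this example. Under those
assumptions each host could grant only 4 slots, which would match the 12 slots
reported for three hosts:

```python
def slots_offered(hosts, queue_slots, per_slot_request):
    """Estimate how many slots a PE could offer under one per-slot consumable.

    hosts            -- mapping of host name to consumable capacity (bytes)
    queue_slots      -- slots configured per queue instance
    per_slot_request -- default consumption charged per slot (bytes)

    Returns (total_slots, slots_per_host). This is a simplified model and
    ignores everything else the real scheduler takes into account.
    """
    per_host = {
        # A host cannot grant more slots than the queue allows, nor more
        # than its consumable capacity can pay for.
        name: min(queue_slots, capacity // per_slot_request)
        for name, capacity in hosts.items()
    }
    return sum(per_host.values()), per_host


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical values: 2G capacity per host, 512M default consumption per slot.
hosts = {"gridtst1": 2 * GIB, "gridtst2": 2 * GIB, "gridtst3": 2 * GIB}
total, per_host = slots_offered(hosts, queue_slots=10, per_slot_request=512 * MIB)
print(total, per_host)
```

If no such consumable is defined, this model simply returns the 10 slots per
host from the queue definition, and the limit has to come from somewhere else.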

- Reuti


> I'm assuming I am missing something simple :( What should I be looking at
> to help me better understand what's going on? I do notice that hl:cpu
> jumps significantly between idle, dp 12 and dp 24, but I didn't find anything
> in the docs describing what cpu represents...
> 
> Any help or pointers would be greatly appreciated...
> 
> I'm running a very old version of grid, but assume that shouldn't matter (SGE 
> 6.2u5)
> -- 
> -MichaelC
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users

