With the allocation rule $fill_up, the scheduler tries to take as many
slots as possible from each host it selects. The order in which hosts
are selected usually does *not* depend on the number of free slots
(although this can be configured).
It looks like either some smaller jobs were already running on your
host compute-1-2, or other resource requirements could only be
fulfilled for 54 slots there.
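
To see at a glance how a job was spread, the slot column of the
pe_hostfile can be totalled. The following is only a minimal sketch:
summarize_pe_hostfile is a hypothetical helper name, and it assumes the
column layout shown in your output (host, slots, queue instance,
processor range) and your node size of 64.

```shell
#!/bin/sh
# Summarize a Grid Engine pe_hostfile: one line per host plus a total.
# Assumed column layout: host, slots, queue instance, processor range.
# summarize_pe_hostfile is a hypothetical helper name; the default of 64
# is the node size mentioned in this thread.
summarize_pe_hostfile() {
    hostfile=$1
    full=${2:-64}
    awk -v full="$full" '
        NF >= 2 {
            total += $2
            # A host counts as "whole" only if all of its slots were granted.
            status = ($2 == full) ? "whole" : "partial"
            printf "%s %d %s\n", $1, $2, status
        }
        END { printf "total %d\n", total }
    ' "$hostfile"
}
```

Called inside the job script as

    summarize_pe_hostfile "$PE_HOSTFILE"

it would report your second job as 54 + 64 + 10 = 128 slots: the request
was met, but only one host was taken whole. Running qhost -j or qstat -f
while the job is pending should show what else occupies compute-1-2.
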
If you always want 64 slots per host, you can configure a fixed
allocation rule: set allocation_rule to 64 instead of $fill_up in your
PE configuration. It is quite common to rename such a PE with a prefix
or suffix saying that it is a 64-slot PE. You could then create
different PEs for different slots-per-host sizes and select among them
with a wildcard PE request.
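
As a rough sketch of the fixed-rule setup, a small shell function can
print one PE definition per host size. pe_definition and the mpi_<N>
naming are hypothetical, and every field value except pe_name and
allocation_rule is a placeholder assumption; compare them with the
output of "qconf -sp mpi" on your cluster before loading anything.

```shell
#!/bin/sh
# Print a PE definition with a fixed per-host slot count.
# pe_definition is a hypothetical helper. All values other than pe_name
# and allocation_rule are placeholder assumptions and should be copied
# from your existing PE (see `qconf -sp mpi`).
pe_definition() {
    per_host=$1
    cat <<EOF
pe_name            mpi_${per_host}
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /bin/true
stop_proc_args     /bin/true
allocation_rule    ${per_host}
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
accounting_summary FALSE
EOF
}
```

One way to use it, assuming your queue is called all.q and /tmp is
acceptable for the temporary file:

    pe_definition 64 > /tmp/mpi_64.pe
    qconf -Ap /tmp/mpi_64.pe
    qconf -aattr queue pe_list mpi_64 all.q

The job would then request "#$ -pe mpi_64 128", or "#$ -pe mpi_* 128"
once several such PEs exist. With a fixed rule of 64, the requested slot
total has to be a multiple of 64.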

Daniel


On 9 June 2012 at 03:15, Joseph Farran <[email protected]> wrote:

> Greetings.
> 
> I am trying to set up my MPI Parallel Environment so that whole nodes are used 
> before going to the next node when looking for cores.
> 
> Our nodes have 64 cores.   What I would like is that if I ask for 128 cores 
> (slots), one compute node is selected with 64 cores, and then the next one 
> with 64 cores.
> 
> At the suggestion of Prakashan I set up all of my 64-core nodes with:
> 
>    qconf -rattr exechost complex_values "slots=64" node
> 
> to hopefully tell OGE to use 64 cores per node and no more.
> 
> Using a simple Parallel Environment called "mpi" with "$fill_up" allocation 
> rule, I am getting weird results.
> 
> When I ask for 128 cores with:
> 
>    #$ -pe mpi 128
> 
> Sometimes I get two nodes at 64 cores each, which is correct:
> 
>   PE_HOSTFILE file: (/var/spool/oge/compute-2-5/active_jobs/83.1/pe_hostfile)
>   compute-2-5.local 64 [email protected] UNDEFINED
>   compute-2-7.local 64 [email protected] UNDEFINED
> 
> 
> Other times, I get a sporadic mixture that I can't make sense of, like this one 
> with 3 nodes at 54, 64 and 10 cores:
> 
>   PE_HOSTFILE file: (/var/spool/oge/compute-1-2/active_jobs/84.1/pe_hostfile)
>   compute-1-2.local 54 [email protected] UNDEFINED
>   compute-2-4.local 64 [email protected] UNDEFINED
>   compute-2-6.local 10 [email protected] UNDEFINED
> 
> What setting and/or load sensor is causing this?    What I am looking for is 
> to fill up one 64-core node before it goes on to the next one, so that if I 
> ask for 128 cores, I will always get 2 whole nodes.
> 
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users
