[OMPI devel] Loadbalancing

2008-04-23 Thread Ralph H Castain
I added a new "loadbalance" feature to OMPI today in r18252.

Brief summary: adding --loadbalance to the mpirun cmd line will cause the
round-robin mapper to balance your specified #procs across the available
nodes.

More detail:
Several users had noted that mapping byslot always caused us to
preferentially load the first nodes in an allocation, potentially leaving
other nodes unused. If they mapped bynode, of course, this wouldn't happen -
but then they were forced to a specific rank-to-node relationship.

What they wanted was to have the ranks numbered byslot, but to have the ppn
balanced across the entire allocation.

This is now supported via the --loadbalance cmd line option. Here is an
example of its effect (again, remember that loadbalance only impacts mapping
byslot):

        no-lb      lb       bynode
node0:  0,1,2,3    0,1,2    0,3,6
node1:  4,5,6      3,4      1,4
node2:             5,6      2,5


As you can see, the effect of --loadbalance is to balance the ppn across all
the available nodes while retaining byslot rank associations. In this case,
instead of leaving one node unused, we take advantage of all available
resources.
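
For readers who want to reproduce the three placements in the table, here is a
minimal Python sketch. It is not the round-robin mapper code from OMPI; it only
models the behavior described above. The function names are made up for
illustration, and the inputs (7 procs, 3 nodes, 4 slots per node) are
assumptions inferred from the table rather than stated in this message.

```python
# Illustrative model only: function names and inputs are hypothetical,
# chosen to reproduce the example table (7 procs, 3 nodes, 4 slots each).

def map_byslot(num_procs, slots_per_node):
    """Fill every slot on a node before moving to the next node."""
    mapping = [[] for _ in slots_per_node]
    rank = 0
    for node, slots in enumerate(slots_per_node):
        for _ in range(slots):
            if rank == num_procs:
                return mapping
            mapping[node].append(rank)
            rank += 1
    return mapping  # oversubscription is not modeled in this sketch


def map_loadbalance(num_procs, num_nodes):
    """Keep byslot rank order, but spread the procs evenly across nodes.

    Assumption: any remainder goes to the first nodes, one extra proc each.
    """
    base, extra = divmod(num_procs, num_nodes)
    mapping = []
    rank = 0
    for node in range(num_nodes):
        count = base + (1 if node < extra else 0)
        mapping.append(list(range(rank, rank + count)))
        rank += count
    return mapping


def map_bynode(num_procs, num_nodes):
    """Deal ranks out one at a time, cycling through the nodes."""
    mapping = [[] for _ in range(num_nodes)]
    for rank in range(num_procs):
        mapping[rank % num_nodes].append(rank)
    return mapping


if __name__ == "__main__":
    print("no-lb :", map_byslot(7, [4, 4, 4]))
    print("lb    :", map_loadbalance(7, 3))
    print("bynode:", map_bynode(7, 3))
```

With these inputs the three functions give the same per-node rank lists as the
no-lb, lb, and bynode columns of the table.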

Hope this proves helpful
Ralph




Re: [OMPI devel] Communication problem

2008-04-23 Thread Jeff Squyres
I'm surprised that it takes "a minute" to fail to find IB -- usually  
the search and failure is more-or-less instantaneous.


Can you send all the information listed here?

http://www.open-mpi.org/community/help/



On Apr 23, 2008, at 5:55 AM, Ziv Mhabary wrote:


Hi,
When I try to run my code, it first looks for InfiniBand
communication. I don't have InfiniBand in my cluster, so after a minute
it starts looking for Ethernet, and then it works.
How can I change my default communication?
I want it to search for Ethernet first.
Thanks!
Ziv.



--
Jeff Squyres
Cisco Systems



[OMPI devel] Merging in the CPC work

2008-04-23 Thread Jeff Squyres
As we discussed yesterday, I have started the merge from the
/tmp-public/openib-cpc2 branch.  "oob" is currently the default.


Unfortunately, it caused quite a few conflicts when I merged with the
trunk, so I created a new temp branch and put all the work there:
/tmp-public/openib-cpc3.


Could all the IB and iWARP vendors and any other interested parties  
please try this branch before we bring it back to the trunk?  Please  
test all functionality that you care about -- XRC, etc.  I'd like to  
bring it back to the trunk COB Thursday.  Please let me know if this  
is too soon.


You can force the selection of a different CPC with the  
btl_openib_cpc_include MCA param:


mpirun --mca btl_openib_cpc_include oob ...
mpirun --mca btl_openib_cpc_include xoob ...
mpirun --mca btl_openib_cpc_include rdma_cm ...
mpirun --mca btl_openib_cpc_include ibcm ...

You might want to concentrate on testing oob and xoob to ensure that
we didn't cause any regressions.  The ibcm and rdma_cm CPCs probably
still have some rough edges, and the IBCM package in OFED itself may
not be 100% -- that's one of the things we're evaluating.  It's known
to not install properly on RHEL4U4, for example: you have to
manually mknod and chmod a device in /dev/infiniband for every HCA in
the host.
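
If you want to run the same test under every CPC, a small helper can generate
the command lines. This is only a sketch: the helper names, the "-np 4"
process count, and the ./my_test executable are placeholders, and it builds
the argument lists without launching anything.

```python
# Hypothetical helper: builds one mpirun command per CPC named above.
# "./my_test" and "-np 4" are placeholders for your own test and size.

CPCS = ["oob", "xoob", "rdma_cm", "ibcm"]


def cpc_command(cpc, app_args):
    """Return the argv list that forces a single CPC for one run."""
    return ["mpirun", "--mca", "btl_openib_cpc_include", cpc, *app_args]


def all_cpc_commands(app_args):
    """Return one command per CPC, in the order listed in CPCS."""
    return [cpc_command(cpc, app_args) for cpc in CPCS]


if __name__ == "__main__":
    for cmd in all_cpc_commands(["-np", "4", "./my_test"]):
        print(" ".join(cmd))
```

Each list can be passed to subprocess.run() once you are ready to actually
launch the jobs.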


Thanks.

--
Jeff Squyres
Cisco Systems



[OMPI devel] Communication problem

2008-04-23 Thread Ziv Mhabary
Hi,
When I try to run my code, it first looks for InfiniBand
communication. I don't have InfiniBand in my cluster, so after a minute
it starts looking for Ethernet, and then it works.
How can I change my default communication?
I want it to search for Ethernet first.
Thanks!
Ziv.