Also, you can use -mca btl ^sm (which excludes the shared-memory BTL). At 
least for me, that actually gives better performance than increasing the 
number of FIFOs.
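
For reference, the two workarounds might look like this on the command line. This is only a sketch. It reuses the hostfile, process count, and bcast_example binary from the original post. It also assumes that "increasing fifos" refers to the btl_sm_num_fifos MCA parameter. The value 4 is purely illustrative, not a tested recommendation.

```shell
# Option 1: exclude the shared-memory (sm) BTL, so on-node traffic
# falls back to the remaining BTLs (e.g. tcp and self).
mpiexec -machinefile hostfile -np 5 -mca btl ^sm bcast_example

# Option 2 (assumed meaning of "increasing fifos"): keep the sm BTL but
# raise the number of shared-memory FIFOs. The value 4 is illustrative.
mpiexec -machinefile hostfile -np 5 -mca btl_sm_num_fifos 4 bcast_example
```

Some shells treat ^ specially (csh/tcsh, for example), in which case quoting the argument as '^sm' should avoid surprises.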

Matt

On Jan 3, 2010, at 10:04 PM, Louis Rossi wrote:

> I am having a problem with BCast hanging on a dual quad core Opteron (2382, 
> 2.6GHz, Quad Core, 4 x 512KB L2, 6MB L3 Cache) system running FC11 with 
> openmpi-1.4.  The LD_LIBRARY_PATH and PATH variables are correctly set.  I 
> have used the FC11 rpm distribution of openmpi and built openmpi-1.4 locally 
> with the same results.  The problem was first observed in a larger reliable 
> CFD code, but I can create the problem with a simple demo code (attached).  
> The code attempts to execute 2000 pairs of broadcasts.
> 
> The hostfile contains a single line
> <machinename> slots=8
> 
> If I run it with 4 cores or fewer, the code will run fine.
> 
> If I run it with 5 cores or more, it will hang some of the time after 
> successfully executing several hundred broadcasts.  The number varies from 
> run to run.  The code usually finishes with 5 cores.  The probability of 
> hanging seems to increase with the number of processes.  The syntax I use is 
> simple.
> 
> mpiexec -machinefile hostfile -np 5 bcast_example
> 
> There was some discussion of a similar problem on the user list, but I could 
> not find a resolution.  I have tried setting the processor affinity (--mca 
> mpi_paffinity_alone 1).  I have tried varying the broadcast algorithm (--mca 
> coll_tuned_bcast_algorithm 1-6).  I have also tried excluding (-mca 
> oob_tcp_if_exclude) my eth1 interface (see ifconfig.txt attached) which is 
> not connected to anything.  None of these changed the outcome.
> 
> Any thoughts or suggestions would be appreciated.
> 
> -- 
> "Through nonaction, no action is left undone." --Lao Tzu
> 
> Louis F. Rossi                                ro...@math.udel.edu
> Department of Mathematical Sciences   http://www.math.udel.edu/~rossi
> University of Delaware                        (302) 831-1880 (voice)
> Newark, DE 19716                      (302) 831-4511 (fax)
> 
> <bcast_example.c.gz><ompi_info.txt.gz><ifconfig.txt.gz>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

_________________________________
Matthew MacManes
PhD Candidate
University of California- Berkeley
Museum of Vertebrate Zoology
Phone: 510-495-5833
Lab Website: http://ib.berkeley.edu/labs/lacey
Personal Website: http://macmanes.com/

