There are two different collective implementations in Open MPI. One is a basic implementation and one is highly optimized. The only problem is that the optimized algorithms were selected based on the network, the number of nodes, and the message size. As you can imagine ... not all networks are the same ... which leads to trouble on some platforms. We're working on a tool that automatically finds the best collective algorithms for your particular configuration.
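In the meantime you can switch between the two by hand. The implementations live in the "coll" MCA framework; assuming the component names in your build are "basic" and "tuned" (check with ompi_info, since this is build-dependent), something like the following selects one or the other:

    # list the collective components and their parameters in this build
    ompi_info --param coll all

    # force the simple algorithms ("self" handles single-process communicators)
    mpirun --mca coll basic,self -np 48 ./a.out

    # force the optimized algorithms
    mpirun --mca coll tuned,self -np 48 ./a.out

Comparing the two on the same run is a quick way to tell whether a slowdown comes from the algorithm selection rather than the TCP transport itself.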

No precise timeline yet. ASAP is the keyword here, but of course we are 3 weeks away from SC, so I don't think much can happen before then. Anyway, if you want, I can add you to the list of volunteers for testing it :)

  Thanks,
    george.

On Oct 26, 2006, at 6:03 PM, Tony Ladd wrote:


1) I think OpenMPI does not use optimal algorithms for collectives. But neither does LAM. For example, the time for an MPI_Allreduce scales as log_2 N, where N is the number of processors. MPICH uses optimized collectives, and its MPI_Allreduce is essentially independent of N. Unfortunately, MPICH has never had a good TCP interface, so it's typically slower overall than LAM or OpenMPI. Are there plans to develop optimized collectives for OpenMPI; if so, is there a timeline?
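For what it's worth, the standard latency/bandwidth cost model makes the difference explicit. This is only a sketch; which schedule each library actually uses is my assumption rather than something I've checked in the source:

    % alpha = per-message latency, beta = per-byte transfer time,
    % m = message size in bytes, N = number of processes.
    %
    % Tree/recursive-doubling allreduce: the full message crosses every
    % one of the log2(N) stages, so the bandwidth term grows with N:
    T_{\text{tree}}(N, m) = \log_2 N \,(\alpha + \beta m)
    %
    % Reduce-scatter + allgather (Rabenseifner-style), the kind of
    % optimized schedule MPICH uses:
    T_{\text{rs+ag}}(N, m) = 2 \log_2 N \,\alpha + 2\,\frac{N-1}{N}\,\beta m

For large N the bandwidth term of the second schedule approaches 2*beta*m, which is why an optimized allreduce looks essentially independent of N at these message sizes.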

2) I have found an additional problem in OpenMPI over TCP: MPI_Allreduce can run extremely slowly on large numbers of processors. Measuring throughput (message size / time) for 48 nodes with 16 KByte messages, for example, I get only 0.12 MBytes/sec. The same code with LAM gets 5.3 MBytes/sec, which is more reasonable. The problem seems to arise for a) more than 16 nodes and b) message sizes in the range 16-32 KBytes. Normally this is the optimum size, so it's odd. Other message sizes are closer to LAM (though typically a little slower). I have run these tests with my own network test, but I can run IMB if necessary; a minimal reproducer along the same lines is sketched below.
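In case it helps, here is a minimal sketch of the kind of measurement I am doing. The warm-up and iteration counts are placeholders, and throughput is defined as message size / average time per call:

    /* allreduce_bw.c -- minimal MPI_Allreduce throughput probe.
     * Build:  mpicc -O2 allreduce_bw.c -o allreduce_bw
     * Run:    mpirun -np 48 ./allreduce_bw
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int msg_bytes = 16 * 1024;           /* 16 KByte messages */
        const int count = msg_bytes / sizeof(double);
        const int warmup = 10, iters = 100;        /* assumed counts */
        int rank, nprocs, i;
        double *sendbuf, *recvbuf, t0, t1, avg;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        sendbuf = malloc(msg_bytes);
        recvbuf = malloc(msg_bytes);
        for (i = 0; i < count; i++)
            sendbuf[i] = (double)i;

        /* warm up so connection setup is not timed */
        for (i = 0; i < warmup; i++)
            MPI_Allreduce(sendbuf, recvbuf, count, MPI_DOUBLE,
                          MPI_SUM, MPI_COMM_WORLD);

        MPI_Barrier(MPI_COMM_WORLD);
        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++)
            MPI_Allreduce(sendbuf, recvbuf, count, MPI_DOUBLE,
                          MPI_SUM, MPI_COMM_WORLD);
        t1 = MPI_Wtime();

        avg = (t1 - t0) / iters;                   /* sec per allreduce */
        if (rank == 0)
            printf("%d procs, %d bytes: %.3f MBytes/sec\n",
                   nprocs, msg_bytes, msg_bytes / avg / 1.0e6);

        free(sendbuf);
        free(recvbuf);
        MPI_Finalize();
        return 0;
    }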

Tony

