My test wasn't a benchmark; it was just a little program that calls mpi_init, mpi_barrier, and mpi_finalize.

A test with just mpi_init/finalize works fine, so it looks like we simply hang when trying to communicate. The hang also only happens on multi-node runs.
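For reference, a minimal sketch of that kind of test (assuming the C bindings; the actual program may well differ):

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      MPI_Init(&argc, &argv);       /* init/finalize alone works */
      MPI_Barrier(MPI_COMM_WORLD);  /* hangs here when ranks span nodes */
      MPI_Finalize();
      printf("done\n");
      return 0;
  }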

On Jul 28, 2008, at 10:16 AM, Jeff Squyres wrote:

On Jul 28, 2008, at 12:03 PM, George Bosilca wrote:

Interesting. The self BTL is only used for local communications. I didn't expect any benchmark to execute such communications, but apparently I was wrong. Please let me know the failing test; I will take a look this evening.

FWIW, my manual tests of a simplistic "ring" program work for all combinations (openib, openib+self, openib+self+sm). Shrug.
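(For context, a "ring" test of this sort just passes a token around the ranks in MPI_COMM_WORLD; a minimal sketch, not necessarily the exact program used:)

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int rank, size, token;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      if (rank == 0) {
          /* start the token, then wait for it to come back around */
          token = 42;
          MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
          MPI_Recv(&token, 1, MPI_INT, size - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          printf("rank 0 got the token back\n");
      } else {
          /* receive from the left neighbor, forward to the right */
          MPI_Recv(&token, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          MPI_Send(&token, 1, MPI_INT, (rank + 1) % size, 0, MPI_COMM_WORLD);
      }

      MPI_Finalize();
      return 0;
  }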

But for OSU latency, I found that openib and openib+sm work, but openib+sm+self hangs (same results whether the 2 procs are on the same node or on different nodes). There is no self communication in osu_latency, so something else must be going on.
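(For reference, the BTL combinations above would be selected with command lines along these lines; the exact invocation, binary path, and host layout are assumptions:)

  # works
  mpirun -np 2 --mca btl openib ./osu_latency
  mpirun -np 2 --mca btl openib,sm ./osu_latency

  # hangs
  mpirun -np 2 --mca btl openib,sm,self ./osu_latency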

--
Jeff Squyres
Cisco Systems
