On Oct 18, 2007, at 9:24 AM, Marcin Skoczylas wrote:
PML add procs failed
--> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)
Yoinks -- OMPI is determining that it can't use the TCP BTL to reach
other hosts.
I assume this could be because of:
$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.125.17.0    *               255.255.255.0   U     0      0        0 eth1
192.168.12.0    *               255.255.255.0   U     0      0        0 eth1
161.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.125.17.1    0.0.0.0         UG    0      0        0 eth1
192.125 -- is that supposed to be a private address? If so, that's
not really the Right way to do things...
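(As an aside, not from the original thread: whether an address falls in the RFC 1918 private ranges -- 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 -- is easy to check with Python's ipaddress module; the addresses below are just the ones from the routing table above.)

```python
import ipaddress

# 192.125.17.0/24 is not in any RFC 1918 range, so it is a public block:
print(ipaddress.ip_address("192.125.17.10").is_private)   # False

# 192.168.12.0/24, the other subnet on eth1, is a genuine private range:
print(ipaddress.ip_address("192.168.12.10").is_private)   # True
```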
So this is the "narrowly scoped netmasks" case which, as the FAQ says,
is not supported in Open MPI. I asked for a workaround on this
newsgroup some time ago, but no answer until now. So my question is:
what alternative should I choose that will work in such a
configuration?
We haven't put in a workaround because (to be blunt) we either forgot
about it or not enough people have asked for it. Sorry. :-(
It probably wouldn't be too hard to put in an MCA parameter to say
"don't do netmask comparisons; just assume that every IP address is
reachable by every other IP address."
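(Not from the original thread: a rough sketch of the idea, not Open MPI's
actual implementation. The TCP BTL's reachability check amounts to comparing
interface addresses under their netmasks, and the hypothetical
assume_all_reachable flag below plays the role of the proposed MCA
parameter.)

```python
import ipaddress

def reachable(if_a, if_b, assume_all_reachable=False):
    """Decide whether two interfaces (given as 'addr/prefix' strings)
    are considered mutually reachable.

    assume_all_reachable mimics the suggested override: skip the
    netmask comparison and treat every address as reachable."""
    if assume_all_reachable:
        return True
    # Same-subnet test: compare the networks the two interfaces sit on.
    net_a = ipaddress.ip_interface(if_a).network
    net_b = ipaddress.ip_interface(if_b).network
    return net_a == net_b

# Two hosts on the same /24 pass the check:
reachable("192.125.17.10/24", "192.125.17.20/24")  # -> True
# Hosts on different subnets fail it...
reachable("192.125.17.10/24", "10.0.0.5/24")       # -> False
# ...unless the override says to assume everything is reachable:
reachable("192.125.17.10/24", "10.0.0.5/24",
          assume_all_reachable=True)               # -> True
```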
George -- did you mention that you were working on this at one point?
Do you have some experience with other MPI implementations, for
example LAM/MPI?
LAM/MPI should be able to work just fine in this environment; it
doesn't do any kind of reachability computations like Open MPI does
-- it blindly assumes that every MPI process is reachable by every
other MPI process.
--
Jeff Squyres
Cisco Systems