Hi,
I have already seen this FAQ. Nodes in the cluster do not have multiple
IP addresses. One thing I forgot to mention is that the systems in the cluster
do not have static IPs; they get their IP addresses through DHCP.
Also, if there is a print statement (printf("hello world\n");) in the slave, it
is correctly printed on the master's console, but none of the MPI calls work.
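For reference, the test is essentially the following (a minimal sketch of the
master/slave exchange; the value 42 and the message tag 0 are just illustrative):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* master: send one integer to the slave (rank 1) */
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("master sent %d\n", value);
    } else if (rank == 1) {
        /* slave: receive the integer from the master (rank 0) */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("slave received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}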
regards,
Abhishek
I need to make that error string google-able -- I'll add it to the
FAQ. :-)
The problem is likely that you have multiple IP addresses, some of
which are not routable to each other (and therefore fail OMPI's routability
assumptions). Check out these FAQ entries:
http://www.open-mpi.org/faq/?category=tcp#tcp-routability
http://www.open-mpi.org/faq/?category=tcp#tcp-selection
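(FWIW, on Linux errno 113 is EHOSTUNREACH, "No route to host", which points
the same way.) If only one interface on each node is actually reachable from
the other node, you can tell the TCP BTL to use just that interface with the
MCA parameter covered in the second FAQ entry, e.g. (eth0 here is only an
example -- use whatever interface name your nodes actually have):

mpirun --mca btl_tcp_if_include eth0 --prefix /usr/local -np 2 --host 199.63.34.154,199.63.34.36 hello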
Does this help?
On Apr 19, 2007, at 11:07 AM, Babu Bhai wrote:
I have migrated from LAM/MPI to Open MPI. I am not able to
execute a simple MPI program in which the master sends an integer to the slave.
If I execute the code on a single machine, i.e. start 2 instances on the same
machine (mpirun -np 2 hello), it works fine.
If I execute it in the cluster using
mpirun --prefix /usr/local -np 2 --host 199.63.34.154,199.63.34.36 hello
it gives the following error:
"[btl_tcp_endpoint.c:572:mca_btl_tcp_endpoint_complete_connect] connect() failed with errno=113"
I am using openmpi-1.2
regards,
Abhishek
_______________________________________________
users mailing list
users_at_[hidden]
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Jeff Squyres
Cisco Systems