Are you running any firewall software? Sent from my phone. No type good.
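
A rough way to check the firewall theory: errno 113 is EHOSTUNREACH on Linux, and a firewall rule that rejects packets with an ICMP "host prohibited" reply commonly shows up as "No route to host" even when ssh on port 22 works fine. The sketch below uses Python, which is not part of the original setup, and the function name `probe` and its result strings are made up for illustration. It attempts a plain TCP connect and reports which failure mode came back.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Attempt a TCP connect to (host, port) and return a short diagnosis string.

    Hypothetical helper for illustration; the result strings are arbitrary labels.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all within the timeout: packets may be silently dropped.
        return "timeout (packets possibly dropped)"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Same errno (113 on Linux) as in the mpirun log.
            return "no route to host (often a firewall REJECT rule)"
        if exc.errno == errno.ECONNREFUSED:
            # The host answered, but nothing is listening on that port.
            return "refused (host reachable, nothing listening)"
        return f"error: {exc}"
```

For example, run from yethiraj30, compare `probe("yethiraj31", 22)` with a probe of some high port on the same host. The port number is yours to pick; Open MPI's TCP BTL normally uses dynamically assigned ports. If port 22 reports "open" while the high port reports "no route to host", a firewall on yethiraj31 is the likely culprit. A "refused" result on the high port would instead mean the host is reachable and nothing is listening there, which is what you would expect with no firewall in the way.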
On May 25, 2011, at 10:41 PM, "Jagannath Mondal" <jagannath.mon...@gmail.com> wrote:

> Hi,
> I am having a problem in running mpirun over multiple nodes.
> To run a job over two 8-core processors, I generated a hostfile as follows:
>
> yethiraj30 slots=8 max_slots=8
> yethiraj31 slots=8 max_slots=8
>
> These two machines are intra-connected and I have installed openmpi 1.3.3.
> Then If I try to run the replica exchange simulation using the following
> command:
>
> mpirun -np 16 --hostfile hostfile mdrun_4mpi -s topol_.tpr -multi 16 -replex 100 >& log_replica_test
>
> But I find following error and job does not proceed at all :
>
> btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
>
> Here is the full details:
>
> NNODES=16, MYRANK=0, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=1, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=4, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=2, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=6, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=3, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=5, HOSTNAME=yethiraj30
> NNODES=16, MYRANK=7, HOSTNAME=yethiraj30
> [yethiraj30][[22604,1],0][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
> [yethiraj30][[22604,1],4][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
> [yethiraj30][[22604,1],6][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
> [yethiraj30][[22604,1],1][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
> [yethiraj30][[22604,1],3][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
> [yethiraj30][[22604,1],2][btl_tcp_endpoint.c:636:mca_btl_tcp_endpoint_complete_connect] connect() to 192.168.0.31 failed: No route to host (113)
> NNODES=16, MYRANK=10, HOSTNAME=yethiraj31
> NNODES=16, MYRANK=12, HOSTNAME=yethiraj31
>
> I am not sure how to resolve this issue. In general, I can go from one
> machine to another without any problem using ssh. But, when I am trying to
> run openmpi over both the machines, I get this error. Any help will be
> appreciated.
>
> Jagannath
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users