Hello list,

I'm having difficulty running a simple hello-world OpenMPI program over the Myrinet GM interconnect; please see the logs at the end of this email. The error is triggered by this call:

    gm_global_id_to_node_id(gm_btl->port,
                            gm_endpoint->endpoint_addr.global_id,
                            &gm_endpoint->endpoint_addr.node_id)

My hardware setup is identical to the one described here:
http://www.open-mpi.org/community/lists/users/2007/02/2577.php
and I'm using the latest stable release, OpenMPI 1.1.4.

Has anybody encountered this error before? Google returns nothing on it...

Thanks,
Alex.

P.S. Note that the hello-world job does run despite the error, but the HPLinpack benchmark does fail.

Hello World LOG:

# mpirun -np 4 --prefix $MPIHOME -H c0-0,f0-0.local --mca btl gm,self --mca btl_tcp_if_exclude eth1 ~/testdir/hello
[f0-0:25256] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25256] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:31918] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25257] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25257] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:31919] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:31919] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:31918] [btl_gm_proc.c:184] error in converting global to local id
Hello from Alex' MPI test program
Process 1 on f0-0 out of 4
Hello from Alex' MPI test program
Hello from Alex' MPI test program
Process 2 on compute-0-0.local out of 4
Process 0 on compute-0-0.local out of 4
Hello from Alex' MPI test program
Process 3 on f0-0 out of 4

HPLinpack LOG:

# mpirun -np 4 --prefix $MPIHOME -H c0-0,f0-0.local --mca btl gm,self /opt/hpl/openmpi-hpl/bin/xhpl
[f0-0:25443] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:32595] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:32595] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25444] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25444] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:32596] [btl_gm_proc.c:184] error in converting global to local id
[compute-0-0.local:32596] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25443] [btl_gm_proc.c:184] error in converting global to local id
[f0-0:25443] *** An error occurred in MPI_Send
[f0-0:25443] *** on communicator MPI_COMM_WORLD
[f0-0:25443] *** MPI_ERR_INTERN: internal error
[f0-0:25443] *** MPI_ERRORS_ARE_FATAL (goodbye)
[f0-0:25444] *** An error occurred in MPI_Send
[f0-0:25444] *** on communicator MPI_COMM_WORLD
[f0-0:25444] *** MPI_ERR_INTERN: internal error
[f0-0:25444] *** MPI_ERRORS_ARE_FATAL (goodbye)
mpirun noticed that job rank 0 with PID 32595 on node "c0-0" exited on signal 15.
3 additional processes aborted (not shown)