Hi folks! I am trying to launch the *Open MPI master branch* with srun (a simple send/recv program, see the attached source) using *openib*, but unfortunately I get a *segfault*.
Below is my workflow.

1) I configured ompi/master with the following line:

   ./autogen.sh && ./configure --prefix=$PWD/install --with-openib --with-pmi && make -j3 && make install -j3

2) I exported (along with PATH and LD_LIBRARY_PATH) the OMPI_MCA_btl variable:

   export OMPI_MCA_btl=self,openib

3) and launched with the following line:

   mpicc ~/usefull_tests/mpi_init.c && srun -n 2 ./a.out

Eventually I get the following error:

   srun: error: mir6: task 1: Segmentation fault (core dumped)
   srun: Terminating job step 17309.2

with the following backtrace:

   #0  0x00007f856c47b1d0 in ?? ()
   #1  <signal handler called>
   #2  0x00007f856d12d721 in rml_recv_cb (status=0, process_name=0x2027c50, buffer=0x7f857084ed10, tag=102, cbdata=0x0) at connect/btl_openib_connect_oob.c:823
   #3  0x00007f857553ffb0 in orte_rml_base_process_msg (fd=-1, flags=4, cbdata=0x2027b80) at base/rml_base_msg_handlers.c:172
   #4  0x00007f857522a6c6 in event_process_active_single_queue (base=0x1ed6c60, activeq=0x1ec9210) at event.c:1367
   #5  0x00007f857522a93e in event_process_active (base=0x1ed6c60) at event.c:1437
   #6  0x00007f857522afbc in opal_libevent2021_event_base_loop (base=0x1ed6c60, flags=1) at event.c:1645
   #7  0x00007f85754ccc19 in orte_progress_thread_engine (obj=0x7f857577cf20) at runtime/orte_init.c:180
   #8  0x0000003b5a6077f1 in start_thread () from /lib64/libpthread.so.0
   #9  0x0000003b59ee570d in clone () from /lib64/libc.so.6

Can anybody please help with the reason for this failure?

P.S. I use Red Hat Enterprise Linux Server release 6.2 with InfiniBand cards.

Thanks in advance,
Victor Kocheganov.
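In case it helps, below is a sketch of the extra diagnostics I can run to narrow this down; the paths and the core file name are just examples from my setup, not verified output:

   # Confirm the openib BTL was actually built into this install
   # (assumes $PWD/install/bin is first in PATH)
   ompi_info | grep -i btl

   # Try the same binary through mpirun instead of srun,
   # to see whether the crash is specific to the PMI/srun path
   mpirun -np 2 --mca btl self,openib ./a.out

   # Get a fuller backtrace from the core dump
   # (core file name is hypothetical; depends on core_pattern)
   gdb ./a.out core.12345 -ex "thread apply all bt" -ex quit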
#include "mpi.h" /* PROVIDES THE BASIC MPI DEFINITION AND TYPES */ #include "stdio.h" int main(int argc, char **argv) { int rank, size, i; int buffer[10]; MPI_Status status; MPI_Init(&argc, &argv); MPI_Comm_size(MPI_COMM_WORLD, &size); MPI_Comm_rank(MPI_COMM_WORLD, &rank); if (size < 2) { printf("Please run with two processes.\n");fflush(stdout); MPI_Finalize(); return 0; } if (rank == 0) { for (i=0; i<10; i++) buffer[i] = i; MPI_Send(buffer, 10, MPI_INT, 1, 123, MPI_COMM_WORLD); } if (rank == 1) { for (i=0; i<10; i++) buffer[i] = -1; MPI_Recv(buffer, 10, MPI_INT, 0, 123, MPI_COMM_WORLD, &status); for (i=0; i<10; i++) { if (buffer[i] != i) printf("Error: buffer[%d] = %d but is expected to be %d\n", i, buffer[i], i); } fflush(stdout); } MPI_Finalize(); return 0; }