Hi folks!

I am trying to launch the *MPI master branch* with srun (a simple send/recv
program, see attachment) using *openib*, but unfortunately I get a
*segfault*.

Below is my workflow.
1) I configured ompi/master with the following line:

./autogen.sh && ./configure --prefix=$PWD/install --with-openib --with-pmi && make -j3 && make install -j3

2) exported (along with PATH and LD_LIBRARY_PATH) the OMPI_MCA_btl variable:

export OMPI_MCA_btl=self,openib

3) and launched with the following line:

mpicc ~/usefull_tests/mpi_init.c && srun -n 2 ./a.out
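
To double-check step 2, here is a minimal sketch of a check that could be
called at the top of main() to see whether each task started by srun actually
sees "openib" in OMPI_MCA_btl. The helper names btl_list_contains and
env_btl_has are hypothetical (they are not part of Open MPI), and the sketch
assumes the variable holds a plain comma-separated list as exported above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI API): returns 1 if `name` appears as a
 * whole entry in the comma-separated `list`, 0 otherwise. */
int btl_list_contains(const char *list, const char *name)
{
    size_t name_len;

    if (list == NULL || name == NULL || *name == '\0')
        return 0;
    name_len = strlen(name);

    while (*list != '\0') {
        /* Length of the current entry, up to the next comma or the end. */
        size_t entry_len = strcspn(list, ",");

        if (entry_len == name_len && strncmp(list, name, name_len) == 0)
            return 1;

        list += entry_len;
        if (*list == ',')
            list++;            /* skip the separator */
    }
    return 0;
}

/* Returns 1 if the OMPI_MCA_btl environment variable seen by this process
 * lists `name`, 0 if it does not or if the variable is unset. */
int env_btl_has(const char *name)
{
    return btl_list_contains(getenv("OMPI_MCA_btl"), name);
}
```

Printing the result of env_btl_has("openib") from each task would only show
whether the setting from step 2 reaches both processes; it says nothing about
the cause of the crash itself.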


Eventually I get the following error:

srun: error: mir6: task 1: Segmentation fault (core dumped)
srun: Terminating job step 17309.2


with the following backtrace:

#0  0x00007f856c47b1d0 in ?? ()
#1  <signal handler called>
#2  0x00007f856d12d721 in rml_recv_cb (status=0, process_name=0x2027c50, buffer=0x7f857084ed10, tag=102, cbdata=0x0)
    at connect/btl_openib_connect_oob.c:823
#3  0x00007f857553ffb0 in orte_rml_base_process_msg (fd=-1, flags=4, cbdata=0x2027b80)
    at base/rml_base_msg_handlers.c:172
#4  0x00007f857522a6c6 in event_process_active_single_queue (base=0x1ed6c60, activeq=0x1ec9210)
    at event.c:1367
#5  0x00007f857522a93e in event_process_active (base=0x1ed6c60)
    at event.c:1437
#6  0x00007f857522afbc in opal_libevent2021_event_base_loop (base=0x1ed6c60, flags=1)
    at event.c:1645
#7  0x00007f85754ccc19 in orte_progress_thread_engine (obj=0x7f857577cf20)
    at runtime/orte_init.c:180
#8  0x0000003b5a6077f1 in start_thread () from /lib64/libpthread.so.0
#9  0x0000003b59ee570d in clone () from /lib64/libc.so.6



Can anybody please help me find the reason for this failure?

P.S. I use Red Hat Enterprise Linux Server release 6.2 with InfiniBand
cards.

Thanks in advance,
Victor Kocheganov.

#include <mpi.h>     /* provides the basic MPI definitions and types */
#include <stdio.h>

int main(int argc, char **argv) {

    int rank, size, i;
    int buffer[10];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (size < 2)
    {
        printf("Please run with two processes.\n");
        fflush(stdout);
        MPI_Finalize();
        return 0;
    }
    if (rank == 0)
    {
        for (i=0; i<10; i++)
            buffer[i] = i;
        MPI_Send(buffer, 10, MPI_INT, 1, 123, MPI_COMM_WORLD);
    }
    if (rank == 1)
    {
        for (i=0; i<10; i++)
            buffer[i] = -1;
        MPI_Recv(buffer, 10, MPI_INT, 0, 123, MPI_COMM_WORLD, &status);
        for (i=0; i<10; i++)
        {
            if (buffer[i] != i)
                printf("Error: buffer[%d] = %d but is expected to be %d\n", i, buffer[i], i);
        }
        fflush(stdout);
    }
    MPI_Finalize();
    return 0;
}
