Hi,

I came across the following issue. openmpi-4.0.1 was compiled with:

../openmpi-4.0.1/configure --disable-mpi-fortran --without-cuda
--disable-opencl --with-ucx=/path/to/ucx-1.5.1

Running the attached program (a simple MPI_Send / MPI_Recv pair) gives a
segfault when the message size exceeds 2^30 bytes. I'm seeing the failure
on Debian 10 nodes connected with 1G Ethernet and Mellanox FDR InfiniBand
(ConnectX-3). On another cluster with an Omni-Path interconnect, the same
test passes fine. Both have IPoIB configured.

node0 ~ $ mpiexec -machinefile /tmp/hosts -n 2 --mca btl tcp,self --mca mtl
ofi --mca pml ^ucx  ./a.out 1200000000

Maybe this btl/pml/mtl combination is nonsensical, I don't know (see the
selection check below). What annoys me is that the failure:
 1 - occurs only for large messages, not for smaller test runs;
 2 - is not recoverable via MPI_ERRORS_RETURN.
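
For reference on the first point: if I'm not mistaken, the framework
verbosity parameters should show which pml/mtl components actually get
selected, e.g. the same command as above with the verbose knobs added:

node0 ~ $ mpiexec -machinefile /tmp/hosts -n 2 --mca btl tcp,self --mca mtl
ofi --mca pml ^ucx --mca pml_base_verbose 10 --mca mtl_base_verbose 10
./a.out 1200000000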

Output:

[node0:9791 :0:9791] Caught signal 11 (Segmentation fault: address not
mapped to object at address (nil))
==== backtrace ====
    0  /path/to/ucx-1.5.1/lib/libucs.so.0(+0x1dee0) [0x7f21e2b01ee0]
    1  /path/to/ucx-1.5.1/lib/libucs.so.0(+0x1e188) [0x7f21e2b02188]
===================
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 0 on node node0 exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Running this under gdb, the backtrace only points into the ucs signal
handler; the actual cause of the segfault appears to be here
(ompi/mca/mtl/ofi/mtl_ofi.h:107):
        } else if (OPAL_UNLIKELY(ret == -FI_EAVAIL)) {
            /**
             * An error occured and is being reported via the CQ.
             * Read the error and forward it to the upper layer.
             */
            [...]
            ret = ofi_req->error_callback(&error, ofi_req);

with ofi_req->error_callback being unfortunately NULL.
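
For what it's worth, a guard along these lines right before that call (just
a sketch; the message and the OMPI_ERROR return value are placeholders, the
NULL check is the point) would at least turn the crash into something that
can be reported:

            if (OPAL_UNLIKELY(NULL == ofi_req->error_callback)) {
                /* no per-request error callback registered: report the CQ
                 * error instead of jumping through a NULL pointer
                 * (output text and return value are placeholders) */
                opal_output(0, "%s:%d: OFI CQ reported an error but the "
                            "request has no error_callback\n",
                            __FILE__, __LINE__);
                return OMPI_ERROR;
            }
            ret = ofi_req->error_callback(&error, ofi_req);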

Is it really just me doing something absolutely silly, or is this something
that ought to be fixed?

Best,

E.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char * argv[])
{
    /* default message size: 3 * 2^29 = 1.5 GiB, above the 2^30 threshold */
    size_t chunk = (size_t)3 << 29;
    if (argc > 1)
        chunk = atol(argv[1]);

    int rank;
    int size;

    MPI_Init(&argc, &argv);
    /* errors should be returned to the caller rather than abort the job */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    void * data = malloc(chunk);
    memset(data, 0x42, chunk);
    /* single point-to-point transfer of `chunk` bytes from rank 0 to rank 1;
     * the count is passed as-is and still fits in an int for the sizes tested */
    if (rank == 0) {
        MPI_Send(data, chunk, MPI_BYTE, 1, 0xbeef, MPI_COMM_WORLD);
    } else {
        MPI_Recv(data, chunk, MPI_BYTE, 0, 0xbeef, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    printf("ok\n");
    free(data);
    MPI_Finalize();
    return 0;
}
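
(For completeness: building the reproducer needs nothing beyond the mpicc
wrapper from the same install, e.g.

node0 ~ $ mpicc -O2 -o a.out reproducer.c

where reproducer.c is just a placeholder name for the source above.)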
