Hi,

I came across the following. openmpi-4.0.1 compiled with:

    ../openmpi-4.0.1/configure --disable-mpi-fortran --without-cuda --disable-opencl --with-ucx=/path/to/ucx-1.5.1

The execution of the attached program (a simple MPI_Send / MPI_Recv pair) gives a segfault when the message size exceeds 2^30 bytes. I'm seeing the failure on debian10 nodes connected with 1G ethernet and Mellanox IB FDR (ConnectX-3). On another cluster with an Omni-Path interconnect, the test passes fine. Both have IPoIB configured.

    node0 ~ $ mpiexec -machinefile /tmp/hosts -n 2 --mca btl tcp,self --mca mtl ofi --mca pml ^ucx ./a.out 1200000000

Maybe this btl/pml/mtl combination is nonsensical, I don't know. What annoys me is that the failure:

1 - occurs only for large messages, not for smaller test runs
2 - is not recoverable via MPI_ERRORS_RETURN

Output:

[node0:9791 :0:9791] Caught signal 11 (Segmentation fault: address not mapped to object at address (nil))
==== backtrace ====
 0 /path/to/ucx-1.5.1/lib/libucs.so.0(+0x1dee0) [0x7f21e2b01ee0]
 1 /path/to/ucx-1.5.1/lib/libucs.so.0(+0x1e188) [0x7f21e2b02188]
===================
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 0 on node node0 exited on
signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Running this under gdb, the backtrace merely points to the ucs signal handler; the actual cause of the segv is here (ompi/mca/mtl/ofi/mtl_ofi.h:107):

    } else if (OPAL_UNLIKELY(ret == -FI_EAVAIL)) {
        /**
         * An error occurred and is being reported via the CQ.
         * Read the error and forward it to the upper layer.
         */
        [...]
        ret = ofi_req->error_callback(&error, ofi_req);

with ofi_req->error_callback unfortunately being NULL.

Is it really just me doing something absolutely silly, or is it something that ought to be fixed?

Best,

E.
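P.S. To see why the fault address is (nil): if error_callback is never set, the call above jumps through a NULL function pointer, i.e. to address 0. Below is a self-contained model of that call path (the struct and function names are mine, not Open MPI's), including the kind of guard I would have expected at that spot:

    #include <stdio.h>
    #include <stdlib.h>

    /* hypothetical stand-ins for the real ofi_req / error structures */
    struct fake_request;
    struct fake_error   { int code; };
    struct fake_request {
        int (*error_callback)(struct fake_error *, struct fake_request *);
    };

    static int report_error(struct fake_request *req, struct fake_error *err)
    {
        if (req->error_callback == NULL) {
            /* without this guard, the call below jumps to address 0,
             * producing exactly a SIGSEGV at (nil) */
            fprintf(stderr, "no error_callback set (error code %d)\n", err->code);
            return -1;
        }
        return req->error_callback(err, req);
    }

    int main(void)
    {
        struct fake_request req = { .error_callback = NULL };
        struct fake_error   err = { .code = 42 };
        /* with the guard: prints a message and exits cleanly;
         * remove the guard to reproduce the segfault at (nil) */
        return report_error(&req, &err) == -1 ? EXIT_SUCCESS : EXIT_FAILURE;
    }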
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

int main(int argc, char * argv[])
{
    size_t chunk = 3UL << 29;  /* default 1.5 GB, above the 2^30 threshold */
    if (argc > 1)
        chunk = atol(argv[1]);
    int rank;
    int size;
    MPI_Init(&argc, &argv);
    /* errors should be returned to the caller, not abort the job */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    void * data = malloc(chunk);
    memset(data, 0x42, chunk);
    if (rank == 0) {
        /* chunk fits in the int count argument for the sizes tested here */
        MPI_Send(data, chunk, MPI_BYTE, 1, 0xbeef, MPI_COMM_WORLD);
    } else {
        MPI_Recv(data, chunk, MPI_BYTE, 0, 0xbeef, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    MPI_Barrier(MPI_COMM_WORLD);
    printf("ok\n");
    MPI_Finalize();
    return 0;
}
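FWIW, since the failure only shows up above 2^30 bytes, I would expect (untested) that splitting the transfer into smaller messages sidesteps it. A sketch of such a workaround; CHUNK_MAX is my own arbitrary limit, not anything taken from Open MPI:

    #include <stddef.h>
    #include <mpi.h>

    #define CHUNK_MAX (1UL << 28)  /* arbitrary per-message limit, well below 2^30 */

    /* transfer 'total' bytes between ranks 0 and 1 in pieces of at most CHUNK_MAX */
    static int transfer(void * buf, size_t total, int rank, MPI_Comm comm)
    {
        char * p = buf;
        size_t left = total;
        while (left > 0) {
            size_t n = left < CHUNK_MAX ? left : CHUNK_MAX;
            int rc;
            if (rank == 0)
                rc = MPI_Send(p, (int) n, MPI_BYTE, 1, 0xbeef, comm);
            else
                rc = MPI_Recv(p, (int) n, MPI_BYTE, 0, 0xbeef, comm, MPI_STATUS_IGNORE);
            if (rc != MPI_SUCCESS)
                return rc;
            p += n;
            left -= n;
        }
        return MPI_SUCCESS;
    }

In the program above, this would replace the MPI_Send / MPI_Recv pair with a single call: transfer(data, chunk, rank, MPI_COMM_WORLD).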