Hello,

I'm currently working on getting a new RDMA device to work with Open MPI.
During my work I reached a point where the program crashes without leaving enough information for me to find the root cause. I tried running gdb on the command line (mpirun -np 2 ring_c), but that didn't help, since it just debugged mpirun and not ring_c. Then I tried the --debug option, but it fails because it only works with a limited list of debuggers, which unfortunately are not available in the Fedora repo.

This is (part of) the crash log I see when running ring_c:

[fc28-2:01086] *** Process received signal ***
[fc28-2:01086] Signal: Segmentation fault (11)
[fc28-2:01086] Signal code: Invalid permissions (2)
[fc28-2:01086] Failing at address: 0x211dfc0
[fc28-2:01087] *** Process received signal ***
[fc28-2:01087] Signal: Segmentation fault (11)
[fc28-2:01087] Signal code: Invalid permissions (2)
[fc28-2:01087] Failing at address: 0x2656fc0
[fc28-2:01087] [fc28-2:01086] [ 0] /lib64/libpthread.so.0(+0x11fc0)[0x7f2dd089afc0]
[fc28-2:01086] [ 1] [0x211dfc0]
[fc28-2:01086] *** End of error message ***
[ 0] /lib64/libpthread.so.0(+0x11fc0)[0x7fcfc7553fc0]
[fc28-2:01087] [ 1] [0x2656fc0]
[fc28-2:01087] *** End of error message ***
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------

And here are the command line parameters I'm using:

btl_base_verbose = 100
btl_openib_verbose = 100
btl = openib,self
btl_openib_receive_queues = P,4096,8,6,4
btl_openib_cpc_include = rdmacm

I'd appreciate any help here.

Thanks,
Yuval
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel