Hello,

I have been getting intermittent memory corruption and segmentation faults
while using MPI_Ialltoallw in Open MPI v4.0.3. Valgrind also reports an invalid
read in the "ompi_coll_base_retain_datatypes_w" function defined in
"coll_base_util.c".

With a debug build of Open MPI, an assertion also fails:
base/coll_base_util.c:274: ompi_coll_base_retain_datatypes_w: Assertion 
`OPAL_OBJ_MAGIC_ID == ((opal_object_t *) (stypes[i]))->obj_magic_id' failed.
I think it is related to the communicator I am using. It is created with a 2D
MPI_Cart_create, from which I get 2 subcommunicators with MPI_Cart_sub, and in
some cases one of the dimensions is 1. In "ompi_coll_base_retain_datatypes_w",
the neighbour count is used to set "rcount" and "scount" at line 267. In my
failing case it returns 2 for both, but I believe it should be 1, since that is
the comm size and the length of the arrays I have allocated for sendtypes and
recvtypes. The invalid reads then happen at lines 274 and 280.

Regards,
Damian






<<attachment: mpi-info.zip>>
