Thomas,
I double-checked this, and:
- there is no problem with MPI_Isend/MPI_Irecv (datatypes are correctly
retained/released; this part is well "hidden" inside some macros)
- there is no such mechanism in libnbc (hence the bug). Depending on
the collective and the algorithm that gets chosen (which depends on the
communicator and the message size), you may or may not hit the bug.
I opened https://github.com/open-mpi/ompi/issues/1304 to track
this issue, and will start working on a proof of concept now.
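To illustrate the retain/release idea discussed above, here is a toy
model in C. It is only a sketch under stated assumptions, not Open MPI's
or libnbc's actual code: all toy_* names are hypothetical, and "destroying"
a datatype is modeled by setting a flag instead of freeing memory. The point
is that the pending operation holds its own reference, so freeing the user's
handle before the wait does not destroy the datatype.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a datatype object (hypothetical, not an Open MPI type). */
typedef struct {
    int  refcount;   /* starts at 1: the reference held by the user's handle */
    bool destroyed;  /* set when the last reference is dropped */
} toy_datatype;

/* Toy stand-in for a pending nonblocking operation. */
typedef struct {
    toy_datatype *dtype;
    bool          complete;
} toy_request;

void toy_type_init(toy_datatype *t) {
    t->refcount  = 1;
    t->destroyed = false;
}

static void toy_retain(toy_datatype *t) {
    assert(!t->destroyed);
    t->refcount++;
}

static void toy_release(toy_datatype *t) {
    assert(t->refcount > 0);
    if (--t->refcount == 0) {
        t->destroyed = true;
    }
}

/* Plays the role of MPI_Type_free: drops only the user's reference. */
void toy_type_free(toy_datatype *t) {
    toy_release(t);
}

/* Plays the role of starting a nonblocking operation: the request
   retains the datatype so it outlives an early toy_type_free(). */
void toy_start(toy_request *r, toy_datatype *t) {
    toy_retain(t);
    r->dtype    = t;
    r->complete = false;
}

/* Plays the role of MPI_Wait: the datatype must still be alive here;
   the request's reference is dropped once the operation is done. */
void toy_wait(toy_request *r) {
    assert(!r->dtype->destroyed);
    r->complete = true;
    toy_release(r->dtype);
    r->dtype = NULL;
}
```

Calling toy_start(), then toy_type_free(), then toy_wait() leaves the
datatype alive until the wait has finished. Without the toy_retain() call in
toy_start(), the same sequence would trip the assertion in toy_wait(), which
is the toy analogue of the crash reported in this thread.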
Cheers,
Gilles
On 1/13/2016 11:00 PM, Gilles Gouaillardet wrote:
Thomas,
Thanks for the report.
At first glance, libnbc (the default module that implements non-blocking
collectives) does not retain/release datatypes, which is why
you ran into this kind of trouble.
I quickly checked the code, and it seems this kind of mechanism is
also missing for MPI_Isend/MPI_Irecv ...
I will investigate this further.
Cheers,
Gilles
On Wednesday, January 13, 2016, Thomas Ponweiser
<thomas.ponwei...@risc-software.at> wrote:
Dear friends of Open MPI,
I am currently facing a problem in connection with MPI_Ibcast and
MPI_Type_free. I've been able to isolate the problem in a
minimalistic test program which I attached.
Maybe some of you can tell me what I am doing wrong or confirm
that this might be a bug in Open MPI (I am using version 1.10.1).
Here is what I am doing:
1) I have two struct types, foo_type and bar_type, as follows:
    typedef struct
    {
        int v[6];
        long l;
    } foo_type;

    typedef struct
    {
        int v[3];
        foo_type foo;
    } bar_type;
2) I am creating corresponding MPI types (foo_mpitype and
bar_mpitype) with MPI_Type_create_struct.
3) I am freeing foo_mpitype.
4) I am broadcasting a variable of type bar_type with MPI_Ibcast
(using count = 1 and datatype = bar_mpitype).
5) I am freeing bar_mpitype.
6) I am waiting for the completion of step 4) with MPI_Wait.
In step 6) I get a segmentation fault within MPI_Wait, but only if
the MPI job is larger than 4 processes.
Testing with MPICH 3.2, the program seems to work just fine.
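As a rough sketch of the inputs that step 2) needs (this is not the attached
test program, which is not reproduced here): MPI_Type_create_struct takes,
per member, a block length, a byte displacement and a member datatype. The
helper names layout2, foo_layout and bar_layout below are hypothetical, and
the code is kept free of MPI calls so it compiles on its own. Presumably the
member datatypes would be MPI_INT and MPI_LONG for foo_mpitype, and MPI_INT
and foo_mpitype for bar_mpitype.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int  v[6];
    long l;
} foo_type;

typedef struct {
    int      v[3];
    foo_type foo;
} bar_type;

/* Hypothetical helper type: block lengths and byte displacements for a
   two-member struct. In real MPI code the displacements would be stored
   as MPI_Aint rather than size_t. */
typedef struct {
    int    blocklens[2];
    size_t displs[2];
} layout2;

/* Members of foo_type: 6 ints at v, then 1 long at l. */
layout2 foo_layout(void) {
    layout2 lay = {
        { 6, 1 },
        { offsetof(foo_type, v), offsetof(foo_type, l) }
    };
    return lay;
}

/* Members of bar_type: 3 ints at v, then 1 nested foo_type at foo. */
layout2 bar_layout(void) {
    layout2 lay = {
        { 3, 1 },
        { offsetof(bar_type, v), offsetof(bar_type, foo) }
    };
    /* Sanity check: the nested struct starts after the int array. */
    assert(lay.displs[1] >= 3 * sizeof(int));
    return lay;
}
```

Using offsetof rather than hand-computed byte offsets keeps the
displacements correct when the compiler inserts padding, for example between
v[3] and foo in bar_type.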
I found out that swapping the steps 5) and 6) helps. But I think
this should not make any difference, according to the description
of MPI_Type_free at
http://www.mpi-forum.org/docs/mpi-1.1/mpi-11-html/node58.html:
"Any communication that is currently using this datatype will
complete normally." And: "Freeing a datatype does not affect any
other datatype that was built from the freed datatype."
(In fact, doing the same thing, that is, MPI_Ibcast followed by
MPI_Type_free followed by MPI_Wait, with foo_type and foo_mpitype
seems to work fine.)
Thanks in advance for your help,
kind regards,
Thomas
Link to this post:
http://www.open-mpi.org/community/lists/users/2016/01/28265.php