This problem resurfaced on the user list, so I dug around a bit and think I've 
figured it out using George's test code. The problem lies in the fact that the 
intercomm "merge" function can create a linkage between procs that was not 
reflected anywhere in a modex, and so at least some of the procs in the 
resulting communicator don't know how to talk to some of the new communicator's 
peers.

For example, consider the case where:

1. parent job A comm_spawns a process (job B) - these processes exchange modex 
and can communicate

2. parent job A now comm_spawns another process (job C) - again, these can 
communicate, but the proc in C knows nothing of B

3. do an intercomm merge across the communicators created by the two 
comm_spawns. This puts B and C into the same communicator, but they know 
nothing about how to talk to each other, as they were never involved in any 
exchange of contact info. Hence, collectives on that communicator now fail 
(see the sketch below).

I tried adding all known contact info (not just each proc's own) into the modex, 
but that doesn't resolve the problem. It results in C knowing how to talk to B 
(because A already knew B's contact info when the second comm_spawn was done), 
but B still has no idea how to talk to C, as it didn't participate in the modex 
associated with step 2.

It seems to me that the solution is to have intercomm "merge" actually execute 
a modex to ensure that all procs in the new communicator know how to 
communicate with each other, but I readily admit I might be missing something.

Anyone have thoughts on this? It has come up twice now, so probably something 
worth addressing.


Begin forwarded message:

> From: Ralph Castain <r...@open-mpi.org>
> Date: October 25, 2011 10:08:00 AM MDT
> To: Open MPI Users <us...@open-mpi.org>
> Subject: Re: [OMPI users] Problem-Bug with MPI_Intercomm_create()
> 
> FWIW: I have tracked this problem down. The fix is a little more complicated 
> than I'd like, so I'm going to have to ping some other folks to ensure we 
> concur on the approach before doing something.
> 
> On Oct 25, 2011, at 8:20 AM, Ralph Castain wrote:
> 
>> I still see the test George provided failing on the trunk. I'm unaware of 
>> anyone looking further into it, though, as the prior discussion seemed to 
>> just end.
>> 
>> On Oct 25, 2011, at 7:01 AM, orel wrote:
>> 
>>> Dear all,
>>> 
>>> I have been trying for several days to use advanced MPI-2 features in the 
>>> following scenario:
>>> 
>>> 1) a master code A (of size NPA) spawns (MPI_Comm_spawn()) two slave
>>>    codes B (of size NPB) and C (of size NPC), providing intercomms A-B and A-C;
>>> 2) I create intracomms AB and AC by merging these intercomms;
>>> 3) then I create intercomm AB-C by calling MPI_Intercomm_create(), using 
>>>    intracomm AC as the bridge...
>>> 
>>>    MPI_Comm intercommABC;
>>>    A: MPI_Intercomm_create(intracommAB, 0, intracommAC, NPA, TAG, &intercommABC);
>>>    B: MPI_Intercomm_create(intracommAB, 0, MPI_COMM_NULL, 0, TAG, &intercommABC);
>>>    C: MPI_Intercomm_create(intracommC,  0, intracommAC,  0, TAG, &intercommABC);
>>> 
>>>    In these calls, A0 and C0 act as the local leaders for AB and C 
>>> respectively, while C0 and A0 are the corresponding remote leaders in the 
>>> bridge intracomm AC.
>>> 
>>> 4) MPI_Barrier(intercommABC);
>>> 5) I merge intercomm AB-C into intracomm ABC;
>>> 6) MPI_Barrier(intracommABC);
>>> 
>>> My BUG: These calls succeed, but when I try to use intracommABC for a 
>>> collective communication like MPI_Barrier(), I get the following error:
>>> 
>>> *** An error occurred in MPI_Barrier
>>> *** on communicator
>>> *** MPI_ERR_INTERN: internal error
>>> *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>> 
>>> 
>>> I tried with the Open MPI trunk, 1.5.3, 1.5.4, and MPICH2-1.4.1p1.
>>> 
>>> My code works perfectly if intracomms A, B, and C are obtained by 
>>> MPI_Comm_split() instead of MPI_Comm_spawn()!
>>> 
>>> 
>>> I found the same problem in a previous thread of the OMPI users mailing list:
>>> 
>>> => http://www.open-mpi.org/community/lists/users/2011/06/16711.php
>>> 
>>> Is this bug/problem currently under investigation? :-)
>>> 
>>> I can provide detailed code, but the code George Bosilca supplied in that 
>>> previous thread produces the same error...
>>> 
>>> Thank you for your help...
>>> 
>>> -- 
>>> Aurélien Esnard
>>> University Bordeaux 1 / LaBRI / INRIA (France)
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
