Hi Edgar, sorry about the late response. I've been travelling without Internet access.
Well, I took the code Rodrigo provided and modified the client to do the
dup after creating the new inter communicator without 1 process. That
is, I just replaced lines 54-55 in the *removeRank* method with my
if-else block.

I tried this because calling a new create after the first create did not
work, and I thought the problem might be the communicator. So I tried to
duplicate the inter communicator to see if that worked.

Thanks.

Thatyene Ramos.

On Thu, Apr 5, 2012 at 5:10 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:

> so just to confirm, I ran our test suite for inter-communicator
> collective operations and communicator duplication, and everything still
> works. Specifically comm_dup on an intercommunicator is not
> fundamentally broken, but worked for my tests.
>
> Having your code to see what your code precisely does would help me to
> hunt the problem down, since I am otherwise not able to reproduce the
> problem.
>
> Also, which version of Open MPI did you use?
>
> Thanks
> Edgar
>
> On 4/4/2012 3:09 PM, Thatyene Louise Alves de Souza Ramos wrote:
> > Hi Edgar, thank you for the response.
> >
> > Unfortunately, I've tried with and without this option. In both the
> > result was the same... =(
> >
> > On Wed, Apr 4, 2012 at 5:04 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
> >
> >     did you try to start the program with the --mca coll ^inter switch
> >     that I mentioned? Collective dup for intercommunicators should work,
> >     it's probably again the bcast over a communicator of size 1 that is
> >     causing the hang, and you could avoid it with the flag that I
> >     mentioned above.
> >
> >     Also, if you could attach your test code, that would help in hunting
> >     things down.
> >
> >     Thanks
> >     Edgar
> >
> > On 4/4/2012 2:18 PM, Thatyene Louise Alves de Souza Ramos wrote:
> > > Hi there.
> > >
> > > I've made some tests related to the problem reported by Rodrigo.
> > > And I think, I'd rather be wrong, that collective calls like Create
> > > and Dup do not work with Inter communicators. I've tried this in the
> > > client group:
> > >
> > >     MPI::Intercomm tmp_inter_comm;
> > >
> > >     tmp_inter_comm = server_comm.Create(server_comm.Get_group().Excl(1, &rank));
> > >
> > >     if (server_comm.Get_rank() != rank)
> > >         server_comm = tmp_inter_comm.Dup();
> > >     else
> > >         server_comm = MPI::COMM_NULL;
> > >
> > > The server_comm is the original inter communicator with the server
> > > group.
> > >
> > > I've noticed that the program hangs in the Dup call. It seems that the
> > > tmp_inter_comm created without one process still has this process,
> > > because the other processes are waiting for it to call the Dup too.
> > >
> > > What do you think?
> > >
> > > On Wed, Mar 28, 2012 at 6:03 PM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:
> > >
> > >     it just uses a different algorithm which avoids the bcast on a
> > >     communicator of 1 (which is causing the problem here).
> > >
> > >     Thanks
> > >     Edgar
> > >
> > > On 3/28/2012 12:08 PM, Rodrigo Oliveira wrote:
> > > > Hi Edgar,
> > > >
> > > > I tested the execution of my code using the option -mca coll ^inter
> > > > as you suggested and the program worked fine, even when I use 1
> > > > server instance.
> > > >
> > > > What is the modification caused by this parameter? I did not find an
> > > > explanation about the utilization of the module coll inter.
> > > >
> > > > Thanks a lot for your attention and for the solution.
> > > >
> > > > Best regards,
> > > >
> > > > Rodrigo Oliveira
> > > >
> > > > On Tue, Mar 27, 2012 at 1:10 PM, Rodrigo Oliveira
> > > > <rsilva.olive...@gmail.com> wrote:
> > > >
> > > >     Hi Edgar.
> > > >
> > > >     Thanks for the response. I just did not understand why the
> > > >     Barrier works before I remove one of the client processes.
> > > >
> > > >     I tried it with 1 server and 3 clients and it worked properly.
> > > >     After I removed 1 of the clients, it stops working. So, the
> > > >     removal is affecting the functionality of Barrier, I guess.
> > > >
> > > >     Anyone has an idea?
> > > >
> > > >     On Mon, Mar 26, 2012 at 12:34 PM, Edgar Gabriel
> > > >     <gabr...@cs.uh.edu> wrote:
> > > >
> > > >         I do not recall on what the agreement was on how to treat
> > > >         the size=1
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > --
> > Edgar Gabriel
> > Associate Professor
> > Parallel Software Technologies Lab      http://pstl.cs.uh.edu
> > Department of Computer Science          University of Houston
> > Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
> > Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
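
For anyone trying to reproduce this: the if-else block quoted above assumes
the usual MPI semantics, namely that Excl keeps the remaining ranks in their
original order and that Create hands MPI::COMM_NULL to the excluded process,
so only the remaining processes should need to call Dup. Below is a minimal
sketch of that expectation in plain C++, with no MPI calls at all. The names
kUndefined, excl, new_rank and must_call_dup are hypothetical helpers made up
for this illustration, not part of MPI or of Rodrigo's code, and the sketch
says nothing about where the hang actually comes from.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for "no rank in the new group"
// (the excluded process would see MPI::COMM_NULL after Create).
const int kUndefined = -1;

// Model of Excl with a single excluded rank: returns the old ranks that
// remain, in their original order.
std::vector<int> excl(int group_size, int excluded) {
    std::vector<int> remaining;
    for (int r = 0; r < group_size; ++r) {
        if (r != excluded) {
            remaining.push_back(r);
        }
    }
    return remaining;
}

// Rank that old_rank gets in the reduced group, or kUndefined if it was
// the excluded one.
int new_rank(int group_size, int excluded, int old_rank) {
    const std::vector<int> remaining = excl(group_size, excluded);
    for (std::size_t i = 0; i < remaining.size(); ++i) {
        if (remaining[i] == old_rank) {
            return static_cast<int>(i);
        }
    }
    return kUndefined;
}

// Only processes that are still members of the reduced group are expected
// to take part in the following Dup.
bool must_call_dup(int group_size, int excluded, int old_rank) {
    return new_rank(group_size, excluded, old_rank) != kUndefined;
}
```

With 3 clients and rank 1 removed, the sketch leaves old ranks 0 and 2, old
rank 2 becomes rank 1 in the new communicator, and rank 1 is the only one
that skips Dup. If the real program waits for the removed rank inside Dup,
it is behaving differently from this model.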