Thanks, the problem is still there.

I used:

    cout << "In main(), I am rank " << myRank << " , I am before MPI_Barrier(MPI_COMM_WORLD). \n\n" << endl ;
    MPI_Barrier(MPI_COMM_WORLD);
    cout << "In main(), I am rank " << myRank << " , I am before MPI_Finalize() and after MPI_Barrier(MPI_COMM_WORLD). \n\n" << endl ;
    MPI_Finalize();
    cout << "In main(), I am rank " << myRank << " , I am after MPI_Finalize(), then return 0 . \n\n" << endl ;
    return 0 ;

Only process 0 returns. The other processes are still stuck in MPI_Finalize().

Any help is appreciated.

JACK

Oct. 25 2010
From: solarbik...@gmail.com
List-Post: users@lists.open-mpi.org
Date: Mon, 25 Oct 2010 08:27:19 -0700
To: us...@open-mpi.org
Subject: Re: [OMPI users] Open MPI program cannot complete

I think I got this problem before. Put a mpi_barrier(mpi_comm_world) before mpi_finalize for all processes. For me, mpi terminates nicely only when all processes call mpi_finalize at the same time. So I do it for all my programs.

On Mon, Oct 25, 2010 at 7:13 AM, Jack Bryan <dtustud...@hotmail.com> wrote:

Thanks,

But, I have put a mpi_waitall(request) before

    cout << " I am rank " << rank << " I am before MPI_Finalize()" << endl;

If the above line has been printed out, it means that all requests have been checked and finished, right?

What are the possible reasons for the hang?

Any help is appreciated.

Jack

Oct. 25 2010

List-Post: users@lists.open-mpi.org
Date: Mon, 25 Oct 2010 05:32:44 -0400
From: terry.don...@oracle.com
To: us...@open-mpi.org
Subject: Re: [OMPI users] Open MPI program cannot complete

So what you are saying is *all* the ranks have entered MPI_Finalize and only a subset has exited, judging by the prints placed before and after MPI_Finalize. Good. So my guess is that the processes stuck in MPI_Finalize have a prior MPI request outstanding that for whatever reason is unable to complete. So I would first look at all the MPI requests and make sure they completed.

--td

On 10/25/2010 02:38 AM, Jack Bryan wrote:

thanks

I found a problem:

I used:

    cout << " I am rank " << rank << " I am before MPI_Finalize()" << endl;
    MPI_Finalize();
    cout << " I am rank " << rank << " I am after MPI_Finalize()" << endl;
    return 0;

I can get the output " I am rank 0 (1, 2, ....) I am before MPI_Finalize() ".

and " I am rank 0 I am after MPI_Finalize() "

But, the other processes do not print out "I am rank ... I am after MPI_Finalize()" .

It is weird. The processes have reached the point just before MPI_Finalize(), so why are they hanging there?

Are there other better ways to check this ?
Any help is appreciated.

thanks

Jack

Oct. 25 2010

From: solarbik...@gmail.com
Date: Sun, 24 Oct 2010 19:47:54 -0700
To: us...@open-mpi.org
Subject: Re: [OMPI users] Open MPI program cannot complete

How do you know all processes call mpi_finalize? Did you have all of them print out something before they call mpi_finalize? I think what Gustavo is getting at is that maybe you had some MPI calls within your snippets that hang your program, so some of your processes never called mpi_finalize.

On Sun, Oct 24, 2010 at 6:59 PM, Jack Bryan <dtustud...@hotmail.com> wrote:

Thanks,

But, my code is too long to be posted.

What are the common reasons for this kind of problem?

Any help is appreciated.

Jack

Oct. 24 2010

> From: g...@ldeo.columbia.edu
> Date: Sun, 24 Oct 2010 18:09:52 -0400
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
>
> Hi Jack
>
> Your code snippet is too terse, doesn't show the MPI calls.
> It is hard to guess what is the problem this way.
>
> Gus Correa
> On Oct 24, 2010, at 5:43 PM, Jack Bryan wrote:
>
> > Thanks for the reply.
> > But, I use mpi_waitall() to make sure that all MPI communications have been done before a process calls MPI_Finalize() and returns.
> >
> > Any help is appreciated.
> >
> > thanks
> >
> > Jack
> >
> > Oct. 24 2010
> >
> > > From: g...@ldeo.columbia.edu
> > > Date: Sun, 24 Oct 2010 17:31:11 -0400
> > > To: us...@open-mpi.org
> > > Subject: Re: [OMPI users] Open MPI program cannot complete
> > >
> > > Hi Jack
> > >
> > > It may depend on "do some things".
> > > Does it involve MPI communication?
> > >
> > > Also, why not put MPI_Finalize();return 0 outside the ifs?
> > >
> > > Gus Correa
> > >
> > > On Oct 24, 2010, at 2:23 PM, Jack Bryan wrote:
> > >
> > > > Hi
> > > >
> > > > I got a problem of open MPI.
> > > >
> > > > My program has 5 processes.
> > > >
> > > > All of them can run MPI_Finalize() and return 0.
> > > >
> > > > But, the whole program cannot be completed.
> > > >
> > > > In the MPI cluster job queue, it is still in running status.
> > > >
> > > > If I use 1 process to run it, no problem.
> > > >
> > > > Why ?
> > > >
> > > > My program:
> > > >
> > > > int main (int argc, char **argv)
> > > > {
> > > >
> > > >     MPI_Init(&argc, &argv);
> > > >     MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
> > > >     MPI_Comm_size(MPI_COMM_WORLD, &mySize);
> > > >     MPI_Comm world;
> > > >     world = MPI_COMM_WORLD;
> > > >
> > > >     if (myRank == 0)
> > > >     {
> > > >         do some things.
> > > >     }
> > > >
> > > >     if (myRank != 0)
> > > >     {
> > > >         do some things.
> > > >         MPI_Finalize();
> > > >         return 0 ;
> > > >     }
> > > >     if (myRank == 0)
> > > >     {
> > > >         MPI_Finalize();
> > > >         return 0;
> > > >     }
> > > >
> > > > }
> > > >
> > > > And, some output files get wrong codes, which are not readable.
> > > > In the 1-process case, the program can print correct results to these output files.
> > > >
> > > > Any help is appreciated.
> > > >
> > > > thanks
> > > >
> > > > Jack
> > > >
> > > > Oct. 24 2010
> > > >
> > > > _______________________________________________
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users

--
David Zhang
University of California, San Diego

--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com