I think I ran into this problem before. Put an MPI_Barrier(MPI_COMM_WORLD) before MPI_Finalize in all processes. In my experience, MPI terminates cleanly only when all processes call MPI_Finalize at the same time, so I do this in all my programs.
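(A minimal sketch of that workaround, assuming a C program; the per-rank work is a placeholder. Note this only masks ordering problems — the MPI standard does not require a barrier before MPI_Finalize, and a genuinely incomplete request will still hang.)

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* ... per-rank work (placeholder) ... */

    /* Synchronize every rank before shutdown, so no rank enters
       MPI_Finalize while the others are still communicating. */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
```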
On Mon, Oct 25, 2010 at 7:13 AM, Jack Bryan <dtustud...@hotmail.com> wrote:
> Thanks,
> But I have put a mpi_waitall(request) before
>
>     cout << " I am rank " << rank << " I am before MPI_Finalize()" << endl;
>
> If the above sentence has been printed out, it means that all requests have
> been checked and finished, right?
>
> What may be the possible reasons for the hang?
>
> Any help is appreciated.
>
> Jack
>
> Oct. 25 2010
>
> ------------------------------
> Date: Mon, 25 Oct 2010 05:32:44 -0400
> From: terry.don...@oracle.com
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
>
> So what you are saying is *all* the ranks have entered MPI_Finalize and
> only a subset has exited, per placing prints before and after MPI_Finalize.
> Good. So my guess is that the processes stuck in MPI_Finalize have a prior
> MPI request outstanding that for whatever reason is unable to complete.
> I would first look at all the MPI requests and make sure they completed.
>
> --td
>
> On 10/25/2010 02:38 AM, Jack Bryan wrote:
>
> Thanks.
> I found a problem.
>
> I used:
>
>     cout << " I am rank " << rank << " I am before MPI_Finalize()" << endl;
>     MPI_Finalize();
>     cout << " I am rank " << rank << " I am after MPI_Finalize()" << endl;
>     return 0;
>
> I get the output " I am rank 0 (1, 2, ....) I am before MPI_Finalize() "
> and " I am rank 0 I am after MPI_Finalize() ",
> but the other processes do not print "I am rank ... I am after MPI_Finalize()".
>
> It is weird. The processes have reached the point just before
> MPI_Finalize(); why do they hang there?
>
> Are there better ways to check this?
>
> Any help is appreciated.
>
> Thanks
>
> Jack
>
> Oct. 25 2010
>
> ------------------------------
> From: solarbik...@gmail.com
> Date: Sun, 24 Oct 2010 19:47:54 -0700
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Open MPI program cannot complete
>
> How do you know all processes call mpi_finalize?
> Did you have all of them print out something before they call
> mpi_finalize? I think what Gustavo is getting at is that maybe some MPI
> calls within your snippets hang your program, so some of your processes
> never called mpi_finalize.
>
> On Sun, Oct 24, 2010 at 6:59 PM, Jack Bryan <dtustud...@hotmail.com> wrote:
>
> > Thanks,
> >
> > But my code is too long to be posted.
> >
> > What are the common reasons for this kind of problem?
> >
> > Any help is appreciated.
> >
> > Jack
> >
> > Oct. 24 2010
> >
> > From: g...@ldeo.columbia.edu
> > Date: Sun, 24 Oct 2010 18:09:52 -0400
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] Open MPI program cannot complete
> >
> > Hi Jack
> >
> > Your code snippet is too terse; it doesn't show the MPI calls.
> > It is hard to guess what the problem is this way.
> >
> > Gus Correa
> >
> > On Oct 24, 2010, at 5:43 PM, Jack Bryan wrote:
> >
> > > Thanks for the reply.
> > > But I use mpi_waitall() to make sure that all MPI communications have
> > > been done before a process calls MPI_Finalize() and returns.
> > >
> > > Any help is appreciated.
> > >
> > > Thanks
> > >
> > > Jack
> > >
> > > Oct. 24 2010
> > >
> > > > From: g...@ldeo.columbia.edu
> > > > Date: Sun, 24 Oct 2010 17:31:11 -0400
> > > > To: us...@open-mpi.org
> > > > Subject: Re: [OMPI users] Open MPI program cannot complete
> > > >
> > > > Hi Jack
> > > >
> > > > It may depend on "do some things".
> > > > Does it involve MPI communication?
> > > >
> > > > Also, why not put MPI_Finalize(); return 0; outside the ifs?
> > > >
> > > > Gus Correa
> > > >
> > > > On Oct 24, 2010, at 2:23 PM, Jack Bryan wrote:
> > > >
> > > > > Hi
> > > > >
> > > > > I have a problem with Open MPI.
> > > > >
> > > > > My program has 5 processes.
> > > > >
> > > > > All of them can run MPI_Finalize() and return 0.
> > > > >
> > > > > But the whole program cannot complete.
> > > > >
> > > > > In the MPI cluster job queue, it is still in running status.
> > > > > If I use 1 process to run it, there is no problem.
> > > > >
> > > > > Why?
> > > > >
> > > > > My program:
> > > > >
> > > > >     int main (int argc, char **argv)
> > > > >     {
> > > > >         MPI_Init(&argc, &argv);
> > > > >         MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
> > > > >         MPI_Comm_size(MPI_COMM_WORLD, &mySize);
> > > > >         MPI_Comm world;
> > > > >         world = MPI_COMM_WORLD;
> > > > >
> > > > >         if (myRank == 0)
> > > > >         {
> > > > >             do some things.
> > > > >         }
> > > > >
> > > > >         if (myRank != 0)
> > > > >         {
> > > > >             do some things.
> > > > >             MPI_Finalize();
> > > > >             return 0;
> > > > >         }
> > > > >         if (myRank == 0)
> > > > >         {
> > > > >             MPI_Finalize();
> > > > >             return 0;
> > > > >         }
> > > > >     }
> > > > >
> > > > > Also, some output files contain garbled characters that are not
> > > > > readable. In the 1-process case, the program prints correct
> > > > > results to these output files.
> > > > >
> > > > > Any help is appreciated.
> > > > >
> > > > > Thanks
> > > > >
> > > > > Jack
> > > > >
> > > > > Oct.
24 2010
> > > > >
> > > > > _______________________________________________
> > > > > users mailing list
> > > > > us...@open-mpi.org
> > > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> --
> David Zhang
> University of California, San Diego
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com

--
David Zhang
University of California, San Diego
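Putting together the suggestions in the thread above (Gus's single exit path, and Terry's advice to make sure every MPI request completed), a restructured sketch of the program might look like the following. This is an illustration, not Jack's actual code: the rank-specific work is a placeholder and the `requests` vector is a hypothetical container for any nonblocking requests the real program creates.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char **argv)
{
    int myRank = 0, mySize = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    MPI_Comm_size(MPI_COMM_WORLD, &mySize);

    // Hypothetical: collect every MPI_Request returned by
    // nonblocking calls (MPI_Isend/MPI_Irecv) here.
    std::vector<MPI_Request> requests;

    if (myRank == 0)
    {
        // master work (placeholder)
    }
    else
    {
        // worker work (placeholder)
    }

    // Complete every outstanding nonblocking request before shutdown;
    // a request left incomplete here is a typical cause of a hang
    // inside MPI_Finalize.
    if (!requests.empty())
        MPI_Waitall(static_cast<int>(requests.size()),
                    requests.data(), MPI_STATUSES_IGNORE);

    // Single exit path: every rank reaches the same MPI_Finalize(),
    // so no rank can return without finalizing.
    MPI_Finalize();
    return 0;
}
```

Note that MPI_Waitall returning only proves the requests passed to it completed; a hang can still come from requests that were never tracked, or from a send or receive that has no matching call on another rank.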