Ralph, did you get a chance to run the ibm/final test to see if these
changes fixed the problem? I just rebuilt the trunk and tried it and I
still get an exit status of 0 back. I will run it again to make sure I
have not made a mistake.
Rolf
On 04/26/10 23:43, Ralph Castain wrote:
Okay, this should finally be fixed. See the commit message for r23045
for an explanation.
It really wasn't anything in the cited changeset that caused the
problem. The root cause is that $#@$ abort file we dropped in the
session dir to indicate you called MPI_Abort vs trying to thoroughly
clean up. Been biting us in the butt for years - finally removed it.
On Apr 26, 2010, at 12:58 PM, Rolf vandeVaart wrote:
The ibm/final test does not call MPI_Abort directly. It is calling
MPI_Barrier after MPI_Finalize is called, which is a no-no. This is
detected and eventually the library calls ompi_mpi_abort(). This is
very similar to MPI_Abort() which ultimately calls ompi_mpi_abort as
well. So, I guess I am saying for all intents and purposes, it calls
MPI_Abort.
Rolf
On 04/26/10 14:41, Ralph Castain wrote:
I'll try to keep it in mind as I continue the errmgr work. I gather these tests
all call MPI_Abort?
On Apr 26, 2010, at 12:31 PM, Rolf vandeVaart wrote:
With our MTT testing we have noticed a problem that has cropped up in the
trunk. There are some tests that are supposed to return a non-zero status
because they are getting errors, but are instead returning 0. This problem
does not exist in r23022 but does exist in r23023.
One can use the ibm/final test to reproduce the problem. An example of a
passing case followed by a failing case is shown below.
Ralph, do you want me to open a ticket on this? Or do you just want to take a
look? I am asking you since you did the r23023 commit.
Rolf
TRUNK VERSION r23022:
[rolfv@burl-ct-x2200-6 environment]$ mpirun -np 1 -mca btl sm,self final
**************************************************************************
This test should generate a message about MPI is either not initialized or
has already been finialized.
ERRORS ARE EXPECTED AND NORMAL IN THIS PROGRAM!!
**************************************************************************
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[burl-ct-x2200-6:6072] Abort after MPI_FINALIZE completed successfully; not
able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------
[rolfv@burl-ct-x2200-6 environment]$ echo $status
1
[rolfv@burl-ct-x2200-6 environment]$
TRUNK VERSION r23023:
[rolfv@burl-ct-x2200-6 environment]$ mpirun -np 1 -mca btl sm,self final
**************************************************************************
This test should generate a message about MPI is either not initialized or
has already been finialized.
ERRORS ARE EXPECTED AND NORMAL IN THIS PROGRAM!!
**************************************************************************
*** The MPI_Barrier() function was called after MPI_FINALIZE was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[burl-ct-x2200-6:4089] Abort after MPI_FINALIZE completed successfully; not
able to guarantee that all other processes were killed!
[rolfv@burl-ct-x2200-6 environment]$ echo $status
0
[rolfv@burl-ct-x2200-6 environment]$
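The difference between the two runs above is only the exit status, so the
check can be scripted. The following is a minimal sketch in Python, not the
way MTT actually does it: the helper names exit_status and
returns_nonzero are made up for illustration, and the mpirun command in the
comment is simply the one from the transcripts.

```python
import subprocess
import sys


def exit_status(cmd):
    """Run cmd (a list of arguments) and return its exit status.

    Output is discarded because only the status matters for this check.
    """
    result = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode


def returns_nonzero(cmd):
    """Return True if cmd exits with a non-zero status, as an
    error test such as ibm/final is expected to do."""
    status = exit_status(cmd)
    if status == 0:
        # This is the r23023 behavior reported in the thread.
        print("FAIL: expected a non-zero exit status, got 0", file=sys.stderr)
    return status != 0


# Hypothetical usage, with the command from the transcripts:
#   returns_nonzero(["mpirun", "-np", "1", "-mca", "btl", "sm,self", "final"])
```

With the r23022 build this check would be expected to return True (status 1),
and with r23023 it would return False (status 0).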
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel