On Apr 14, 2011, at 4:02 AM, N.M. Maclaren wrote:

> ...  It's hopeless, and whatever you do will be wrong for many
> people.  ...

I think that sums it up pretty well.  :-)

It does seem a little strange that the scenario you describe implies that one 
process calls MPI_Finalize loooong before the others do.  Specifically, the 
user is concerned with tying up resources after one process has called 
Finalize -- which implies that the others may continue on for a while.  It's 
not invalid, of course, but it is a little unusual.

I see two possibilities here:

1. have the user delay calling MPI_Finalize in the application until it can do 
the test that indicates that the rest of the job should be aborted (i.e., so 
that it can still call MPI_Abort if it wants to).  Don't forget that an 
implementation is allowed to block in MPI_Finalize until all processes call 
MPI_Finalize, anyway.
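A minimal sketch of what option 1 looks like from the application side (check_result() is a hypothetical stand-in for whatever test the app uses to decide the job should die):

```c
#include <mpi.h>
#include <stdlib.h>

/* Hypothetical placeholder: return nonzero if the whole job should
 * be aborted.  The real application would put its actual test here. */
static int check_result(void)
{
    return 0;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* ... application work ... */

    if (check_result()) {
        /* MPI is still initialized at this point, so MPI_Abort is
         * still legal -- this is the whole point of delaying
         * MPI_Finalize. */
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Only finalize once we know we don't need to abort.  Keep in
     * mind that an implementation is allowed to block here until
     * every process has called MPI_Finalize. */
    MPI_Finalize();
    return EXIT_SUCCESS;
}
```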

2. add an MCA param and/or orterun CLI option to abort a job if an MPI process 
terminates after MPI_Finalize with a nonzero exit status.

Just my $0.02.  :-)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/