Thanks - that's exactly what I needed! Works as advertised. :o)

On Thu, Nov 16, 2017 at 1:27 PM, Aurelien Bouteiller <boute...@icl.utk.edu> wrote:
> Adam. Your MPI program is incorrect. You need to replace the finalize on
> the process that found the error with MPI_Abort.
>
> On Nov 16, 2017 10:38, "Adam Sylvester" <op8...@gmail.com> wrote:
>
>> I'm using Open MPI 2.1.0 for this but I'm not sure if this is more of an
>> Open MPI-specific implementation question or what the MPI standard
>> guarantees.
>>
>> I have an application which runs across multiple ranks, eventually
>> reaching an MPI_Gather() call. Along the way, if one of the ranks
>> encounters an error, it will report the error to a log, call
>> MPI_Finalize(), and exit with a non-zero return code. If this happens
>> prior to the other ranks making it to the gather, it seems like mpirun
>> notices this and the process ends on all ranks. This is what I want to
>> happen - it's a legitimate error, so all processes should be freed up so
>> the next job can run. It seems like if the other ranks make it into the
>> MPI_Gather() before the one rank reports an error, the other ranks wait in
>> the MPI_Gather() forever.
>>
>> Is there something simple I can do to guarantee that if any process calls
>> MPI_Finalize(), all my ranks terminate?
>>
>> Thanks.
>> -Adam
>>
>> _______________________________________________
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users