https://trac.mpich.org/projects/mpich/ticket/2038 has the patches.
Jeff

On Fri, Feb 21, 2014 at 3:47 PM, Jeff Hammond <[email protected]> wrote:
> Barry:
>
> Can you tolerate the following workaround for Hydra's error cleanup or
> do you need it to be internal?  I presume you know enough bash to
> generalize a.sh appropriately.
>
> alcfwl181:build jhammond$ cat a.sh
> #!/bin/sh
> $1
> true
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 0 ./a.sh ./a.out
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./a.sh ./a.out
>
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./a.out
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 61123 RUNNING AT alcfwl181.alcf.anl.gov
> =   EXIT CODE: 1
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 0 ./a.out
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 61126 RUNNING AT alcfwl181.alcf.anl.gov
> =   EXIT CODE: 1
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
>
> On Fri, Feb 21, 2014 at 3:10 PM, Jeff Hammond <[email protected]> wrote:
>> Barry:
>>
>> Would the following behavior be acceptable to you?  I have only made
>> the changes in MPI but am looking at the process manager now.
>>
>> Jeff
>>
>>
>> # Without the process manager
>>
>> alcfwl181:build jhammond$ export MPIR_CVAR_SUPPRESS_ABORT_MESSAGE=0
>> alcfwl181:build jhammond$ ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>> alcfwl181:build jhammond$ export MPIR_CVAR_SUPPRESS_ABORT_MESSAGE=1
>> alcfwl181:build jhammond$ ./a.out
>>
>> alcfwl181:build jhammond$ unset MPIR_CVAR_SUPPRESS_ABORT_MESSAGE
>> alcfwl181:build jhammond$ ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>
>> # With the process manager
>>
>> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 0 ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 61023 RUNNING AT alcfwl181.alcf.anl.gov
>> =   EXIT CODE: 1
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./a.out
>>
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 61026 RUNNING AT alcfwl181.alcf.anl.gov
>> =   EXIT CODE: 1
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>> alcfwl181:build jhammond$ mpiexec -n 1 ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 61032 RUNNING AT alcfwl181.alcf.anl.gov
>> =   EXIT CODE: 1
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>>
>> On Thu, Feb 20, 2014 at 11:33 AM, Barry Smith <[email protected]> wrote:
>>>
>>> Is there any way to turn off MPICH (and others) printing messages about
>>> MPI_Abort? We have already prepared and presented useful error messages to
>>> the user about the situation and would like to avoid having these
>>> additional messages printed (that often make the situation look worse than
>>> it is)
>>>
>>> Thanks
>>>
>>> Barry
>>>
>>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0
>>> [cli_0]: aborting job:
>>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0
>>>
>>> ===================================================================================
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   EXIT CODE: 56
>>> =   CLEANING UP REMAINING PROCESSES
>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>> ===================================================================================

--
Jeff Hammond
[email protected]
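
One possible way to generalize the a.sh workaround quoted above is sketched here. It is only a sketch, assuming it is acceptable for mpiexec to see a zero exit status from every rank even when the application aborted. The function name run_and_mask and the MASK_LOG variable are hypothetical names chosen for illustration; they are not part of MPICH or Hydra. Unlike a.sh, it forwards all of its arguments rather than only $1, and it can optionally record the real exit status in a file.

```shell
#!/bin/sh
# Hypothetical generalization of a.sh: run the wrapped command with all of
# its arguments, then report success so the launcher sees exit status 0.

run_and_mask() {
    status=0
    # Run the wrapped command; remember its exit status instead of failing.
    "$@" || status=$?
    # Optionally record the real status. MASK_LOG is a hypothetical
    # variable; if it is unset or empty, nothing is logged.
    if [ "$status" -ne 0 ] && [ -n "${MASK_LOG:-}" ]; then
        printf '%s exited with status %d\n' "$1" "$status" >> "$MASK_LOG"
    fi
    # Always report success, as the trailing "true" in a.sh does.
    return 0
}

# When used as a wrapper script, forward the script's own arguments.
if [ "$#" -gt 0 ]; then
    run_and_mask "$@"
fi
```

Saved as, say, wrap.sh (again a hypothetical name) and made executable, it would be used the same way as a.sh in the transcript: mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./wrap.sh ./a.out. The trade-off is that genuine failures are hidden from mpiexec as well, so anything that depends on its exit status would need to read the logged status instead.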
