https://trac.mpich.org/projects/mpich/ticket/2038 has the patches.

Jeff
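
For anyone who wants to generalize the a.sh wrapper quoted below, here is a minimal sketch. It assumes the only goal is to keep mpiexec from seeing a nonzero exit code. The function name run_quiet and the ABORT_STATUS_FILE variable are made up for illustration; neither is part of MPICH or Hydra.

```shell
#!/bin/sh
# run_quiet (hypothetical helper): run a command with all of its arguments
# and always report success to the caller, as a.sh does with "true".
run_quiet() {
    status=0
    # Quoting "$@" preserves arguments that contain spaces.
    "$@" || status=$?
    # ABORT_STATUS_FILE is an assumed, optional variable: when it is set,
    # the real exit code is saved there so that it is not lost entirely.
    if [ -n "${ABORT_STATUS_FILE:-}" ]; then
        printf '%s\n' "$status" > "$ABORT_STATUS_FILE"
    fi
    return 0
}

# When run as a script with arguments, wrap the given command.
if [ "$#" -gt 0 ]; then
    run_quiet "$@"
fi
```

Saved as, say, quiet.sh, it would be used like a.sh: mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./quiet.sh ./a.out arg1 arg2. The trade-off is that mpiexec no longer sees the failure in the exit status, so anything that checks it has to read the saved code instead.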

On Fri, Feb 21, 2014 at 3:47 PM, Jeff Hammond <[email protected]> wrote:
> Barry:
>
> Can you tolerate the following workaround for Hydra's error cleanup or
> do you need it to be internal?  I presume you know enough bash to
> generalize a.sh appropriately.
>
> alcfwl181:build jhammond$ cat a.sh
> #!/bin/sh
> $1
> true
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 0 ./a.sh ./a.out
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./a.sh ./a.out
>
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./a.out
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 61123 RUNNING AT alcfwl181.alcf.anl.gov
> =   EXIT CODE: 1
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 0 ./a.out
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 61126 RUNNING AT alcfwl181.alcf.anl.gov
> =   EXIT CODE: 1
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> ===================================================================================
>
> On Fri, Feb 21, 2014 at 3:10 PM, Jeff Hammond <[email protected]> wrote:
>> Barry:
>>
>> Would the following behavior be acceptable to you?  I have only made
>> the changes in MPI but am looking at the process manager now.
>>
>> Jeff
>>
>>
>> # Without the process manager
>>
>> alcfwl181:build jhammond$ export MPIR_CVAR_SUPPRESS_ABORT_MESSAGE=0
>> alcfwl181:build jhammond$ ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>> alcfwl181:build jhammond$ export MPIR_CVAR_SUPPRESS_ABORT_MESSAGE=1
>> alcfwl181:build jhammond$ ./a.out
>>
>> alcfwl181:build jhammond$ unset MPIR_CVAR_SUPPRESS_ABORT_MESSAGE
>> alcfwl181:build jhammond$ ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>
>> # With the process manager
>>
>> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 0 ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 61023 RUNNING AT alcfwl181.alcf.anl.gov
>> =   EXIT CODE: 1
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>> alcfwl181:build jhammond$ mpiexec -n 1 -env MPIR_CVAR_SUPPRESS_ABORT_MESSAGE 1 ./a.out
>>
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 61026 RUNNING AT alcfwl181.alcf.anl.gov
>> =   EXIT CODE: 1
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>> alcfwl181:build jhammond$ mpiexec -n 1 ./a.out
>> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 61032 RUNNING AT alcfwl181.alcf.anl.gov
>> =   EXIT CODE: 1
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>>
>>
>>
>> On Thu, Feb 20, 2014 at 11:33 AM, Barry Smith <[email protected]> wrote:
>>>
>>>    Is there any way to turn off MPICH (and others) printing messages about 
>>> MPI_Abort?  We have already prepared and presented useful error messages to 
>>> the user about the situation and would like to avoid having these 
>>> additional messages printed (that often make the situation look worse than 
>>> it is).
>>>
>>>     Thanks
>>>
>>>    Barry
>>>
>>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0
>>> [cli_0]: aborting job:
>>> application called MPI_Abort(MPI_COMM_WORLD, 56) - process 0
>>>
>>> ===================================================================================
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   EXIT CODE: 56
>>> =   CLEANING UP REMAINING PROCESSES
>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>> ===================================================================================
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> discuss mailing list     [email protected]
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>>
>>
>> --
>> Jeff Hammond
>> [email protected]
>
>
>
> --
> Jeff Hammond
> [email protected]



-- 
Jeff Hammond
[email protected]
