I ran into a case where MUMPS numeric factorization returned error code -9 in 
mumps->id.INFOG(1), but A->erroriffailure was false in the following code in 
mumps.c:

PetscErrorCode MatFactorNumeric_MUMPS(Mat F,Mat A,const MatFactorInfo *info)
{
...
  PetscMUMPS_c(mumps);
  if (mumps->id.INFOG(1) < 0) {
    if (A->erroriffailure) {
      SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_LIB,"Error reported by MUMPS in numerical factorization phase: INFOG(1)=%d, INFO(2)=%d\n",mumps->id.INFOG(1),mumps->id.INFO(2));
    } else {
      if (mumps->id.INFOG(1) == -10) { /* numerically singular matrix */
        PetscInfo2(F,"matrix is numerically singular, INFOG(1)=%d, INFO(2)=%d\n",mumps->id.INFOG(1),mumps->id.INFO(2));
        F->factorerrortype = MAT_FACTOR_NUMERIC_ZEROPIVOT;


The code continued to KSPSolve and finished successfully (with a wrong answer). 
The user did not call KSPGetConvergedReason() after KSPSolve. I found that I had to 
either add -ksp_error_if_not_converged or call 
KSPSetErrorIfNotConverged(ksp,PETSC_TRUE) to make the code fail.
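
To make the control flow concrete, here is a small toy model in plain C. It is only a sketch: ToyFactor, toy_factor_numeric, toy_solve and toy_run are hypothetical names made up for illustration, not PETSc or MUMPS functions, and the model assumes that every negative status is recorded, which the quoted excerpt shows only for -10.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not PETSc types or functions. */
typedef struct {
  bool error_if_failure; /* plays the role of A->erroriffailure */
  int  factor_error;     /* plays the role of F->factorerrortype; 0 means no error */
} ToyFactor;

/* Models the quoted branch: a negative status is either raised as an error
   or only recorded, depending on the flag. Returns nonzero on a raised error.
   Assumption: every negative status is recorded, not just -10. */
int toy_factor_numeric(ToyFactor *f, int infog1)
{
  assert(f != NULL);
  f->factor_error = 0;
  if (infog1 < 0) {
    if (f->error_if_failure) return 1; /* hard error, in the spirit of SETERRQ2 */
    f->factor_error = infog1;          /* recorded only; the call itself succeeds */
  }
  return 0;
}

/* Models a solve whose return code is 0 even after a failed factorization.
   The failure is visible only through *reason (negative means failed). */
int toy_solve(const ToyFactor *f, int *reason)
{
  assert(f != NULL && reason != NULL);
  *reason = (f->factor_error != 0) ? -1 : 1;
  return 0;
}

/* Runs both stages; returns the combined return code and fills *reason. */
int toy_run(bool error_if_failure, int infog1, int *reason)
{
  ToyFactor f = { error_if_failure, 0 };
  assert(reason != NULL);
  *reason = 0; /* 0 means the solve stage was never reached */
  if (toy_factor_numeric(&f, infog1)) return 1;
  return toy_solve(&f, reason);
}
```

In this model, toy_run(false, -9, &reason) returns 0 and only reason is negative, which mirrors a run where nobody calls KSPGetConvergedReason(). With the flag set to true, the same status stops the run before the solve stage.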

Is this expected? In my view, it is dangerous. If MUMPS fails in one stage, 
PETSc should not proceed to the next stage because it may hang there.
