Thanks a lot for your explanation, Barry, This makes sense!
Fande, On Fri, Feb 24, 2017 at 1:56 PM, Barry Smith <bsm...@mcs.anl.gov> wrote: > > Fande, > > Yes. Say one is doing a timestepping and using a direct solver. With > time stepping we do not want to necessarily stop (in fact we almost never) > on a failed solve (due to for example a failed factorization). We want the > code to continue up the stack until it gets to the time stepper with the > information that the linear solver failed; then the timestepper can decide > what to do, decrease the time step and try again (common) or switch to > another method etc. The same could be true for linear solvers called from > within optimization routines. > > The reason we do the funny business with putting infinity into the > vector is so that errors that occur on any process get propagated to all > processes during the next inner product or norm computation, otherwise we > would need to have some "side channel" communication which MPI doesn't > really provide to propagate errors to all processes, this would be a lot of > performance crippling communication that would have to take place during > the entire solution process. Plus our approach since it doesn't change the > flow of control cannot result in lost memory etc. Note that C (and MPI) > don't have any exception mechanism that could be used to handle these > "exceptional cases" directly. > > Note that you can use options like -snes_error_if_not_converged or > -ksp_error_if_not_converged to force the error to be set as soon as any > problem with convergence or zero factorization is found (good for > debugging, but not usually for production unless production is solving a > linear system only). > > Barry > > > On Feb 24, 2017, at 2:40 PM, Kong, Fande <fande.k...@inl.gov> wrote: > > > > Hi All, > > > > In MatSolve(), there is a piece of code: > > > > if (mat->errortype) { > > ierr = PetscInfo1(mat,"MatFactorError %D\n",mat->errortype);CHKERRQ( > ierr); > > ierr = VecSetInf(x);CHKERRQ(ierr); > > } else { > > ierr = (*mat->ops->solve)(mat,b,x);CHKERRQ(ierr); > > } > > > > If a direct solver such as LU or superlu fails to create a > factorization, why we do not stop here and throw out an error message here > or earlier? Now we just let solver keep doing garbage computation, and > finally we have a solution like this: > > > > Vec Object: 1 MPI processes > > type: mpi > > Process [0] > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > inf. > > > > Any particular reason to handle the thing in this way? > > > > Fande, > > > > > >