Fande,
Yes. Say one is doing a timestepping and using a direct solver. With time
stepping we do not want to necessarily stop (in fact we almost never) on a
failed solve (due to for example a failed factorization). We want the code to
continue up the stack until it gets to the time stepper with the information
that the linear solver failed; then the timestepper can decide what to do,
decrease the time step and try again (common) or switch to another method etc.
The same could be true for linear solvers called from within optimization
routines.
The reason we do the funny business with putting infinity into the vector
is so that errors that occur on any process get propagated to all processes
during the next inner product or norm computation, otherwise we would need to
have some "side channel" communication which MPI doesn't really provide to
propagate errors to all processes, this would be a lot of performance crippling
communication that would have to take place during the entire solution process.
Plus our approach since it doesn't change the flow of control cannot result in
lost memory etc. Note that C (and MPI) don't have any exception mechanism that
could be used to handle these "exceptional cases" directly.
Note that you can use options like -snes_error_if_not_converged or
-ksp_error_if_not_converged to force the error to be set as soon as any problem
with convergence or zero factorization is found (good for debugging, but not
usually for production unless production is solving a linear system only).
Barry
> On Feb 24, 2017, at 2:40 PM, Kong, Fande <[email protected]> wrote:
>
> Hi All,
>
> In MatSolve(), there is a piece of code:
>
> if (mat->errortype) {
> ierr = PetscInfo1(mat,"MatFactorError %D\n",mat->errortype);CHKERRQ(ierr);
> ierr = VecSetInf(x);CHKERRQ(ierr);
> } else {
> ierr = (*mat->ops->solve)(mat,b,x);CHKERRQ(ierr);
> }
>
> If a direct solver such as LU or superlu fails to create a factorization, why
> we do not stop here and throw out an error message here or earlier? Now we
> just let solver keep doing garbage computation, and finally we have a
> solution like this:
>
> Vec Object: 1 MPI processes
> type: mpi
> Process [0]
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
> inf.
>
> Any particular reason to handle the thing in this way?
>
> Fande,
>
>