Barry,

Sorry, I must have missed this. I really ought to set up a better filter for catching email like this.

I think using NaNs is an excellent solution. In fact, I proposed it here a few months ago :-)
http://lists.mcs.anl.gov/pipermail/petsc-dev/2015-February/016958.html

It ensures that the error is collective (the norm reduction guarantees that every rank gets a NaN), the "error condition" is cleared automatically on the next MatMult, etc. I'm all for it. Should I put it in?

Dmitry.

On Wed, Apr 29, 2015 at 8:26 PM Barry Smith <[email protected]> wrote:

>
> Dmitry,
>
> I haven't heard back from you on this. Any thoughts?
>
> Barry
>
> > On Apr 20, 2015, at 6:23 PM, Barry Smith <[email protected]> wrote:
> >
> > Dmitry,
> >
> > Rather than introducing another whole complexity of flags for indicating domain errors in user functions just do the following.
> >
> > 1) just stick a Nan into the functions result
> > 2) remove the VecValidValues() at the END of routines like MatMult()
> > 3) when Nan or Inf pop up in Krylov methods (which will happen within VecNorm or VecDot() and thus we get free collective knowledge of the problem even if it happened on only one node), generate the appropriate KSP_DIVERGED_NANORINF. This is already handled sometimes (most of the time?), for example in KSPSolve_CG is code
> >
> >     ierr = VecXDot(Z,R,&beta);CHKERRQ(ierr); /* beta <- z'*r */
> >     if (PetscIsInfOrNanScalar(beta)) {
> >       if (ksp->errorifnotconverged) SETERRQ(PetscObjectComm((PetscObject)ksp),PETSC_ERR_NOT_CONVERGED,"KSPSolve has not converged due to Nan or Inf inner product");
> >       else {
> >         ksp->reason = KSP_DIVERGED_NANORINF;
> >         PetscFunctionReturn(0);
> >       }
> >     }
> >
> > 4) SNES already handles failed to converge KSP and
> > 5 ) TS already handles failed to converged SNES; by, for example, cutting the timestep.
> >
> > Barry
