Indeed, you proposed exactly this. I would be happy if you made a 
branch off master that uses this approach.

  Barry

> On Apr 29, 2015, at 9:28 PM, Dmitry Karpeyev <[email protected]> wrote:
> 
> Barry,
> Sorry, I must have missed this -- I really ought to make a better filter for 
> catching email like this.
> I think using NaNs is an excellent solution; in fact, I proposed it a 
> few months ago here :-)
> http://lists.mcs.anl.gov/pipermail/petsc-dev/2015-February/016958.html
> It ensures that the error is collective (the norm reduction guarantees that 
> every rank gets a NaN), and the "error condition" is cleared automatically 
> on the next MatMult, etc.
> I'm all for it.
> Should I put it in?
> 
> Dmitry.
> 
> On Wed, Apr 29, 2015 at 8:26 PM Barry Smith <[email protected]> wrote:
> 
>   Dmitry,
> 
>     I haven't heard back from you on this. Any thoughts?
> 
>   Barry
> 
> > On Apr 20, 2015, at 6:23 PM, Barry Smith <[email protected]> wrote:
> >
> >
> >  Dmitry,
> >
> >   Rather than introducing a whole new layer of flags to indicate 
> > domain errors in user functions, just do the following.
> >
> >   1) just stick a NaN into the function's result
> >   2) remove the VecValidValues() at the END of routines like MatMult()
> >   3) when NaN or Inf pops up in Krylov methods (which will happen within 
> > VecNorm() or VecDot(), so we get free collective knowledge of the 
> > problem even if it happened on only one node), generate the appropriate 
> > KSP_DIVERGED_NANORINF. This is already handled sometimes (most of the 
> > time?); for example, KSPSolve_CG contains this code:
> >    ierr = VecXDot(Z,R,&beta);CHKERRQ(ierr);         /*  beta <- z'*r       */
> >    if (PetscIsInfOrNanScalar(beta)) {
> >      if (ksp->errorifnotconverged) SETERRQ(PetscObjectComm((PetscObject)ksp),PETSC_ERR_NOT_CONVERGED,"KSPSolve has not converged due to Nan or Inf inner product");
> >      else {
> >        ksp->reason = KSP_DIVERGED_NANORINF;
> >        PetscFunctionReturn(0);
> >      }
> >    }
> >
> >   4) SNES already handles a KSP that failed to converge, and
> >   5) TS already handles a SNES that failed to converge, for example by 
> > cutting the timestep.
> >
> >  Barry
> >
> >
> 
