If I run a "standard" example, such as snes/examples/tutorials/ex19 it does 
not have this "double" business with fgmres/gmres 

  I think what you are seeing is related to where and how the shell 
preconditioner is working. By default GMRES uses left preconditioning (you can 
  change it to right with -ksp_pc_side right) while FMGRES has to use right 
preconditioner (it makes no mathematical sense with left preconditioning). 

  My first guess is if you use GMRES with -ksp_pc_side right you will see the 
double "business". 

  Right preconditioned GMRES/FGMRES solves    A B y= b  so an application of 
the operator is  A B but then once y is compute it needs to compute
x = B y which is another application of your preconditioner, hence requires 
another matrix free operation. The h is different for the two multiplies 
because 
the a is different. I don't see that this would happen for every KSP iteration, 
I think there will be one "extra" for each KSP solve. I you truly see two
MatMult_MFFD() for each GMRES/FGMRES I would run in the debugger, put a break 
point in MatMult_MFFD() to see exactly where each one is called.

   From the manual page

 MATMFFD_WP - Implements an alternative approach for computing the differencing 
parameter
        h used with the finite difference based matrix-free Jacobian.  This code
        implements the strategy of M. Pernice and H. Walker:

      h = error_rel * sqrt(1 + ||U||) / ||a||

      Notes:
        1) || U || does not change between linear iterations so is reused
        2) In GMRES || a || == 1 and so does not need to ever be computed 
except at restart
           when it is recomputed.



> On Aug 19, 2019, at 7:51 PM, Tang, Qi <[email protected]> wrote:
> 
> Thanks again, Barry. I am testing more based on your suggestions. 
> 
> One thing I do not understand is when I use -mat_mffd_type wp -mat_mffd_err 
> 1e-3 -ksp_type fgmres. It computes "h" of JFNK twice sequentially in each KSP 
> iteration. For instance, the output in info looks like
> ...
> [0] MatMult_MFFD(): Current differencing parameter: 1.953103182078e-09
> [0] MatMult_MFFD(): Current differencing parameter: 6.054449182838e-01
> ...
> I found it called MatMult_MFFD twice in a row. Do you happen to know why 
> petsc calls MatMult_MFFD twice when using fgmre? I also do not understand the 
> relationship between these two h. They are very off (because apparently ||a|| 
> is very different).
> 
> On the other hand, h is only computed once when I use gmres. The h there is 
> clearly scaled with mat_mffd_err. That is easy to understand. However, I 
> think I need to use fgmres because my preconditioner is not quite linear. 
> 
> BTW, my snes view is:
> 
> SNES Object: () 16 MPI processes
>   type: newtonls
>   maximum iterations=20, maximum function evaluations=10000
>   tolerances: relative=0.0001, absolute=0., solution=0.
>   total number of linear solver iterations=70
>   total number of function evaluations=149
>   norm schedule ALWAYS
>   Eisenstat-Walker computation of KSP relative tolerance (version 3)
>     rtol_0=0.3, rtol_max=0.9, threshold=0.1
>     gamma=0.9, alpha=1.5, alpha2=1.5
>   SNESLineSearch Object: () 16 MPI processes
>     type: bt
>       interpolation: cubic
>       alpha=1.000000e-04
>     maxstep=1.000000e+08, minlambda=1.000000e-12
>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, 
> lambda=1.000000e-08
>     maximum iterations=40
>   KSP Object: () 16 MPI processes
>     type: fgmres
>       restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization 
> with no iterative refinement
>       happy breakdown tolerance 1e-30
>     maximum iterations=10000, initial guess is zero
>     tolerances:  relative=0.0564799, absolute=1e-50, divergence=10000.
>     right preconditioning
>     using UNPRECONDITIONED norm type for convergence test
>   PC Object: () 16 MPI processes
>     type: shell
>       JFNK preconditioner
>       No information available on the mfem::Solver
>       Number of preconditioners created by the factory 8
>     linear system matrix followed by preconditioner matrix:
>     Mat Object: 16 MPI processes
>       type: mffd
>       rows=787968, cols=787968
>         Matrix-free approximation:
>           err=0.001 (relative error in function evaluation)
>           Using wp compute h routine
>               Does not compute normU
>     Mat Object: 16 MPI processes
>       type: nest
>       rows=787968, cols=787968
>         Matrix object: 
>           type=nest, rows=3, cols=3 
>           MatNest structure: 
>           (0,0) : type=mpiaij, rows=262656, cols=262656 
>           (0,1) : type=mpiaij, rows=262656, cols=262656 
>           (0,2) : NULL 
>           (1,0) : type=mpiaij, rows=262656, cols=262656 
>           (1,1) : type=mpiaij, rows=262656, cols=262656 
>           (1,2) : NULL 
>           (2,0) : type=mpiaij, rows=262656, cols=262656 
>           (2,1) : NULL 
>           (2,2) : type=mpiaij, rows=262656, cols=262656 
> From: Smith, Barry F. <[email protected]>
> Sent: Thursday, August 15, 2019 3:30 AM
> To: Tang, Qi <[email protected]>
> Cc: [email protected] <[email protected]>
> Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK
>  
> 
> 
> > On Aug 15, 2019, at 12:36 AM, Tang, Qi <[email protected]> wrote:
> > 
> > Thanks, it works. snes_mf_jorge works for me.
> 
>   Great. 
> 
> > It appears to compute h in every ksp. 
> 
>    Each matrix vector product or each KSPSolve()? From the code looks like 
> each matrix-vector product. 
> 
> > 
> > Without -snes_mf_jorge, it is not working. For some reason, it only 
> > computes h once, but that h is bad. My gmres residual is not decaying. 
> 
> > 
> > Indeed, the noise in my function becomes larger when I refine the mesh. I 
> > think it makes sense as I use the same time step for different meshes (that 
> > is the goal of the preconditioning). However, even when the algorithm is 
> > working, sqrt(noise) is much less than the good mat_mffd_err I previously 
> > found (10^-6 vs 10^-3). I do not understand why. 
> 
>   I have no explanation. The details of the code that computes err are 
> difficult to trace exactly; would take a while. Perhaps there is some 
> parameter in there that is too "conservative"?
> 
> > 
> > Although snes_mf_jorge is working, it is very expensive, as it has to 
> > evaluate F many times when estimating h. Unfortunately, to achieve the 
> > nonlinearity, I have to assemble some operators inside my F. There seem no 
> > easy solutions. 
> > 
> > I will try to compute h multiple times without using snes_mf_jorge. But let 
> > me know if you have other suggestions. Thanks!
> 
>    Yes, it seems to me computing the err less often would be a way to make 
> the code go faster. I looked at the code more closely and noticed a couple of 
> things. 
> 
>     if (ctx->jorge) {
>       ierr = 
> SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr);
> 
>       /* Use the Brown/Saad method to compute h */
>     } else {
>       /* Compute error if desired */
>       ierr = SNESGetIterationNumber(snes,&iter);CHKERRQ(ierr);
>       if ((ctx->need_err) || ((ctx->compute_err_freq) && 
> (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) {
>         /* Use Jorge's method to compute noise */
>         ierr = 
> SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr);
> 
>         ctx->error_rel = PetscSqrtReal(noise);
> 
>         ierr = PetscInfo3(snes,"Using Jorge's noise: noise=%g, 
> sqrt(noise)=%g, 
> h_more=%g\n",(double)noise,(double)ctx->error_rel,(double)h);CHKERRQ(ierr);
> 
>         ctx->compute_err_iter = iter;
>         ctx->need_err         = PETSC_FALSE;
>       }
> 
> So if jorge is set it uses the jorge algorithm for each matrix multiple to 
> compute a new err and h. If jorge is not set but -snes_mf_compute_err is set 
> then it computes a new err and h depending on the parameters  (you can run 
> with -info and grep for noise to see the PetscInfo lines printed (maybe add 
> iter to the output to see how often it is recomputed)
> 
> if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != 
> iter) && (!((iter-1)%ctx->compute_err_freq)))) {
> 
> so the compute_err_freq determines when the err and h are recomputed. The 
> logic is a bit strange so I cannot decipher exactly how often it is 
> recomputing. I'm guess at the first Newton iteration and then some number of 
> iterations later. 
> 
>    You could try to rig it so that it is at every new Newton step (this would 
> mean when computing f(x + h a) - f(x) for each new x it will recompute I 
> think). 
> 
> 
>    A more "research type" approach to try to reduce the work to a reasonable 
> level would be to keep the same err until "things start to go bad" and then 
> recompute it. But how to measure "when things start to go bad?" It is related 
> to GMRES stagnating, so you could for example track the convergence of GMRES 
> and if it is less than expected, kill the linear solve, reset the computation 
> of err and then start the KSP at the same point as before. But this could be 
> terribly expensive since all the steps in the most recent KSP are lost. 
> 
>   Another possibility is to assume that the err is valid in a certain size 
> ball around the current x. When x + ha or the new x is outside that ball then 
> recompute the err. But how to choose the ball size and is there a way to 
> adjust the ball size depending on how the computations proceed. For example 
> if everything is going great you could slowly increase the size of the ball 
> and hope for the best; but how to detect if the ball got too big (as above 
> but terribly expensive)? 
> 
>    Track the size of the err each time it is computed. If it stays about the 
> same for a small number of times just freeze it for a while at that value? 
> But again how to determine when it is no longer good?
> 
>   Just a few wild thoughts, I really have no solid ideas on how to reduce the 
> work requirements.
> 
>    Barry
> 
> 
>    
> 
>    
> 
> > 
> > Qi
> > 
> >  
> > From: Smith, Barry F. <[email protected]>
> > Sent: Wednesday, August 14, 2019 10:57 PM
> > To: Tang, Qi <[email protected]>
> > Cc: [email protected] <[email protected]>
> > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK
> >  
> > 
> > 
> > > On Aug 14, 2019, at 9:41 PM, Tang, Qi <[email protected]> wrote:
> > > 
> > > Thanks for the help, Barry. I tired both ds and wp, and again it depends 
> > > on if I could find the correct parameter set. It is getting harder as I 
> > > refine the mesh.
> > > 
> > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private 
> > > or SNESDiffParameterCompute_More in mfem. But it looks like these 
> > > functions are not in linked in petscsnes.h. How could I call them?
> > 
> >    They may not be listed in petscsnes.h but I think they should be in the 
> > library.  You can just stick the prototypes for the functions anywhere you 
> > need them for now.
> > 
> > 
> >  You should be able to use
> > 
> >   ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 
> > or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr);
> > 
> > then this info is passed with 
> > 
> >   if (snes->mf) {
> >     ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, 
> > snes->mf_version);CHKERRQ(ierr);
> >   }
> > 
> > this routine has
> > 
> >   if (version == 1) {
> >     ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr);
> >     ierr = 
> > MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr);
> >     ierr = MatSetFromOptions(J);CHKERRQ(ierr);
> >   } else if (version == 2) {
> >     if (!snes->vec_func) 
> > SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be 
> > called first");
> > #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && 
> > !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16)
> >     ierr = 
> > SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr);
> > #else
> > 
> > and this routine has
> > 
> >   ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr);
> >   ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr);
> >   ierr = VecGetSize(x,&n);CHKERRQ(ierr);
> >   ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr);
> >   ierr = MatCreate(comm,J);CHKERRQ(ierr);
> >   ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr);
> >   ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr);
> >   ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr);
> >   ierr = MatShellSetOperation(*J,MATOP_MULT,(void 
> > (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr);
> >   ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void 
> > (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr);
> >   ierr = MatShellSetOperation(*J,MATOP_VIEW,(void 
> > (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr);
> >   ierr = MatSetUp(*J);CHKERRQ(ierr);
> > 
> > 
> > 
> > 
> > 
> > 
> > > 
> > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it 
> > > needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I 
> > > could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside 
> > > SNESSetJacobian would be easier to try?
> > 
> > If you use the flag -snes_mf_noise_file filename when it runs it will save 
> > all the noise information it computes along the way to that file (yes it is 
> > crude and doesn't match the PETSc Viewer/monitor style but it should work).
> > 
> >   Thus I think you can use it and get all the possible monitoring 
> > information without actually writing any code. Just
> > 
> > -snes_mf_version 2
> > -snes_mf_noise_file  filename
> > 
> > 
> > 
> >    Barry
> > 
> > > 
> > > Thanks again,
> > > Qi
> > > 
> > > From: Smith, Barry F. <[email protected]>
> > > Sent: Tuesday, August 13, 2019 9:07 PM
> > > To: Tang, Qi <[email protected]>
> > > Cc: [email protected] <[email protected]>
> > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK
> > >  
> > > 
> > > 
> > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users 
> > > > <[email protected]> wrote:
> > > > 
> > > > Hi,
> > > > I am using JFNK, inexact Newton and a shell "physics-based" 
> > > > preconditioning to solve some multiphysics problems. I have been 
> > > > playing with mat_mffd_err, and it gives me some results I do not fully 
> > > > understand. 
> > > > 
> > > > I believe the default value of mat_mffd_err is 10^-8 for double 
> > > > precision, which seem too large for my problems. When I test a problem 
> > > > essentially still in the linear regime, I expect my converged 
> > > > "unpreconditioned resid norm of KSP" should be identical to "SNES 
> > > > Function norm" (Should I?). This is exactly what I found if I could 
> > > > find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it 
> > > > happens, the whole algorithm works as expected. When those two norms 
> > > > are off, the inexact Newton becomes very inefficient. For instance, it 
> > > > may take many ksp iterations to converge but the snes norm is only 
> > > > reduced slightly.
> > > > 
> > > > According to the manual, mat_mffd_err should be "square root of 
> > > > relative error in function evaluation". But is there a simple way to 
> > > > estimate it? 
> > > 
> > >   First a related note: there are two different algorithms that PETSc 
> > > provides for computing the factor h using the err parameter. They can be 
> > > set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some 
> > > people have better luck with one or the other for their problem. 
> > > 
> > > 
> > >   There is some code in PETSc to compute what is called the "noise" of 
> > > the function which in theory leads to a better err value. For example if 
> > > say the last four digits of your function are "noise" (that is 
> > > meaningless stuff) which is possible for complicated multiphysics 
> > > problems  due to round off in the function evaluations then you should 
> > > use an err that is 2 digits bigger than the default (note this is just 
> > > the square root of the "function epsilon" instead of the machine epsilon, 
> > > because the "function epsilon" is much larger than the machine epsilon). 
> > > 
> > >    We never had a great model for how to hook the noise computation code 
> > > up into SNES so it is a bit disconnected, we should revisit this. Perhaps 
> > > you will find it useful and we can figure out together how to hook it in 
> > > cleanly for use. 
> > > 
> > >    The code that computes and reports the noise is in the directory 
> > > src/snes/interface/noise 
> > > 
> > >    You can use this version with the option -snes_mf_version 2 (version 1 
> > > is the normal behavior) 
> > > 
> > >    The code has the ability to print to a file or the screen information 
> > > about the noise it is finding, maybe with command line options, you'll 
> > > have to look directly at the code to see how. 
> > > 
> > >    I don't like the current setup because if you use the noise based 
> > > computation of h you have to use SNESMatrixFreeMult2_Private() which is a 
> > > slightly different routine for doing the actually differencing than the 
> > > two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it 
> > > can't use the standard differencing routines but call the noise routine 
> > > to compute h (just requires some minor code refactoring). Also I don't 
> > > see an automatic way to just compute the noise of a function at a bunch 
> > > of points independent of actually using the noise to compute and use h. 
> > > It is sort of there in the routine SNESDiffParameterCompute_More() but 
> > > that routine is not documented or user friendly. 
> > > 
> > >    I would start by rigging up calls to SNESDiffParameterCompute_More() 
> > > to see what it says the noise of your function is, based on your hand 
> > > determined values optimal values for err it should be roughly the square 
> > > of them. Then if this makes sense you can trying using the 
> > > -snes_mf_version 2  code to see if it helps the matrix free multiple 
> > > behave consistently well for your problem by adaptively computing the err 
> > > and using that in computing h. 
> > > 
> > >    Good luck and I'd love to hear back from you if it helps and 
> > > suggestions/code for making this more integrated with SNES so it is 
> > > easier to use. For example perhaps we want a function SNESComputeNoise() 
> > > that uses SNESDiffParameterCompute_More() to compute and report the noise 
> > > for a range of values of u. 
> > > 
> > >    Barry
> > > 
> > > 
> > > 
> > > 
> > > 
> > > > 
> > > > Is there anything else I could possibly tune in this context? 
> > > > 
> > > > The discretization is through mfem and I use standard H1 for my 
> > > > problem. 
> > > > 
> > > > Thanks,
> > > > Qi

Reply via email to