I made a mistake that I turned on -ksp_monitor_true_residual in ~/.petscrc on my cluster. The extra MatMult_MFFD comes from the ksp monitor. Sorry about the confusion.
Thanks again for your help, Barry. We will do more tests based on your suggestion. ________________________________ From: Smith, Barry F. <[email protected]> Sent: Monday, August 19, 2019 10:04 PM To: Tang, Qi <[email protected]> Cc: [email protected] <[email protected]> Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK If I run a "standard" example, such as snes/examples/tutorials/ex19 it does not have this "double" business with fgmres/gmres I think what you are seeing is related to where and how the shell preconditioner is working. By default GMRES uses left preconditioning (you can change it to right with -ksp_pc_side right) while FMGRES has to use right preconditioner (it makes no mathematical sense with left preconditioning). My first guess is if you use GMRES with -ksp_pc_side right you will see the double "business". Right preconditioned GMRES/FGMRES solves A B y= b so an application of the operator is A B but then once y is compute it needs to compute x = B y which is another application of your preconditioner, hence requires another matrix free operation. The h is different for the two multiplies because the a is different. I don't see that this would happen for every KSP iteration, I think there will be one "extra" for each KSP solve. I you truly see two MatMult_MFFD() for each GMRES/FGMRES I would run in the debugger, put a break point in MatMult_MFFD() to see exactly where each one is called. From the manual page MATMFFD_WP - Implements an alternative approach for computing the differencing parameter h used with the finite difference based matrix-free Jacobian. This code implements the strategy of M. Pernice and H. Walker: h = error_rel * sqrt(1 + ||U||) / ||a|| Notes: 1) || U || does not change between linear iterations so is reused 2) In GMRES || a || == 1 and so does not need to ever be computed except at restart when it is recomputed. > On Aug 19, 2019, at 7:51 PM, Tang, Qi <[email protected]> wrote: > > Thanks again, Barry. I am testing more based on your suggestions. > > One thing I do not understand is when I use -mat_mffd_type wp -mat_mffd_err > 1e-3 -ksp_type fgmres. It computes "h" of JFNK twice sequentially in each KSP > iteration. For instance, the output in info looks like > ... > [0] MatMult_MFFD(): Current differencing parameter: 1.953103182078e-09 > [0] MatMult_MFFD(): Current differencing parameter: 6.054449182838e-01 > ... > I found it called MatMult_MFFD twice in a row. Do you happen to know why > petsc calls MatMult_MFFD twice when using fgmre? I also do not understand the > relationship between these two h. They are very off (because apparently ||a|| > is very different). > > On the other hand, h is only computed once when I use gmres. The h there is > clearly scaled with mat_mffd_err. That is easy to understand. However, I > think I need to use fgmres because my preconditioner is not quite linear. > > BTW, my snes view is: > > SNES Object: () 16 MPI processes > type: newtonls > maximum iterations=20, maximum function evaluations=10000 > tolerances: relative=0.0001, absolute=0., solution=0. > total number of linear solver iterations=70 > total number of function evaluations=149 > norm schedule ALWAYS > Eisenstat-Walker computation of KSP relative tolerance (version 3) > rtol_0=0.3, rtol_max=0.9, threshold=0.1 > gamma=0.9, alpha=1.5, alpha2=1.5 > SNESLineSearch Object: () 16 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: () 16 MPI processes > type: fgmres > restart=50, using Classical (unmodified) Gram-Schmidt Orthogonalization > with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.0564799, absolute=1e-50, divergence=10000. > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: () 16 MPI processes > type: shell > JFNK preconditioner > No information available on the mfem::Solver > Number of preconditioners created by the factory 8 > linear system matrix followed by preconditioner matrix: > Mat Object: 16 MPI processes > type: mffd > rows=787968, cols=787968 > Matrix-free approximation: > err=0.001 (relative error in function evaluation) > Using wp compute h routine > Does not compute normU > Mat Object: 16 MPI processes > type: nest > rows=787968, cols=787968 > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=262656, cols=262656 > (0,1) : type=mpiaij, rows=262656, cols=262656 > (0,2) : NULL > (1,0) : type=mpiaij, rows=262656, cols=262656 > (1,1) : type=mpiaij, rows=262656, cols=262656 > (1,2) : NULL > (2,0) : type=mpiaij, rows=262656, cols=262656 > (2,1) : NULL > (2,2) : type=mpiaij, rows=262656, cols=262656 > From: Smith, Barry F. <[email protected]> > Sent: Thursday, August 15, 2019 3:30 AM > To: Tang, Qi <[email protected]> > Cc: [email protected] <[email protected]> > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > On Aug 15, 2019, at 12:36 AM, Tang, Qi <[email protected]> wrote: > > > > Thanks, it works. snes_mf_jorge works for me. > > Great. > > > It appears to compute h in every ksp. > > Each matrix vector product or each KSPSolve()? From the code looks like > each matrix-vector product. > > > > > Without -snes_mf_jorge, it is not working. For some reason, it only > > computes h once, but that h is bad. My gmres residual is not decaying. > > > > > Indeed, the noise in my function becomes larger when I refine the mesh. I > > think it makes sense as I use the same time step for different meshes (that > > is the goal of the preconditioning). However, even when the algorithm is > > working, sqrt(noise) is much less than the good mat_mffd_err I previously > > found (10^-6 vs 10^-3). I do not understand why. > > I have no explanation. The details of the code that computes err are > difficult to trace exactly; would take a while. Perhaps there is some > parameter in there that is too "conservative"? > > > > > Although snes_mf_jorge is working, it is very expensive, as it has to > > evaluate F many times when estimating h. Unfortunately, to achieve the > > nonlinearity, I have to assemble some operators inside my F. There seem no > > easy solutions. > > > > I will try to compute h multiple times without using snes_mf_jorge. But let > > me know if you have other suggestions. Thanks! > > Yes, it seems to me computing the err less often would be a way to make > the code go faster. I looked at the code more closely and noticed a couple of > things. > > if (ctx->jorge) { > ierr = > SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); > > /* Use the Brown/Saad method to compute h */ > } else { > /* Compute error if desired */ > ierr = SNESGetIterationNumber(snes,&iter);CHKERRQ(ierr); > if ((ctx->need_err) || ((ctx->compute_err_freq) && > (ctx->compute_err_iter != iter) && (!((iter-1)%ctx->compute_err_freq)))) { > /* Use Jorge's method to compute noise */ > ierr = > SNESDiffParameterCompute_More(snes,ctx->data,U,a,&noise,&h);CHKERRQ(ierr); > > ctx->error_rel = PetscSqrtReal(noise); > > ierr = PetscInfo3(snes,"Using Jorge's noise: noise=%g, > sqrt(noise)=%g, > h_more=%g\n",(double)noise,(double)ctx->error_rel,(double)h);CHKERRQ(ierr); > > ctx->compute_err_iter = iter; > ctx->need_err = PETSC_FALSE; > } > > So if jorge is set it uses the jorge algorithm for each matrix multiple to > compute a new err and h. If jorge is not set but -snes_mf_compute_err is set > then it computes a new err and h depending on the parameters (you can run > with -info and grep for noise to see the PetscInfo lines printed (maybe add > iter to the output to see how often it is recomputed) > > if ((ctx->need_err) || ((ctx->compute_err_freq) && (ctx->compute_err_iter != > iter) && (!((iter-1)%ctx->compute_err_freq)))) { > > so the compute_err_freq determines when the err and h are recomputed. The > logic is a bit strange so I cannot decipher exactly how often it is > recomputing. I'm guess at the first Newton iteration and then some number of > iterations later. > > You could try to rig it so that it is at every new Newton step (this would > mean when computing f(x + h a) - f(x) for each new x it will recompute I > think). > > > A more "research type" approach to try to reduce the work to a reasonable > level would be to keep the same err until "things start to go bad" and then > recompute it. But how to measure "when things start to go bad?" It is related > to GMRES stagnating, so you could for example track the convergence of GMRES > and if it is less than expected, kill the linear solve, reset the computation > of err and then start the KSP at the same point as before. But this could be > terribly expensive since all the steps in the most recent KSP are lost. > > Another possibility is to assume that the err is valid in a certain size > ball around the current x. When x + ha or the new x is outside that ball then > recompute the err. But how to choose the ball size and is there a way to > adjust the ball size depending on how the computations proceed. For example > if everything is going great you could slowly increase the size of the ball > and hope for the best; but how to detect if the ball got too big (as above > but terribly expensive)? > > Track the size of the err each time it is computed. If it stays about the > same for a small number of times just freeze it for a while at that value? > But again how to determine when it is no longer good? > > Just a few wild thoughts, I really have no solid ideas on how to reduce the > work requirements. > > Barry > > > > > > > > > > Qi > > > > > > From: Smith, Barry F. <[email protected]> > > Sent: Wednesday, August 14, 2019 10:57 PM > > To: Tang, Qi <[email protected]> > > Cc: [email protected] <[email protected]> > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > On Aug 14, 2019, at 9:41 PM, Tang, Qi <[email protected]> wrote: > > > > > > Thanks for the help, Barry. I tired both ds and wp, and again it depends > > > on if I could find the correct parameter set. It is getting harder as I > > > refine the mesh. > > > > > > So I try to use SNESDefaultMatrixFreeCreate2, SNESMatrixFreeMult2_Private > > > or SNESDiffParameterCompute_More in mfem. But it looks like these > > > functions are not in linked in petscsnes.h. How could I call them? > > > > They may not be listed in petscsnes.h but I think they should be in the > > library. You can just stick the prototypes for the functions anywhere you > > need them for now. > > > > > > You should be able to use > > > > ierr = PetscOptionsInt("-snes_mf_version","Matrix-Free routines version 1 > > or 2","None",snes->mf_version,&snes->mf_version,0);CHKERRQ(ierr); > > > > then this info is passed with > > > > if (snes->mf) { > > ierr = SNESSetUpMatrixFree_Private(snes, snes->mf_operator, > > snes->mf_version);CHKERRQ(ierr); > > } > > > > this routine has > > > > if (version == 1) { > > ierr = MatCreateSNESMF(snes,&J);CHKERRQ(ierr); > > ierr = > > MatMFFDSetOptionsPrefix(J,((PetscObject)snes)->prefix);CHKERRQ(ierr); > > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > > } else if (version == 2) { > > if (!snes->vec_func) > > SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"SNESSetFunction() must be > > called first"); > > #if !defined(PETSC_USE_COMPLEX) && !defined(PETSC_USE_REAL_SINGLE) && > > !defined(PETSC_USE_REAL___FLOAT128) && !defined(PETSC_USE_REAL___FP16) > > ierr = > > SNESDefaultMatrixFreeCreate2(snes,snes->vec_func,&J);CHKERRQ(ierr); > > #else > > > > and this routine has > > > > ierr = VecDuplicate(x,&mfctx->w);CHKERRQ(ierr); > > ierr = PetscObjectGetComm((PetscObject)x,&comm);CHKERRQ(ierr); > > ierr = VecGetSize(x,&n);CHKERRQ(ierr); > > ierr = VecGetLocalSize(x,&nloc);CHKERRQ(ierr); > > ierr = MatCreate(comm,J);CHKERRQ(ierr); > > ierr = MatSetSizes(*J,nloc,n,n,n);CHKERRQ(ierr); > > ierr = MatSetType(*J,MATSHELL);CHKERRQ(ierr); > > ierr = MatShellSetContext(*J,mfctx);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_MULT,(void > > (*)(void))SNESMatrixFreeMult2_Private);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_DESTROY,(void > > (*)(void))SNESMatrixFreeDestroy2_Private);CHKERRQ(ierr); > > ierr = MatShellSetOperation(*J,MATOP_VIEW,(void > > (*)(void))SNESMatrixFreeView2_Private);CHKERRQ(ierr); > > ierr = MatSetUp(*J);CHKERRQ(ierr); > > > > > > > > > > > > > > > > > > Also, could I call SNESDiffParameterCompute_More in snes_monitor? But it > > > needs "a" in (F(u + ha) - F(u)) /h as the input. So I am not sure how I > > > could properly use it. Maybe use SNESDefaultMatrixFreeCreate2 inside > > > SNESSetJacobian would be easier to try? > > > > If you use the flag -snes_mf_noise_file filename when it runs it will save > > all the noise information it computes along the way to that file (yes it is > > crude and doesn't match the PETSc Viewer/monitor style but it should work). > > > > Thus I think you can use it and get all the possible monitoring > > information without actually writing any code. Just > > > > -snes_mf_version 2 > > -snes_mf_noise_file filename > > > > > > > > Barry > > > > > > > > Thanks again, > > > Qi > > > > > > From: Smith, Barry F. <[email protected]> > > > Sent: Tuesday, August 13, 2019 9:07 PM > > > To: Tang, Qi <[email protected]> > > > Cc: [email protected] <[email protected]> > > > Subject: Re: [petsc-users] How to choose mat_mffd_err in JFNK > > > > > > > > > > > > > On Aug 13, 2019, at 7:27 PM, Tang, Qi via petsc-users > > > > <[email protected]> wrote: > > > > > > > > Hi, > > > > I am using JFNK, inexact Newton and a shell "physics-based" > > > > preconditioning to solve some multiphysics problems. I have been > > > > playing with mat_mffd_err, and it gives me some results I do not fully > > > > understand. > > > > > > > > I believe the default value of mat_mffd_err is 10^-8 for double > > > > precision, which seem too large for my problems. When I test a problem > > > > essentially still in the linear regime, I expect my converged > > > > "unpreconditioned resid norm of KSP" should be identical to "SNES > > > > Function norm" (Should I?). This is exactly what I found if I could > > > > find a good mat_mffd_err, normally between 10^-3 to 10^-5. So when it > > > > happens, the whole algorithm works as expected. When those two norms > > > > are off, the inexact Newton becomes very inefficient. For instance, it > > > > may take many ksp iterations to converge but the snes norm is only > > > > reduced slightly. > > > > > > > > According to the manual, mat_mffd_err should be "square root of > > > > relative error in function evaluation". But is there a simple way to > > > > estimate it? > > > > > > First a related note: there are two different algorithms that PETSc > > > provides for computing the factor h using the err parameter. They can be > > > set with -mat_mffd_type - wp or ds (see MATMFFD_WP or MATMFFD_DS) some > > > people have better luck with one or the other for their problem. > > > > > > > > > There is some code in PETSc to compute what is called the "noise" of > > > the function which in theory leads to a better err value. For example if > > > say the last four digits of your function are "noise" (that is > > > meaningless stuff) which is possible for complicated multiphysics > > > problems due to round off in the function evaluations then you should > > > use an err that is 2 digits bigger than the default (note this is just > > > the square root of the "function epsilon" instead of the machine epsilon, > > > because the "function epsilon" is much larger than the machine epsilon). > > > > > > We never had a great model for how to hook the noise computation code > > > up into SNES so it is a bit disconnected, we should revisit this. Perhaps > > > you will find it useful and we can figure out together how to hook it in > > > cleanly for use. > > > > > > The code that computes and reports the noise is in the directory > > > src/snes/interface/noise > > > > > > You can use this version with the option -snes_mf_version 2 (version 1 > > > is the normal behavior) > > > > > > The code has the ability to print to a file or the screen information > > > about the noise it is finding, maybe with command line options, you'll > > > have to look directly at the code to see how. > > > > > > I don't like the current setup because if you use the noise based > > > computation of h you have to use SNESMatrixFreeMult2_Private() which is a > > > slightly different routine for doing the actually differencing than the > > > two MATMFFD_WP or MATMFFD_DS that are normally used. I don't know why it > > > can't use the standard differencing routines but call the noise routine > > > to compute h (just requires some minor code refactoring). Also I don't > > > see an automatic way to just compute the noise of a function at a bunch > > > of points independent of actually using the noise to compute and use h. > > > It is sort of there in the routine SNESDiffParameterCompute_More() but > > > that routine is not documented or user friendly. > > > > > > I would start by rigging up calls to SNESDiffParameterCompute_More() > > > to see what it says the noise of your function is, based on your hand > > > determined values optimal values for err it should be roughly the square > > > of them. Then if this makes sense you can trying using the > > > -snes_mf_version 2 code to see if it helps the matrix free multiple > > > behave consistently well for your problem by adaptively computing the err > > > and using that in computing h. > > > > > > Good luck and I'd love to hear back from you if it helps and > > > suggestions/code for making this more integrated with SNES so it is > > > easier to use. For example perhaps we want a function SNESComputeNoise() > > > that uses SNESDiffParameterCompute_More() to compute and report the noise > > > for a range of values of u. > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > Is there anything else I could possibly tune in this context? > > > > > > > > The discretization is through mfem and I use standard H1 for my problem. > > > > > > > > Thanks, > > > > Qi
