> On May 12, 2021, at 12:40 PM, feng wang <[email protected]> wrote:
>
> "Is this the plot for KSP convergence with ASM preconditioning? Or just
> matrix-free multiply and no preconditioner?"
>
> The plot shows the non-linear residual, which is the L2 norm of my right-hand
> side, against the number of non-linear iterations. I am using GMRES
> preconditioned with ASM. From your comments, if I understand correctly, my
> non-linear residual history would change with the number of cpus I use
> because of ASM?

   That is fine. The ASM affects the linear convergence history but should not
normally affect the nonlinear residual history much at all.

> "One normally expects the number of iterations to increase with more cores
> for ASM (especially from one to two ranks) because there is less coupling in
> the preconditioner the more blocks (ranks) you use."
>
> Somehow I did not see a big difference in my non-linear residual when I use
> 1, 2, 3 cpus in this case. I can try a different case and check.
>
> Thanks,
> Feng
>
> From: Barry Smith <[email protected]>
> Sent: 12 May 2021 17:11
> To: feng wang <[email protected]>
> Cc: [email protected]
> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>
> Is this the plot for KSP convergence with ASM preconditioning? Or just
> matrix-free multiply and no preconditioner?
>
> One normally expects the number of iterations to increase with more cores
> for ASM (especially from one to two ranks) because there is less coupling in
> the preconditioner the more blocks (ranks) you use.
>
> Barry
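[A minimal sketch of the GMRES + ASM solver setup discussed above; the matrix
names A_mf and A_pre are illustrative placeholders for the shell and assembled
preconditioning matrices from the thread below, not code from this thread:]

    KSP ksp;
    PC  pc;
    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
    ierr = KSPSetType(ksp, KSPGMRES); CHKERRQ(ierr);
    /* matrix-free operator plus assembled preconditioning matrix */
    ierr = KSPSetOperators(ksp, A_mf, A_pre); CHKERRQ(ierr);
    ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
    ierr = PCSetType(pc, PCASM); CHKERRQ(ierr);   /* one block per rank by default */
    ierr = PCASMSetOverlap(pc, 1); CHKERRQ(ierr); /* overlap of 1, as used here */
    ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);

[The same configuration is available on the command line as
-ksp_type gmres -pc_type asm -pc_asm_overlap 1.]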
>> On May 12, 2021, at 7:06 AM, feng wang <[email protected]> wrote:
>>
>> Hi Barry,
>>
>> I have implemented my matrix-free GMRES in parallel. There is a small
>> difference in the residuals (please see the figure below) when I vary the
>> number of cores, but they all converged in the end. I have tried to multiply
>> my shell matrix or preconditioning matrix with a vector in parallel, and I
>> got the same values as in a serial run, so I believe the halo exchange is ok
>> for matrix-vector multiplications.
>>
>> I am using the ASM preconditioner with the overlap set to 1. Is this
>> behaviour in parallel normal for ASM preconditioners? If it is not, then I
>> know I need to find a bug in my code.
>>
>> Thanks,
>> Feng
>>
>> [Attachment: tmp.png — plot of the non-linear residual against the number of
>> non-linear iterations for 1, 2, 3 cpus]
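[Feng's parallel-consistency check could look something like the sketch below;
A_shell is a placeholder for the shell or preconditioning matrix. Filling the
test vector with its global indices makes the input identical no matter how
many ranks are used:]

    Vec       x, y;
    PetscInt  rstart, rend, i;
    PetscReal norm;
    ierr = MatCreateVecs(A_shell, &x, &y); CHKERRQ(ierr);
    ierr = VecGetOwnershipRange(x, &rstart, &rend); CHKERRQ(ierr);
    /* deterministic input: entry i holds the value i on every run */
    for (i = rstart; i < rend; i++) {
      ierr = VecSetValue(x, i, (PetscScalar)i, INSERT_VALUES); CHKERRQ(ierr);
    }
    ierr = VecAssemblyBegin(x); CHKERRQ(ierr);
    ierr = VecAssemblyEnd(x); CHKERRQ(ierr);
    ierr = MatMult(A_shell, x, y); CHKERRQ(ierr);
    ierr = VecNorm(y, NORM_2, &norm); CHKERRQ(ierr);
    /* the printed norm should match between serial and parallel runs */
    ierr = PetscPrintf(PETSC_COMM_WORLD, "||A*x|| = %g\n", (double)norm); CHKERRQ(ierr);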
>> From: petsc-users <[email protected]> on behalf of feng wang
>> <[email protected]>
>> Sent: 28 March 2021 22:05
>> To: Barry Smith <[email protected]>
>> Cc: [email protected]
>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>
>> Hi Barry,
>>
>> Thanks for your comments. I will try that.
>>
>> Thanks,
>> Feng
>>
>> From: Barry Smith <[email protected]>
>> Sent: 26 March 2021 23:44
>> To: feng wang <[email protected]>
>> Cc: [email protected]
>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>
>>> On Mar 25, 2021, at 6:39 PM, feng wang <[email protected]> wrote:
>>>
>>> Hi Barry,
>>>
>>> Thanks for your comments.
>>>
>>> I will renumber the cells in the way you recommended. I went through the
>>> manual again and understand how to update the halo elements for my shell
>>> matrix routine "mymult(Mat m, Vec x, Vec y)". I can use the global index of
>>> the ghost cells for each rank and "Vec x" to get the ghost values for each
>>> rank via scattering. It should be similar to the example on page 40 of the
>>> manual.
>>>
>>> One more question: I also have an assembled approximate Jacobian matrix for
>>> preconditioning GMRES. If I re-number the cells properly as you suggested,
>>> I don't need to worry about communication and petsc will handle it properly
>>> together with my shell matrix?
>>
>>    If you assemble the approximate Jacobian using the "new" ordering then it
>> will reflect the same function evaluation and matrix-free operators, so it
>> should be ok.
>>
>> Barry
>>
>>> Thanks,
>>> Feng
>>>
>>> From: Barry Smith <[email protected]>
>>> Sent: 25 March 2021 0:03
>>> To: feng wang <[email protected]>
>>> Cc: [email protected]
>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>>
>>>> On Mar 24, 2021, at 5:08 AM, feng wang <[email protected]> wrote:
>>>>
>>>> Hi Barry,
>>>>
>>>> Thanks for your comments, they are very helpful. I have a few more
>>>> questions.
>>>>
>>>> On your 1st comment, "Yes, in some sense. So long as each process ....":
>>>> if I understand it correctly (hopefully), a parallel vector in petsc can
>>>> hold discontinuous rows of data from a global array. If this is true, and
>>>> if I call "VecGetArray", would it create a copy in a contiguous space when
>>>> the data is not contiguous, do some operations, and have petsc figure out
>>>> how to put the updated values back into the right places in the global
>>>> array? This would generate an overhead. If I do the renumbering to make
>>>> each process hold contiguous rows, can this overhead be avoided when I
>>>> call "VecGetArray"?
>>>
>>>    VecGetArray does nothing except return the pointer to the data in the
>>> vector. It does not copy anything or reorder anything. Whatever order the
>>> numbers are in the vector, they are in the same order in the array you
>>> obtain with VecGetArray.
>>>
>>>> On your 2nd comment, "The matrix and vectors the algebraic solvers see DO
>>>> NOT have .......": for the callback function of my shell matrix
>>>> "mymult(Mat m, Vec x, Vec y)", I need the halo elements of "x" to compute
>>>> the non-linear function. My code will take care of the other halo
>>>> exchanges, but I am not sure how to use petsc to get the halo elements of
>>>> "x" inside the shell matrix. Could you please elaborate on this? Some
>>>> related examples or simple pseudo-code would be great.
>>>
>>>    Basically all the parallel code in PETSc does this. How you need to set
>>> up the halo communication depends on how you are managing the assignment of
>>> degrees of freedom on each process and between processes.
>>> VecScatterCreate() is the tool you will use to tell PETSc how to get the
>>> correct values from one process to their halo-ed location on another
>>> process. Like everything in PETSc, it numbers the vector entries 0 ...
>>> n_0-1 on the first process, n_0, n_0+1, ... n_1-1 on the second, etc. Since
>>> you are managing the partitioning and distribution of parallel data, you
>>> must renumber the vector entry numbering in your data structures to match
>>> that shown above. Just do the numbering once after you have set up your
>>> distributed data and use it for the rest of the run. You might use the
>>> object from AOCreate to do the renumbering for you.
>>>
>>> Barry
>>>
>>>> Thanks,
>>>> Feng
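[A sketch of the halo setup Barry describes; nlocal, nghost and needed_global
are assumptions about the user's data structures, where needed_global[] lists
the global (PETSc-ordered) index of every entry this rank touches — first its
nlocal owned entries, then its nghost halo entries:]

    IS         is_needed;
    Vec        x_work;    /* sequential work vector: owned entries + halo */
    VecScatter halo;
    ierr = VecCreateSeq(PETSC_COMM_SELF, nlocal + nghost, &x_work); CHKERRQ(ierr);
    ierr = ISCreateGeneral(PETSC_COMM_SELF, nlocal + nghost, needed_global,
                           PETSC_COPY_VALUES, &is_needed); CHKERRQ(ierr);
    /* NULL for the target index set means "all of x_work, in order" */
    ierr = VecScatterCreate(x_global, is_needed, x_work, NULL, &halo); CHKERRQ(ierr);

[Each time fresh values are needed, the scatter is run with
VecScatterBegin/VecScatterEnd(halo, x_global, x_work, INSERT_VALUES,
SCATTER_FORWARD); AOCreateBasic/AOApplicationToPetsc can produce the
renumbering Barry mentions.]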
>>>> From: Barry Smith <[email protected]>
>>>> Sent: 22 March 2021 1:28
>>>> To: feng wang <[email protected]>
>>>> Cc: [email protected]
>>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>>>
>>>>> On Mar 21, 2021, at 6:22 PM, feng wang <[email protected]> wrote:
>>>>>
>>>>> Hi Barry,
>>>>>
>>>>> Thanks for your help, I really appreciate it.
>>>>>
>>>>> In the end I used a shell matrix to compute the matrix-vector product;
>>>>> it is clearer to me and there are more things under my control. I am now
>>>>> trying to do a parallel implementation and I have some questions on
>>>>> setting up parallel matrices and vectors for a user-defined partition.
>>>>> Could you please provide some advice? Suppose I have already got a
>>>>> partition for 2 CPUs. Each cpu is assigned a list of elements and also
>>>>> their halo elements. The global element index for each partition is not
>>>>> necessarily continuous; do I have to re-order them to make them
>>>>> continuous?
>>>>
>>>>    Yes, in some sense. So long as each process can march over ITS
>>>> elements computing the function and the Jacobian matrix-vector product,
>>>> it doesn't matter how you have named/numbered the entries. But
>>>> conceptually the first process has the first set of vector entries and
>>>> the second the second set.
>>>>
>>>>> When I set up the size of the matrix and vectors for each cpu, should I
>>>>> take into account the halo elements?
>>>>
>>>>    The matrix and vectors the algebraic solvers see DO NOT have halo
>>>> elements in their sizes. You will likely need a halo-ed work vector to do
>>>> the matrix-free multiply from. The standard model is to use
>>>> VecScatterBegin/End to get the values from the non-halo-ed algebraic
>>>> vector input to MatMult into a halo-ed one to do the local product.
>>>>
>>>>> In my serial version, when I initialize my RHS vector, I am not using
>>>>> VecSetValues; instead I use VecGetArray/VecRestoreArray to assign the
>>>>> values. VecAssemblyBegin()/VecAssemblyEnd() is never used. Would this
>>>>> still work for a parallel version?
>>>>
>>>>    Yes, you can use Get/Restore, but the input vector x will need to be,
>>>> as noted above, scattered into a halo-ed version to get all the entries
>>>> you will need to do the local part of the product.
>>>>
>>>>> Thanks,
>>>>> Feng
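[A sketch of a halo-ed shell-matrix multiply along the lines Barry describes;
the context fields (halo, x_work) continue the scatter sketch above, and
apply_local() is a hypothetical placeholder for the user's local product
kernel, not a routine from this thread:]

    PetscErrorCode mymult(Mat m, Vec x, Vec y)
    {
      cFdDomain         *user_ctx;
      const PetscScalar *xw;
      PetscScalar       *ya;
      PetscErrorCode     ierr;

      ierr = MatShellGetContext(m, &user_ctx); CHKERRQ(ierr);
      /* gather owned + halo values of x into the sequential work vector */
      ierr = VecScatterBegin(user_ctx->halo, x, user_ctx->x_work,
                             INSERT_VALUES, SCATTER_FORWARD); CHKERRQ(ierr);
      ierr = VecScatterEnd(user_ctx->halo, x, user_ctx->x_work,
                           INSERT_VALUES, SCATTER_FORWARD); CHKERRQ(ierr);
      ierr = VecGetArrayRead(user_ctx->x_work, &xw); CHKERRQ(ierr);
      ierr = VecGetArray(y, &ya); CHKERRQ(ierr);
      /* local product over the owned cells; halo values sit in xw past nlocal */
      user_ctx->apply_local(xw, ya);
      ierr = VecRestoreArray(y, &ya); CHKERRQ(ierr);
      ierr = VecRestoreArrayRead(user_ctx->x_work, &xw); CHKERRQ(ierr);
      return 0;
    }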
>>>>> From: Barry Smith <[email protected]>
>>>>> Sent: 12 March 2021 23:40
>>>>> To: feng wang <[email protected]>
>>>>> Cc: [email protected]
>>>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>>>>
>>>>>> On Mar 12, 2021, at 9:37 AM, feng wang <[email protected]> wrote:
>>>>>>
>>>>>> Hi Matt,
>>>>>>
>>>>>> Thanks for your prompt response.
>>>>>>
>>>>>> Below are my two versions. One is buggy and the 2nd one is working. For
>>>>>> the first one, I add the diagonal contribution to the true RHS
>>>>>> (variable: rhs) and then set the base point; the callback function is
>>>>>> somehow called twice afterwards to compute the Jacobian.
>>>>>
>>>>>    Do you mean "to compute the Jacobian matrix-vector product"?
>>>>>
>>>>>    Is it only in the first computation of the product (for the given
>>>>> base vector) that it calls it twice, or every matrix-vector product?
>>>>>
>>>>>    It is possible there is a bug in our logic; run in the debugger with
>>>>> a break point in FormFunction_mf and each time the function is hit in
>>>>> the debugger type "where" or "bt" to get the stack frames from the
>>>>> calls. Send this. From this we can all see if it is being called
>>>>> excessively and why.
>>>>>
>>>>>> For the 2nd one, I just call the callback function manually to
>>>>>> recompute everything; the callback function is then called once, as
>>>>>> expected, to compute the Jacobian. For me, both versions should do the
>>>>>> same things, but I don't know why in the first one the callback
>>>>>> function is called twice after I set the base point. What could
>>>>>> possibly go wrong?
>>>>>
>>>>>    The logic of how it is supposed to work is shown below.
>>>>>
>>>>>> Thanks,
>>>>>> Feng
>>>>>>
>>>>>> //This does not work
>>>>>> fld->cnsv( iqs,iqe, q, aux, csv );
>>>>>> //add contribution of time-stepping
>>>>>> for(iv=0; iv<nv; iv++)
>>>>>> {
>>>>>>     for(iq=0; iq<nq; iq++)
>>>>>>     {
>>>>>>         //use conservative variables here
>>>>>>         rhs[iv][iq] = -rhs[iv][iq] + csv[iv][iq]*lhsa[nlhs-1][iq]/cfl;
>>>>>>     }
>>>>>> }
>>>>>> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
>>>>>> ierr = petsc_setrhs(petsc_baserhs); CHKERRQ(ierr);
>>>>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);
>>>>>>
>>>>>> //This works
>>>>>> fld->cnsv( iqs,iqe, q, aux, csv );
>>>>>> ierr = petsc_setcsv(petsc_csv); CHKERRQ(ierr);
>>>>>> //this is my callback function, now called manually
>>>>>> ierr = FormFunction_mf(this, petsc_csv, petsc_baserhs);
>>>>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);
>>>>>
>>>>>    Since you provide petsc_baserhs, MatMFFD assumes (naturally) that you
>>>>> will keep the correct values in it. Hence for each new base value YOU
>>>>> need to compute the new values in petsc_baserhs. This approach gives you
>>>>> a bit more control over reusing the information in petsc_baserhs.
>>>>>
>>>>>    If you would prefer that MatMFFD recomputes the base values as
>>>>> needed, then you call MatMFFDSetBase(petsc_A_mf, petsc_csv, NULL); and
>>>>> PETSc will allocate a vector and fill it up as needed by calling your
>>>>> FormFunction_mf(). But you need to call MatAssemblyBegin/End each time
>>>>> the values in the base input vector petsc_csv change. For example
>>>>>
>>>>>    MatAssemblyBegin(petsc_A_mf,...)
>>>>>    MatAssemblyEnd(petsc_A_mf,...)
>>>>>    KSPSolve()
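[A sketch of the per-solve sequence this implies, using the NULL variant so
MatMFFD maintains the base function values itself; petsc_rhs and petsc_dq are
placeholder vectors for the linear system, not names from this thread:]

    /* new base point: petsc_csv has been updated with the current state */
    ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, NULL); CHKERRQ(ierr);
    /* tell the MFFD matrix its operator has new values */
    ierr = MatAssemblyBegin(petsc_A_mf, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(petsc_A_mf, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = KSPSolve(petsc_ksp, petsc_rhs, petsc_dq); CHKERRQ(ierr);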
>>>>>> From: Matthew Knepley <[email protected]>
>>>>>> Sent: 12 March 2021 15:08
>>>>>> To: feng wang <[email protected]>
>>>>>> Cc: Barry Smith <[email protected]>; [email protected]
>>>>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>>>>>
>>>>>> On Fri, Mar 12, 2021 at 9:55 AM feng wang <[email protected]> wrote:
>>>>>> Hi Matt,
>>>>>>
>>>>>> Thanks for your reply. I will try the parallel implementation.
>>>>>>
>>>>>> I've got a serial matrix-free GMRES working, but I would like to know
>>>>>> why my initial version of the matrix-free implementation does not work;
>>>>>> there is still something I don't understand. I did some debugging and
>>>>>> found that the callback function to compute the RHS for the matrix-free
>>>>>> matrix is called twice by Petsc when it computes the finite difference
>>>>>> Jacobian, but it should only be called once. I don't know why; could
>>>>>> you please give some advice?
>>>>>>
>>>>>> F is called once to calculate the base point and once to get the
>>>>>> perturbation. The base point is not recalculated, so if you do many
>>>>>> iterates, it is amortized.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>> Thanks,
>>>>>> Feng
>>>>>>
>>>>>> From: Matthew Knepley <[email protected]>
>>>>>> Sent: 12 March 2021 12:05
>>>>>> To: feng wang <[email protected]>
>>>>>> Cc: Barry Smith <[email protected]>; [email protected]
>>>>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>>>>>
>>>>>> On Fri, Mar 12, 2021 at 6:02 AM feng wang <[email protected]> wrote:
>>>>>> Hi Barry,
>>>>>>
>>>>>> Thanks for your advice.
>>>>>>
>>>>>> You are right on this. Somehow there is some inconsistency when I
>>>>>> compute the right-hand side (true RHS + time-stepping contribution to
>>>>>> the diagonal matrix) used to compute the finite difference Jacobian. If
>>>>>> I just use the callback function to recompute my RHS before I call
>>>>>> MatMFFDSetBase, then it works like a charm. But now I end up computing
>>>>>> my RHS three times: the 1st time is to compute the true RHS, and the
>>>>>> other two are for computing the finite difference Jacobian.
>>>>>>
>>>>>> In my previous buggy version, I only compute the RHS twice. If
>>>>>> possible, could you elaborate on your comment "Also be careful about
>>>>>> petsc_baserhs", so I may possibly understand what was going on with my
>>>>>> buggy version?
>>>>>>
>>>>>> Our FD implementation is simple. It approximates the action of the
>>>>>> Jacobian as
>>>>>>
>>>>>>     J(b) v = (F(b + h v) - F(b)) / h ||v||
>>>>>>
>>>>>> where h is some small parameter and b is the base vector, namely the
>>>>>> one that you are linearizing around. In a Newton step, b is the
>>>>>> previous solution and v is the proposed solution update.
>>>>>>
>>>>>> Besides, for a parallel implementation, my code already has its own
>>>>>> partition method. Is it possible to have petsc read in a user-defined
>>>>>> partition? If not, what is a better way to do this?
>>>>>>
>>>>>> Sure
>>>>>>
>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecSetSizes.html
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>> Many thanks,
>>>>>> Feng
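[A sketch of checking the matrix-free product against Matt's formula by hand;
F_eval stands in for the user's residual evaluation, and b, v and h are chosen
by the tester. MatMFFD picks its own differencing parameter, so agreement is
only up to differencing error:]

    Vec       Fb, Fbv, Jv, b_pert;
    PetscReal h = 1e-7, err;
    ierr = VecDuplicate(b, &Fb); CHKERRQ(ierr);
    ierr = VecDuplicate(b, &Fbv); CHKERRQ(ierr);
    ierr = VecDuplicate(b, &Jv); CHKERRQ(ierr);
    ierr = VecDuplicate(b, &b_pert); CHKERRQ(ierr);
    ierr = VecWAXPY(b_pert, h, v, b); CHKERRQ(ierr);   /* b_pert = b + h v   */
    ierr = F_eval(b, Fb); CHKERRQ(ierr);
    ierr = F_eval(b_pert, Fbv); CHKERRQ(ierr);
    ierr = VecAXPY(Fbv, -1.0, Fb); CHKERRQ(ierr);      /* F(b+hv) - F(b)     */
    ierr = VecScale(Fbv, 1.0/h); CHKERRQ(ierr);        /* reference J(b) v   */
    ierr = MatMult(petsc_A_mf, v, Jv); CHKERRQ(ierr);  /* matrix-free J(b) v */
    ierr = VecAXPY(Jv, -1.0, Fbv); CHKERRQ(ierr);
    ierr = VecNorm(Jv, NORM_2, &err); CHKERRQ(ierr);   /* should be small    */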
>>>>>> From: Barry Smith <[email protected]>
>>>>>> Sent: 11 March 2021 22:15
>>>>>> To: feng wang <[email protected]>
>>>>>> Cc: [email protected]
>>>>>> Subject: Re: [petsc-users] Questions on matrix-free GMRES implementation
>>>>>>
>>>>>> Feng,
>>>>>>
>>>>>>    The first thing to check is that for each linear solve that involves
>>>>>> a new operator (values in the base vector) the MFFD matrix knows it is
>>>>>> using a new operator.
>>>>>>
>>>>>>    The easiest way is to call MatMFFDSetBase() before each solve that
>>>>>> involves a new operator (new values in the base vector). Also be
>>>>>> careful about petsc_baserhs: when you change the base vector's values
>>>>>> you also need to change the petsc_baserhs values to the function
>>>>>> evaluation at that point.
>>>>>>
>>>>>>    If that is correct, I would check with a trivial function evaluator
>>>>>> to make sure the infrastructure is all set up correctly. For example,
>>>>>> use a 1 4 1 operator applied matrix-free.
>>>>>>
>>>>>> Barry
>>>>>>
>>>>>>> On Mar 11, 2021, at 7:35 AM, feng wang <[email protected]> wrote:
>>>>>>>
>>>>>>> Dear All,
>>>>>>>
>>>>>>> I am new to petsc and trying to implement a matrix-free GMRES. I have
>>>>>>> assembled an approximate Jacobian matrix just for preconditioning.
>>>>>>> After reading some previous questions on this topic, my approach is as
>>>>>>> follows.
>>>>>>>
>>>>>>> The matrix-free matrix is created as:
>>>>>>>
>>>>>>> ierr = MatCreateMFFD(*A_COMM_WORLD, iqe*blocksize, iqe*blocksize,
>>>>>>>                      PETSC_DETERMINE, PETSC_DETERMINE, &petsc_A_mf); CHKERRQ(ierr);
>>>>>>> ierr = MatMFFDSetFunction(petsc_A_mf, FormFunction_mf, this); CHKERRQ(ierr);
>>>>>>>
>>>>>>> The KSP linear operator is set up as:
>>>>>>>
>>>>>>> //petsc_A_pre is my assembled preconditioning matrix
>>>>>>> ierr = KSPSetOperators(petsc_ksp, petsc_A_mf, petsc_A_pre); CHKERRQ(ierr);
>>>>>>>
>>>>>>> Before calling KSPSolve, I do:
>>>>>>>
>>>>>>> //petsc_csv holds the flow states, petsc_baserhs is the pre-computed right-hand side
>>>>>>> ierr = MatMFFDSetBase(petsc_A_mf, petsc_csv, petsc_baserhs); CHKERRQ(ierr);
>>>>>>>
>>>>>>> The callback function is defined as:
>>>>>>>
>>>>>>> PetscErrorCode cFdDomain::FormFunction_mf(void *ctx, Vec in_vec, Vec out_vec)
>>>>>>> {
>>>>>>>     PetscErrorCode ierr;
>>>>>>>     cFdDomain *user_ctx;
>>>>>>>
>>>>>>>     cout << "FormFunction_mf called\n";
>>>>>>>
>>>>>>>     //in_vec: flow states
>>>>>>>     //out_vec: right hand side + diagonal contributions from CFL number
>>>>>>>
>>>>>>>     user_ctx = (cFdDomain*)ctx;
>>>>>>>
>>>>>>>     //get perturbed conservative variables from petsc
>>>>>>>     user_ctx->petsc_getcsv(in_vec);
>>>>>>>
>>>>>>>     //get new right side
>>>>>>>     user_ctx->petsc_fd_rhs();
>>>>>>>
>>>>>>>     //set new right hand side to the output vector
>>>>>>>     user_ctx->petsc_setrhs(out_vec);
>>>>>>>
>>>>>>>     ierr = 0;
>>>>>>>     return ierr;
>>>>>>> }
>>>>>>>
>>>>>>> The linear system I am solving is (J+D)x = RHS. J is the Jacobian
>>>>>>> matrix, and D is a diagonal matrix used to stabilise the solution at
>>>>>>> the start but reduced gradually as the solution moves on, to recover
>>>>>>> Newton's method. I add D*x to the true right-hand side when the
>>>>>>> non-linear function is computed to work out the finite difference
>>>>>>> Jacobian, so when finite differencing is used, it actually computes
>>>>>>> (J+D)*dx.
>>>>>>>
>>>>>>> The code runs but diverges in the end. If I don't do matrix-free and
>>>>>>> use my approximate Jacobian matrix, GMRES works. So something is wrong
>>>>>>> with my matrix-free implementation. Have I missed something in my
>>>>>>> implementation? Besides, is there a way to check if the finite
>>>>>>> difference Jacobian matrix is computed correctly in a matrix-free
>>>>>>> implementation?
>>>>>>>
>>>>>>> Thanks for your help in advance.
>>>>>>> Feng
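[One way to approach the final question above, as a sketch: if your PETSc
version provides MatComputeOperator(), it builds the matrix-free operator's
action column by column into an explicit matrix, which can then be compared
with the assembled approximate Jacobian. This is feasible only for small
problems, the base vector must be set first, and since petsc_A_pre is only
approximate the difference will not be zero — the comparison is qualitative:]

    Mat       J_explicit;
    PetscReal diff;
    ierr = MatComputeOperator(petsc_A_mf, MATAIJ, &J_explicit); CHKERRQ(ierr);
    ierr = MatAXPY(J_explicit, -1.0, petsc_A_pre, DIFFERENT_NONZERO_PATTERN); CHKERRQ(ierr);
    /* large entries point at rows where the two Jacobians disagree */
    ierr = MatNorm(J_explicit, NORM_FROBENIUS, &diff); CHKERRQ(ierr);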
>>>>>> --
>>>>>> What most experimenters take for granted before they begin their
>>>>>> experiments is infinitely more interesting than any results to which
>>>>>> their experiments lead.
>>>>>> -- Norbert Wiener
>>>>>>
>>>>>> https://www.cse.buffalo.edu/~knepley/
