On Apr 4, 2012, at 3:21 PM, Yuqi Wu wrote:

> I have three SYMBOLIC factorizations and three NUMERIC factorizations done
> in my program.
>
> I didn't use the SAME_NONZERO_PATTERN flag that is provided in the compute
> Jacobian routine. If I use the flag, then two SYMBOLIC factorizations and
> three NUMERIC factorizations are done in my program.
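For reference, the flag being discussed here is the MatStructure value returned
by the 2012-era (pre-PETSc-3.5) SNES Jacobian callback. The sketch below only
illustrates where the flag is set; FormJacobian and the entry-filling step are
placeholders, not the code from this thread.

  #include <petscsnes.h>

  /* Sketch of a pre-PETSc-3.5 Jacobian callback that reports an unchanged
     sparsity pattern, so only the numeric factorization is redone. */
  PetscErrorCode FormJacobian(SNES snes,Vec x,Mat *J,Mat *B,MatStructure *flag,void *ctx)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    /* ... insert the new numerical values into *B; the sparsity pattern is
       assumed identical on every call ... */
    ierr = MatAssemblyBegin(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    if (*J != *B) {
      ierr = MatAssemblyBegin(*J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(*J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    }
    /* Reuse the previous SYMBOLIC factorization; only the NUMERIC one is redone. */
    *flag = SAME_NONZERO_PATTERN;
    PetscFunctionReturn(0);
  }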
2 SYMB and 3 (or it seems 4 to me) NUMERIC factorizations.

Mark

> For the -info output:
>
> I got no messages with "Setting PC with identical preconditioner".
>
> I got five messages with "Setting up new PC".
>
> I got no messages with "Setting up PC with same nonzero pattern".
>
> I got three messages with "Setting up PC with different nonzero pattern".
>
> For the messages with "Setting up new PC", I got the following output:
>
> [0] PCSetUp(): Setting up new PC
> [0] PCSetUp_MG(): Using outer operators to define finest grid operator because PCMGGetSmoother(pc,nlevels-1,&ksp);KSPSetOperators(ksp,...); was not called.
> [0] MatGetSymbolicTranspose_SeqAIJ(): Getting Symbolic Transpose.
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
> [0] MatRestoreSymbolicTranspose_SeqAIJ(): Restoring Symbolic Transpose.
> [0] MatPtAPSymbolic_SeqAIJ_SeqAIJ(): Reallocs 1; Fill ratio: given 1 needed 1.43239.
> [0] MatPtAPSymbolic_SeqAIJ_SeqAIJ(): Use MatPtAP(A,P,MatReuse,1.43239,&C) for best performance.
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] VecScatterCreate(): Special case: sequential vector general to stride
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11585 X 11585; storage space: 0 unneeded,458097 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 168
> [0] Mat_CheckInode(): Found 8271 nodes of 11585. Limit used: 5. Using Inode routines
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] VecScatterCreate(): Special case: sequential vector general to stride
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
> 0 KSP Residual norm 1.014990964599e+02
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 2 Fill ratio:given 5 needed 11.401
> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 11.401 or use
> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,11.401);
> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
> [0] Mat_CheckInode_FactorLU(): Found 8057 nodes of 11585. Limit used: 5. Using Inode routines
> [0] KSPDefaultConverged(): Linear solver has converged. Residual norm 5.755920981112e-13 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 5.479558824115e+02 at iteration 1
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
> Residual norms for coarse_ solve.
> 0 KSP Residual norm 5.698312810532e-16
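The PCSetUp_MG() line in the output above refers to a call sequence along the
lines of the sketch below, which hands the finest-level smoother its operator
explicitly instead of letting MG fall back on the outer operators. It is only an
illustration of that message against the 2012-era (pre-PETSc-3.5) interface;
Jfine and the choice of SAME_NONZERO_PATTERN are assumptions, not code from
this thread.

  #include <petscksp.h>

  /* Sketch: set the finest-level smoother operator that the PCSetUp_MG()
     message mentions.  Jfine is a placeholder for the fine-grid Jacobian. */
  static PetscErrorCode SetFinestSmootherOperator(PC pc,Mat Jfine)
  {
    KSP            smooth;
    PetscInt       nlevels;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = PCMGGetLevels(pc,&nlevels);CHKERRQ(ierr);
    ierr = PCMGGetSmoother(pc,nlevels-1,&smooth);CHKERRQ(ierr);
    ierr = KSPSetOperators(smooth,Jfine,Jfine,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }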
> For the message with "Setting up PC with different nonzero pattern", I got the following outputs:
>
> [0] PCSetUp(): Setting up PC with different nonzero pattern
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] PCSetUp(): Setting up PC with different nonzero pattern
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
> 0 KSP Residual norm 9.922628060272e-05
> [0] PCSetUp(): Setting up PC with different nonzero pattern
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>
> Thank you so much for your help.
>
> Yuqi
>
>> Setting up new PC
>> Setting up PC with same nonzero pattern\
>> Setting up PC with different nonzero pattern\n
>
> ---- Original message ----
>> Date: Wed, 4 Apr 2012 13:55:55 -0500
>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>> Subject: Re: [petsc-users] Questions about PCMG
>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>
>> Note: In most applications the flag SAME_NONZERO_PATTERN is provided in the
>> compute Jacobian routine; this means that the SYMBOLIC factorization needs
>> to be done only ONCE per matrix, and only the numeric factorization needs
>> to be done when the nonzero values have changed (the symbolic need not be
>> repeated). Are you using this flag? How many times is the NUMERIC
>> factorization being done?
>>
>> When you run the program with -info it will print information of the form
>> (run on one process to make life simple):
>>
>> Setting PC with identical preconditioner\
>> Setting up new PC
>> Setting up PC with same nonzero pattern\
>> Setting up PC with different nonzero pattern\n
>>
>> How many, and exactly which, messages of this form are you getting?
>>
>> When all else fails you can run the program in the debugger to track what
>> is happening and why.
>>
>> Put a breakpoint in PCSetUp(), then each time it gets called use "next" to
>> step through it to see what is happening.
>>
>> First thing to check: is PCSetUp() getting called on each level for each
>> new SNES iteration?
>>
>> Second thing: if it is, then why is it not triggering the new numerical
>> factorization?
>>
>> Barry
>>
>> On Apr 4, 2012, at 1:34 PM, Yuqi Wu wrote:
>>
>>> Thanks, Adam.
>>>
>>> Yes, I am using Galerkin coarse grids. But I am not sure whether it is the
>>> coarse grid or the fine grid smoother that is not getting refactored in
>>> the second SNES solve.
>>>
>>> In the -info output attached in the previous email, the fine grid matrix
>>> is of size 11585 by 11585, and the coarse grid matrix is of size 4186 by
>>> 4186. In the -info output, I found three MatLUFactorSymbolic_SeqAIJ calls,
>>> one for the fine matrix and two for the coarse matrix.
>>>
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 2 Fill ratio:given 5 needed 11.401
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 11.401 or use
>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,11.401);
>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>> [0] Mat_CheckInode_FactorLU(): Found 8057 nodes of 11585. Limit used: 5. Using Inode routines
>>>
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>>
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>>
>>> So I believe that it is the fine grid smoother that is not getting
>>> refactored in the second SNES solve.
>>>
>>> Best
>>>
>>> Yuqi
>>>
>>> ---- Original message ----
>>>> Date: Wed, 4 Apr 2012 14:24:28 -0400
>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of "Mark F. Adams" <mark.adams at columbia.edu>)
>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>
>>>> I would expect 4 calls to MatLUFactorSym here. It looks like the coarse
>>>> grid is not getting refactored in the second SNES solve.
>>>>
>>>> Are you using Galerkin coarse grids? Perhaps you are not setting a new
>>>> coarse grid with KSPSetOperators(), and so MG does not bother refactoring it.
>>>>
>>>> Mark
>>>>
>>>> On Apr 4, 2012, at 1:53 PM, Yuqi Wu wrote:
>>>>
>>>>> Thank you.
>>>>>
>>>>> Can I ask another question?
>>>>>
>>>>> My log summary output shows that, although there are two SNES iterations
>>>>> and a total of 9 linear iterations, the functions MatLUFactorSym and
>>>>> MatLUFactorNum are called only three times.
>>>>>
>>>>> MatLUFactorSym 3 1.0 1.4073e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01 1 0 0 0 2 1 0 0 0 2 0
>>>>> MatLUFactorNum 3 1.0 3.2754e+01 1.0 9.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 31 97 0 0 0 32 97 0 0 0 280
>>>>>
>>>>> I checked the -info output. It shows that one MatLUFactorSymbolic_SeqAIJ()
>>>>> is called in the down smoother of the first SNES iteration, one in the
>>>>> coarse solve of the first SNES iteration, and one in the down smoother of
>>>>> the second SNES iteration.
>>>>>
>>>>> Do you have any idea why there are 9 multigrid iterations, but only 3
>>>>> MatLUFactorSymbolic calls in the program?
>>>>>
>>>>> Best
>>>>>
>>>>> Yuqi
>>>>>
>>>>> ---- Original message ----
>>>>>> Date: Tue, 3 Apr 2012 20:08:27 -0500
>>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>>
>>>>>> There are two linear solves (for SNES iterations 1 and 2), so there are
>>>>>> two MGSetUp events on each level. Then there is a total of 9 multigrid
>>>>>> iterations (in both linear solves together), hence 9 smooths on level 0
>>>>>> (level 0 means the coarse grid solve). One smooth down and one smooth up
>>>>>> on level 1 gives 18 total smooths on level 1, and 9 computations of the
>>>>>> residual on level 1. There are 18 MGInterp events because that logs both
>>>>>> the restriction to level 0 and the interpolation back to level 1, and
>>>>>> 18 = 9 + 9.
>>>>>>
>>>>>> Barry
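Mark's suggestion above, resetting the coarse operator with KSPSetOperators()
so that MG refactors it, amounts to something like the sketch below, written
against the 2012-era (pre-PETSc-3.5) interface. It assumes the application
forms the Galerkin coarse matrix itself; A, P, Acoarse, and the fill value are
illustrative (1.43239 is just the ratio -info reported earlier in this thread),
not code from this thread.

  #include <petscksp.h>

  /* Sketch: rebuild or refresh the Galerkin coarse operator and hand it back
     to the coarse solve so PCSetUp() triggers a new factorization. */
  static PetscErrorCode RefreshCoarseOperator(PC pc,Mat A,Mat P,Mat *Acoarse,PetscBool first)
  {
    KSP            coarse;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    if (first) {   /* first SNES iteration: symbolic + numeric PtAP */
      ierr = MatPtAP(A,P,MAT_INITIAL_MATRIX,1.43239,Acoarse);CHKERRQ(ierr);
    } else {       /* later iterations: keep the sparsity pattern, refresh the values */
      ierr = MatPtAP(A,P,MAT_REUSE_MATRIX,1.43239,Acoarse);CHKERRQ(ierr);
    }
    ierr = PCMGGetCoarseSolve(pc,&coarse);CHKERRQ(ierr);
    ierr = KSPSetOperators(coarse,*Acoarse,*Acoarse,
                           first ? DIFFERENT_NONZERO_PATTERN : SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

With SAME_NONZERO_PATTERN passed on the later calls, only the numeric LU
factorization of the coarse matrix should be repeated, which is the behavior
being discussed above.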
>>>>>>
>>>>>> On Apr 3, 2012, at 7:57 PM, Yuqi Wu wrote:
>>>>>>
>>>>>>> Hi, Barry,
>>>>>>>
>>>>>>> Thank you. My program converges in two SNES iterations:
>>>>>>>
>>>>>>> 0 SNES norm 1.014991e+02, 0 KSP its (nan coarse its average), last norm 0.000000e+00
>>>>>>> 1 SNES norm 9.925218e-05, 4 KSP its (5.25 coarse its average), last norm 2.268574e-06.
>>>>>>> 2 SNES norm 1.397282e-09, 5 KSP its (5.20 coarse its average), last norm 1.312605e-12.
>>>>>>>
>>>>>>> And -pc_mg_log shows the following output:
>>>>>>>
>>>>>>> MGSetup Level 0 2 1.0 3.4091e-01 2.1 0.00e+00 0.0 3.0e+02 6.0e+04 3.0e+01 1 0 3 11 2 1 0 3 11 2 0
>>>>>>> MGSmooth Level 0 9 1.0 1.2126e+01 1.0 9.38e+08 3.2 2.8e+03 1.7e+03 6.4e+02 33 71 28 3 34 35 71 28 3 35 415
>>>>>>> MGSetup Level 1 2 1.0 1.3925e-01 2.1 0.00e+00 0.0 1.5e+02 3.1e+04 2.3e+01 0 0 1 3 1 0 0 1 3 1 0
>>>>>>> MGSmooth Level 1 18 1.0 5.8493e+00 1.0 3.66e+08 3.1 1.5e+03 2.9e+03 3.6e+02 16 28 15 3 19 17 28 15 3 19 339
>>>>>>> MGResid Level 1 9 1.0 1.1826e-01 1.4 1.49e+06 2.4 2.0e+02 2.7e+03 9.0e+00 0 0 2 0 0 0 0 2 0 0 70
>>>>>>> MGInterp Level 1 18 1.0 1.2317e-01 1.3 7.74e+05 2.2 3.8e+02 1.1e+03 1.8e+01 0 0 4 0 1 0 0 4 0 1 37
>>>>>>>
>>>>>>> What do the MGSmooth, MGResid, and MGInterp events represent?
>>>>>>>
>>>>>>> Best
>>>>>>>
>>>>>>> Yuqi
>>>>>>>
>>>>>>> ---- Original message ----
>>>>>>>> Date: Tue, 3 Apr 2012 19:19:23 -0500
>>>>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>>>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>>>>
>>>>>>>> -pc_mg_log doesn't have anything to do with DA or DMMG; it is part of
>>>>>>>> the basic PCMG. Are you sure you are calling SNESSetFromOptions()?
>>>>>>>>
>>>>>>>> Barry
>>>>>>>>
>>>>>>>> On Apr 3, 2012, at 6:56 PM, Yuqi Wu wrote:
>>>>>>>>
>>>>>>>>> Hi, Mark,
>>>>>>>>>
>>>>>>>>> Thank you so much for your suggestion.
>>>>>>>>>
>>>>>>>>> Problem 1 is resolved by avoiding the call to PCMGSetNumberSmoothUp.
>>>>>>>>>
>>>>>>>>> But since I am using an unstructured grid in my application, I didn't
>>>>>>>>> use DA or DMMG, so -pc_mg_log didn't give any level information. I
>>>>>>>>> tried running my code with -info on 1 processor, and I found some
>>>>>>>>> interesting issues.
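Barry's question about SNESSetFromOptions() refers to a setup step along the
lines of the sketch below (again the 2012-era interface; SetupSolver and the
omitted SNESSetFunction/SNESSetJacobian calls are placeholders). Without this
call, command-line options such as -pc_mg_log or -snes_monitor are never
applied to the solver.

  #include <petscsnes.h>

  /* Sketch: create the SNES, select PCMG, and let the options database
     (-pc_mg_log, -pc_mg_levels, -snes_monitor, ...) configure it. */
  static PetscErrorCode SetupSolver(MPI_Comm comm,SNES *snes)
  {
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = SNESCreate(comm,snes);CHKERRQ(ierr);
    /* ... SNESSetFunction(), SNESSetJacobian(), PCMGSetLevels(), etc. ... */
    ierr = SNESGetKSP(*snes,&ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCMG);CHKERRQ(ierr);
    /* This is the call that actually picks up -pc_mg_log and friends. */
    ierr = SNESSetFromOptions(*snes);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }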
