On Apr 4, 2012, at 3:22 PM, Mark F. Adams wrote:

> On Apr 4, 2012, at 3:21 PM, Yuqi Wu wrote:
>
>> I have three SYMBOLIC factorizations and three NUMERIC factorizations done in my program.
>>
>> I didn't use the SAME_NONZERO_PATTERN flag provided in the compute Jacobian routine. If I use the flag, then there will be two SYMBOLIC factorizations and three NUMERIC factorizations done in my program.
>
> 2 SYMB and 3 (or it seems 4 to me) NUMERIC factorizations.
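For reference, in the PETSc C interface of that era the flag in question is returned from the user's Jacobian callback. Below is a minimal sketch, not taken from this thread; FormJacobian, the matrices, and the assembly details are hypothetical placeholders:

    #include <petscsnes.h>

    /* Sketch only (PETSc ~3.2-era interface; names are placeholders).
       Returning SAME_NONZERO_PATTERN tells PCSetUp() that only a new NUMERIC
       factorization is needed; the SYMBOLIC factorization from the first
       solve is reused. */
    PetscErrorCode FormJacobian(SNES snes, Vec x, Mat *J, Mat *B, MatStructure *flag, void *ctx)
    {
      PetscErrorCode ierr;

      /* ... insert the new Jacobian values into *B (and *J if different) ... */
      ierr = MatAssemblyBegin(*B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(*B, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

      *flag = SAME_NONZERO_PATTERN;  /* nonzero structure unchanged between Newton steps */
      return 0;
    }

The routine would be registered with SNESSetJacobian(snes, J, B, FormJacobian, &ctx); returning DIFFERENT_NONZERO_PATTERN instead forces a new symbolic factorization on every Newton step.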
I also think it should be 4 NUMERIC factorizations. To the debugger :-)

Barry

> Mark
>
>> For -info output
>> I got no messages with "Setting PC with identical preconditioner"
>>
>> I got five messages with "Setting up new PC"
>>
>> I got no messages with "Setting up PC with same nonzero pattern"
>>
>> I got three messages with "Setting up PC with different nonzero pattern"
>>
>> For messages with "Setting up new PC", I got the following output
>>
>> [0] PCSetUp(): Setting up new PC
>> [0] PCSetUp_MG(): Using outer operators to define finest grid operator because PCMGGetSmoother(pc,nlevels-1,&ksp);KSPSetOperators(ksp,...); was not called.
>> [0] MatGetSymbolicTranspose_SeqAIJ(): Getting Symbolic Transpose.
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
>> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
>> [0] MatRestoreSymbolicTranspose_SeqAIJ(): Restoring Symbolic Transpose.
>> [0] MatPtAPSymbolic_SeqAIJ_SeqAIJ(): Reallocs 1; Fill ratio: given 1 needed 1.43239.
>> [0] MatPtAPSymbolic_SeqAIJ_SeqAIJ(): Use MatPtAP(A,P,MatReuse,1.43239,&C) for best performance.
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
>> [0] PCSetUp(): Setting up new PC
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] VecScatterCreate(): Special case: sequential vector general to stride
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11585 X 11585; storage space: 0 unneeded,458097 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 168
>> [0] Mat_CheckInode(): Found 8271 nodes of 11585. Limit used: 5. Using Inode routines
>> [0] PCSetUp(): Setting up new PC
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] VecScatterCreate(): Special case: sequential vector general to stride
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
>> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
>> 0 KSP Residual norm 1.014990964599e+02
>> [0] PCSetUp(): Setting up new PC
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 2 Fill ratio:given 5 needed 11.401
>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 11.401 or use
>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,11.401);
>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>> [0] Mat_CheckInode_FactorLU(): Found 8057 nodes of 11585. Limit used: 5. Using Inode routines
>> [0] KSPDefaultConverged(): Linear solver has converged. Residual norm 5.755920981112e-13 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 5.479558824115e+02 at iteration 1
>> [0] PCSetUp(): Setting up new PC
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>> Residual norms for coarse_ solve.
>> 0 KSP Residual norm 5.698312810532e-16
>>
>> For the message with "Setting up PC with different nonzero pattern", I got the following outputs
>>
>> [0] PCSetUp(): Setting up PC with different nonzero pattern
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
>> [0] PCSetUp(): Setting up PC with different nonzero pattern
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
>> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
>> 0 KSP Residual norm 9.922628060272e-05
>> [0] PCSetUp(): Setting up PC with different nonzero pattern
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>
>> Thank you so much for your help.
>>
>> Yuqi
>>
>>> Setting up new PC
>>> Setting up PC with same nonzero pattern
>>> Setting up PC with different nonzero pattern
>>
>> ---- Original message ----
>>> Date: Wed, 4 Apr 2012 13:55:55 -0500
>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>> Subject: Re: [petsc-users] Questions about PCMG
>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>
>>> Note: In most applications the flag SAME_NONZERO_PATTERN is provided in the compute Jacobian routine; this means that the SYMBOLIC factorization needs to be done only ONCE per matrix, and only the numeric factorization needs to be redone when the nonzero values have changed (the symbolic need not be repeated). Are you using this flag? How many times is the NUMERIC factorization being done?
>>>
>>> When you run the program with -info it will print information of the form (run on one process to make life simple):
>>>
>>> Setting PC with identical preconditioner
>>> Setting up new PC
>>> Setting up PC with same nonzero pattern
>>> Setting up PC with different nonzero pattern
>>>
>>> How many, and exactly what, messages of this form are you getting?
>>>
>>> When all else fails you can run the program in the debugger to track what is happening and why.
>>>
>>> Put a breakpoint in PCSetUp(), then each time it gets called use "next" to step through it to see what is happening.
>>>
>>> First thing to check: is PCSetUp() getting called on each level for each new SNES iteration?
>>>
>>> Second thing: if it is, why is it not triggering the new numerical factorization?
>>>
>>> Barry
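As an aside, the fill hint that MatLUFactorSymbolic_SeqAIJ prints in the -info output above can be applied either on the command line (-pc_factor_fill) or in code. A minimal sketch, not from this thread; it assumes a two-level PCMG where level 1 is the fine level, and the names are placeholders:

    #include <petscksp.h>

    /* Sketch only (PETSc ~3.2-era interface): set the LU fill ratio reported
       by -info on the smoother of one PCMG level so the symbolic
       factorization does not have to reallocate. */
    PetscErrorCode SetSmootherFill(PC pcmg, PetscInt level, PetscReal fill)
    {
      KSP            smoother;
      PC             lupc;
      PetscErrorCode ierr;

      ierr = PCMGGetSmoother(pcmg, level, &smoother);CHKERRQ(ierr);
      ierr = KSPGetPC(smoother, &lupc);CHKERRQ(ierr);
      ierr = PCFactorSetFill(lupc, fill);CHKERRQ(ierr);
      return 0;
    }

For the output above this would be roughly SetSmootherFill(pcmg, 1, 11.401) for the fine-level smoother and SetSmootherFill(pcmg, 0, 7.07175) for the coarse solve (level 0).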
>>> On Apr 4, 2012, at 1:34 PM, Yuqi Wu wrote:
>>>
>>>> Thanks, Mark.
>>>>
>>>> Yes, I am using the Galerkin coarse grids. But I am not sure whether it is the coarse grid or the fine grid smoother that is not getting refactored in the second SNES solve.
>>>>
>>>> In the -info output attached in the previous email, the fine grid matrix is of size 11585 by 11585, and the coarse grid matrix is of size 4186 by 4186. In the -info output, I found three MatLUFactorSymbolic_SeqAIJ calls: one for the fine matrix, and two for the coarse matrix.
>>>>
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 2 Fill ratio:given 5 needed 11.401
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 11.401 or use
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,11.401);
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>>> [0] Mat_CheckInode_FactorLU(): Found 8057 nodes of 11585. Limit used: 5. Using Inode routines
>>>>
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>>>
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>>>
>>>> So I believe that the fine grid smoother is not getting refactored in the second SNES solve.
>>>>
>>>> Best
>>>>
>>>> Yuqi
>>>>
>>>> ---- Original message ----
>>>>> Date: Wed, 4 Apr 2012 14:24:28 -0400
>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of "Mark F. Adams" <mark.adams at columbia.edu>)
>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>
>>>>> I would expect 4 calls to MatLUFactorSym here. It looks like the coarse grid is not getting refactored in the second SNES solve.
>>>>>
>>>>> Are you using Galerkin coarse grids? Perhaps you are not setting a new coarse grid with KSPSetOperators and so MG does not bother refactoring it.
>>>>>
>>>>> Mark
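A minimal sketch of what Mark is suggesting, not taken from this thread (PETSc ~3.2-era interface; pcmg and Acoarse are placeholder names): after the Galerkin coarse operator is rebuilt for the new Jacobian, hand it back to the coarse-level KSP so PCSetUp() refactors it.

    #include <petscksp.h>

    /* Sketch only.  The MatStructure flag controls how much of the
       factorization is redone: SAME_NONZERO_PATTERN triggers a new NUMERIC
       factorization only, DIFFERENT_NONZERO_PATTERN also redoes the
       SYMBOLIC factorization. */
    PetscErrorCode ResetCoarseOperator(PC pcmg, Mat Acoarse)
    {
      KSP            coarse;
      PetscErrorCode ierr;

      ierr = PCMGGetCoarseSolve(pcmg, &coarse);CHKERRQ(ierr);
      ierr = KSPSetOperators(coarse, Acoarse, Acoarse, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
      return 0;
    }

If this call is skipped, the coarse KSP keeps the operator from the first Newton step, which is the behavior Mark describes.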
>>>>> On Apr 4, 2012, at 1:53 PM, Yuqi Wu wrote:
>>>>>
>>>>>> Thank you.
>>>>>>
>>>>>> Can I ask another question?
>>>>>>
>>>>>> My log summary output shows that although there are two SNES iterations and a total of 9 linear iterations, the functions MatLUFactorSym and MatLUFactorNum are only called three times.
>>>>>>
>>>>>> MatLUFactorSym 3 1.0 1.4073e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01 1 0 0 0 2 1 0 0 0 2 0
>>>>>> MatLUFactorNum 3 1.0 3.2754e+01 1.0 9.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 31 97 0 0 0 32 97 0 0 0 280
>>>>>>
>>>>>> I checked the -info output. It shows that one MatLUFactorSymbolic_SeqAIJ() is called in the down smoother of the first SNES, one MatLUFactorSymbolic_SeqAIJ() is called in the coarse solve of the first SNES, and one MatLUFactorSymbolic_SeqAIJ() is called in the down smoother of the second SNES.
>>>>>>
>>>>>> Do you have any idea why there are 9 multigrid iterations, but only 3 MatLUFactorSymbolic calls in the program?
>>>>>>
>>>>>> Best
>>>>>>
>>>>>> Yuqi
>>>>>>
>>>>>> ---- Original message ----
>>>>>>> Date: Tue, 3 Apr 2012 20:08:27 -0500
>>>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>>>
>>>>>>> There are two linear solves (for 1 SNES and 2 SNES), so there are two MGSetup on each level. Then there are a total of 9 multigrid iterations (in both linear solves together), hence 9 smooths on level 0 (level 0 means coarse grid solve). One smooth down and one smooth up on level 1, hence 18 total smooths on level 1; 9 computations of the residual on level 1; and 18 MGInterp, because that logs both the restriction to level 0 and the interpolation back to level 1, and 18 = 9 + 9.
>>>>>>>
>>>>>>> Barry
>>>>>>>
>>>>>>> On Apr 3, 2012, at 7:57 PM, Yuqi Wu wrote:
>>>>>>>
>>>>>>>> Hi, Barry,
>>>>>>>>
>>>>>>>> Thank you. My program converges in two SNES iterations:
>>>>>>>>
>>>>>>>> 0 SNES norm 1.014991e+02, 0 KSP its (nan coarse its average), last norm 0.000000e+00
>>>>>>>> 1 SNES norm 9.925218e-05, 4 KSP its (5.25 coarse its average), last norm 2.268574e-06.
>>>>>>>> 2 SNES norm 1.397282e-09, 5 KSP its (5.20 coarse its average), last norm 1.312605e-12.
>>>>>>>>
>>>>>>>> And -pc_mg_log shows the following output:
>>>>>>>>
>>>>>>>> MGSetup Level 0 2 1.0 3.4091e-01 2.1 0.00e+00 0.0 3.0e+02 6.0e+04 3.0e+01 1 0 3 11 2 1 0 3 11 2 0
>>>>>>>> MGSmooth Level 0 9 1.0 1.2126e+01 1.0 9.38e+08 3.2 2.8e+03 1.7e+03 6.4e+02 33 71 28 3 34 35 71 28 3 35 415
>>>>>>>> MGSetup Level 1 2 1.0 1.3925e-01 2.1 0.00e+00 0.0 1.5e+02 3.1e+04 2.3e+01 0 0 1 3 1 0 0 1 3 1 0
>>>>>>>> MGSmooth Level 1 18 1.0 5.8493e+00 1.0 3.66e+08 3.1 1.5e+03 2.9e+03 3.6e+02 16 28 15 3 19 17 28 15 3 19 339
>>>>>>>> MGResid Level 1 9 1.0 1.1826e-01 1.4 1.49e+06 2.4 2.0e+02 2.7e+03 9.0e+00 0 0 2 0 0 0 0 2 0 0 70
>>>>>>>> MGInterp Level 1 18 1.0 1.2317e-01 1.3 7.74e+05 2.2 3.8e+02 1.1e+03 1.8e+01 0 0 4 0 1 0 0 4 0 1 37
>>>>>>>>
>>>>>>>> What do MGSmooth, MGResid, and MGInterp represent?
>>>>>>>>
>>>>>>>> Best
>>>>>>>>
>>>>>>>> Yuqi
>>>>>>>>
>>>>>>>> ---- Original message ----
>>>>>>>>> Date: Tue, 3 Apr 2012 19:19:23 -0500
>>>>>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>>>>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>>>>>
>>>>>>>>> -pc_mg_log doesn't have anything to do with DA or DMMG; it is part of the basic PCMG. Are you sure you are calling SNESSetFromOptions()?
>>>>>>>>>
>>>>>>>>> Barry
>>>>>>>>>
>>>>>>>>> On Apr 3, 2012, at 6:56 PM, Yuqi Wu wrote:
>>>>>>>>>
>>>>>>>>>> Hi, Mark,
>>>>>>>>>>
>>>>>>>>>> Thank you so much for your suggestion.
>>>>>>>>>>
>>>>>>>>>> Problem 1 is resolved by avoiding the call to PCMGSetNumberSmoothUp.
>>>>>>>>>>
>>>>>>>>>> But since I am using an unstructured grid in my application, I didn't use DA or DMMG, so -pc_mg_log didn't give any level information. I tried to run my code using -info with 1 processor, and I found some interesting issues.
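Finally, a minimal sketch of the solver setup this discussion presumes, not from this thread (PETSc ~3.2-era interface; FormFunction, FormJacobian, and the vectors and matrices are placeholders created elsewhere). The SNESSetFromOptions() call Barry asks about is what makes options such as -pc_mg_log, -snes_monitor, and -pc_factor_fill take effect:

    /* Sketch only: option-handling flow for an unstructured-grid code
       that builds its own PCMG levels (no DA/DMMG). */
    SNES           snes;
    PetscErrorCode ierr;

    ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
    ierr = SNESSetFunction(snes, r, FormFunction, &ctx);CHKERRQ(ierr);
    ierr = SNESSetJacobian(snes, J, J, FormJacobian, &ctx);CHKERRQ(ierr);
    /* ... get the KSP/PC, set the PCMG levels, interpolation, and smoothers ... */
    ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);  /* picks up -pc_mg_log, -snes_monitor, ... */
    ierr = SNESSolve(snes, PETSC_NULL, x);CHKERRQ(ierr);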
