On Apr 4, 2012, at 3:21 PM, Yuqi Wu wrote:

> I have three SYMBOLIC factorizations and three NUMERIC factorizations done
> in my program.
>
> I didn't use the SAME_NONZERO_PATTERN flag that is provided in the compute
> Jacobian routine. If I use the flag, then two SYMBOLIC factorizations and
> three NUMERIC factorizations are done in my program.
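For reference, the flag being discussed here is the MatStructure value returned
by the 2012-era (pre-PETSc-3.5) SNES Jacobian callback. The sketch below only
illustrates where the flag is set; FormJacobian and the entry-filling step are
placeholders, not the code from this thread.

  #include <petscsnes.h>

  /* Sketch of a pre-PETSc-3.5 Jacobian callback that reports an unchanged
     sparsity pattern, so only the numeric factorization is redone. */
  PetscErrorCode FormJacobian(SNES snes,Vec x,Mat *J,Mat *B,MatStructure *flag,void *ctx)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    /* ... insert the new numerical values into *B; the sparsity pattern is
       assumed identical on every call ... */
    ierr = MatAssemblyBegin(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(*B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    if (*J != *B) {
      ierr = MatAssemblyBegin(*J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(*J,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    }
    /* Reuse the previous SYMBOLIC factorization; only the NUMERIC one is redone. */
    *flag = SAME_NONZERO_PATTERN;
    PetscFunctionReturn(0);
  }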
2 SYMB and 3 (or it seems 4 to me) NUMERIC factorizations.

Mark

> For the -info output:
>
> I got no messages with "Setting PC with identical preconditioner".
>
> I got five messages with "Setting up new PC".
>
> I got no messages with "Setting up PC with same nonzero pattern".
>
> I got three messages with "Setting up PC with different nonzero pattern".
>
> For the messages with "Setting up new PC", I got the following output:
>
> [0] PCSetUp(): Setting up new PC
> [0] PCSetUp_MG(): Using outer operators to define finest grid operator because PCMGGetSmoother(pc,nlevels-1,&ksp);KSPSetOperators(ksp,...); was not called.
> [0] MatGetSymbolicTranspose_SeqAIJ(): Getting Symbolic Transpose.
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
> [0] MatRestoreSymbolicTranspose_SeqAIJ(): Restoring Symbolic Transpose.
> [0] MatPtAPSymbolic_SeqAIJ_SeqAIJ(): Reallocs 1; Fill ratio: given 1 needed 1.43239.
> [0] MatPtAPSymbolic_SeqAIJ_SeqAIJ(): Use MatPtAP(A,P,MatReuse,1.43239,&C) for best performance.
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] VecScatterCreate(): Special case: sequential vector general to stride
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11585 X 11585; storage space: 0 unneeded,458097 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 168
> [0] Mat_CheckInode(): Found 8271 nodes of 11585. Limit used: 5. Using Inode routines
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] VecScatterCreate(): Special case: sequential vector general to stride
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
> 0 KSP Residual norm 1.014990964599e+02
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 2 Fill ratio:given 5 needed 11.401
> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 11.401 or use
> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,11.401);
> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
> [0] Mat_CheckInode_FactorLU(): Found 8057 nodes of 11585. Limit used: 5. Using Inode routines
> [0] KSPDefaultConverged(): Linear solver has converged. Residual norm 5.755920981112e-13 is less than relative tolerance 1.000000000000e-05 times initial right hand side norm 5.479558824115e+02 at iteration 1
> [0] PCSetUp(): Setting up new PC
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
> Residual norms for coarse_ solve.
> 0 KSP Residual norm 5.698312810532e-16
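The PCSetUp_MG() line in the output above refers to a call sequence along the
lines of the sketch below, which hands the finest-level smoother its operator
explicitly instead of letting MG fall back on the outer operators. It is only an
illustration of that message against the 2012-era (pre-PETSc-3.5) interface;
Jfine and the choice of SAME_NONZERO_PATTERN are assumptions, not code from
this thread.

  #include <petscksp.h>

  /* Sketch: set the finest-level smoother operator that the PCSetUp_MG()
     message mentions.  Jfine is a placeholder for the fine-grid Jacobian. */
  static PetscErrorCode SetFinestSmootherOperator(PC pc,Mat Jfine)
  {
    KSP            smooth;
    PetscInt       nlevels;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = PCMGGetLevels(pc,&nlevels);CHKERRQ(ierr);
    ierr = PCMGGetSmoother(pc,nlevels-1,&smooth);CHKERRQ(ierr);
    ierr = KSPSetOperators(smooth,Jfine,Jfine,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }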
> For the message with "Setting up PC with different nonzero pattern", I got the following outputs:
>
> [0] PCSetUp(): Setting up PC with different nonzero pattern
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] PCSetUp(): Setting up PC with different nonzero pattern
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 4186 X 4186; storage space: 0 unneeded,656174 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 410
> [0] Mat_CheckInode(): Found 1794 nodes of 4186. Limit used: 5. Using Inode routines
> 0 KSP Residual norm 9.922628060272e-05
> [0] PCSetUp(): Setting up PC with different nonzero pattern
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783
> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>
> Thank you so much for your help.
>
> Yuqi
>
>> Setting up new PC
>> Setting up PC with same nonzero pattern\
>> Setting up PC with different nonzero pattern\n
>
> ---- Original message ----
>> Date: Wed, 4 Apr 2012 13:55:55 -0500
>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>> Subject: Re: [petsc-users] Questions about PCMG
>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>
>> Note: In most applications the flag SAME_NONZERO_PATTERN is provided in the
>> compute Jacobian routine; this means that the SYMBOLIC factorization needs
>> to be done only ONCE per matrix, and only the numeric factorization needs
>> to be done when the nonzero values have changed (the symbolic need not be
>> repeated). Are you using this flag? How many times is the NUMERIC
>> factorization being done?
>>
>> When you run the program with -info it will print information of the form
>> (run on one process to make life simple):
>>
>> Setting PC with identical preconditioner\
>> Setting up new PC
>> Setting up PC with same nonzero pattern\
>> Setting up PC with different nonzero pattern\n
>>
>> How many, and exactly which, messages of this form are you getting?
>>
>> When all else fails you can run the program in the debugger to track what
>> is happening and why.
>>
>> Put a breakpoint in PCSetUp(), then each time it gets called use "next" to
>> step through it to see what is happening.
>>
>> First thing to check: is PCSetUp() getting called on each level for each
>> new SNES iteration?
>>
>> Second thing: if it is, then why is it not triggering the new numerical
>> factorization?
>>
>> Barry
>>
>> On Apr 4, 2012, at 1:34 PM, Yuqi Wu wrote:
>>
>>> Thanks, Adam.
>>>
>>> Yes, I am using Galerkin coarse grids. But I am not sure whether it is the
>>> coarse grid or the fine grid smoother that is not getting refactored in
>>> the second SNES solve.
>>>
>>> In the -info output attached in the previous email, the fine grid matrix
>>> is of size 11585 by 11585, and the coarse grid matrix is of size 4186 by
>>> 4186. In the -info output, I found three MatLUFactorSymbolic_SeqAIJ calls,
>>> one for the fine matrix and two for the coarse matrix.
>>>
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 2 Fill ratio:given 5 needed 11.401
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 11.401 or use
>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,11.401);
>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>> [0] Mat_CheckInode_FactorLU(): Found 8057 nodes of 11585. Limit used: 5. Using Inode routines
>>>
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>>
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Reallocs 1 Fill ratio:given 5 needed 7.07175
>>> [0] MatLUFactorSymbolic_SeqAIJ(): Run with -pc_factor_fill 7.07175 or use
>>> [0] MatLUFactorSymbolic_SeqAIJ(): PCFactorSetFill(pc,7.07175);
>>> [0] MatLUFactorSymbolic_SeqAIJ(): for best performance.
>>> [0] Mat_CheckInode_FactorLU(): Found 1764 nodes of 4186. Limit used: 5. Using Inode routines
>>>
>>> So I believe that it is the fine grid smoother that is not getting
>>> refactored in the second SNES solve.
>>>
>>> Best
>>>
>>> Yuqi
>>>
>>> ---- Original message ----
>>>> Date: Wed, 4 Apr 2012 14:24:28 -0400
>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of "Mark F. Adams" <mark.adams at columbia.edu>)
>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>
>>>> I would expect 4 calls to MatLUFactorSym here. It looks like the coarse
>>>> grid is not getting refactored in the second SNES solve.
>>>>
>>>> Are you using Galerkin coarse grids? Perhaps you are not setting a new
>>>> coarse grid with KSPSetOperators(), and so MG does not bother refactoring it.
>>>>
>>>> Mark
>>>>
>>>> On Apr 4, 2012, at 1:53 PM, Yuqi Wu wrote:
>>>>
>>>>> Thank you.
>>>>>
>>>>> Can I ask another question?
>>>>>
>>>>> My log summary output shows that, although there are two SNES iterations
>>>>> and a total of 9 linear iterations, the functions MatLUFactorSym and
>>>>> MatLUFactorNum are called only three times.
>>>>>
>>>>> MatLUFactorSym 3 1.0 1.4073e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01 1 0 0 0 2 1 0 0 0 2 0
>>>>> MatLUFactorNum 3 1.0 3.2754e+01 1.0 9.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 31 97 0 0 0 32 97 0 0 0 280
>>>>>
>>>>> I checked the -info output. It shows that one MatLUFactorSymbolic_SeqAIJ()
>>>>> is called in the down smoother of the first SNES iteration, one in the
>>>>> coarse solve of the first SNES iteration, and one in the down smoother of
>>>>> the second SNES iteration.
>>>>>
>>>>> Do you have any idea why there are 9 multigrid iterations, but only 3
>>>>> MatLUFactorSymbolic calls in the program?
>>>>>
>>>>> Best
>>>>>
>>>>> Yuqi
>>>>>
>>>>> ---- Original message ----
>>>>>> Date: Tue, 3 Apr 2012 20:08:27 -0500
>>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>>
>>>>>> There are two linear solves (for SNES iterations 1 and 2), so there are
>>>>>> two MGSetUp events on each level. Then there is a total of 9 multigrid
>>>>>> iterations (in both linear solves together), hence 9 smooths on level 0
>>>>>> (level 0 means the coarse grid solve). One smooth down and one smooth up
>>>>>> on level 1 gives 18 total smooths on level 1, and 9 computations of the
>>>>>> residual on level 1. There are 18 MGInterp events because that logs both
>>>>>> the restriction to level 0 and the interpolation back to level 1, and
>>>>>> 18 = 9 + 9.
>>>>>>
>>>>>> Barry
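Mark's suggestion above, resetting the coarse operator with KSPSetOperators()
so that MG refactors it, amounts to something like the sketch below, written
against the 2012-era (pre-PETSc-3.5) interface. It assumes the application
forms the Galerkin coarse matrix itself; A, P, Acoarse, and the fill value are
illustrative (1.43239 is just the ratio -info reported earlier in this thread),
not code from this thread.

  #include <petscksp.h>

  /* Sketch: rebuild or refresh the Galerkin coarse operator and hand it back
     to the coarse solve so PCSetUp() triggers a new factorization. */
  static PetscErrorCode RefreshCoarseOperator(PC pc,Mat A,Mat P,Mat *Acoarse,PetscBool first)
  {
    KSP            coarse;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    if (first) {   /* first SNES iteration: symbolic + numeric PtAP */
      ierr = MatPtAP(A,P,MAT_INITIAL_MATRIX,1.43239,Acoarse);CHKERRQ(ierr);
    } else {       /* later iterations: keep the sparsity pattern, refresh the values */
      ierr = MatPtAP(A,P,MAT_REUSE_MATRIX,1.43239,Acoarse);CHKERRQ(ierr);
    }
    ierr = PCMGGetCoarseSolve(pc,&coarse);CHKERRQ(ierr);
    ierr = KSPSetOperators(coarse,*Acoarse,*Acoarse,
                           first ? DIFFERENT_NONZERO_PATTERN : SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

With SAME_NONZERO_PATTERN passed on the later calls, only the numeric LU
factorization of the coarse matrix should be repeated, which is the behavior
being discussed above.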
>>>>>>
>>>>>> On Apr 3, 2012, at 7:57 PM, Yuqi Wu wrote:
>>>>>>
>>>>>>> Hi, Barry,
>>>>>>>
>>>>>>> Thank you. My program converges in two SNES iterations:
>>>>>>>
>>>>>>> 0 SNES norm 1.014991e+02, 0 KSP its (nan coarse its average), last norm 0.000000e+00
>>>>>>> 1 SNES norm 9.925218e-05, 4 KSP its (5.25 coarse its average), last norm 2.268574e-06.
>>>>>>> 2 SNES norm 1.397282e-09, 5 KSP its (5.20 coarse its average), last norm 1.312605e-12.
>>>>>>>
>>>>>>> And -pc_mg_log shows the following output:
>>>>>>>
>>>>>>> MGSetup Level 0 2 1.0 3.4091e-01 2.1 0.00e+00 0.0 3.0e+02 6.0e+04 3.0e+01 1 0 3 11 2 1 0 3 11 2 0
>>>>>>> MGSmooth Level 0 9 1.0 1.2126e+01 1.0 9.38e+08 3.2 2.8e+03 1.7e+03 6.4e+02 33 71 28 3 34 35 71 28 3 35 415
>>>>>>> MGSetup Level 1 2 1.0 1.3925e-01 2.1 0.00e+00 0.0 1.5e+02 3.1e+04 2.3e+01 0 0 1 3 1 0 0 1 3 1 0
>>>>>>> MGSmooth Level 1 18 1.0 5.8493e+00 1.0 3.66e+08 3.1 1.5e+03 2.9e+03 3.6e+02 16 28 15 3 19 17 28 15 3 19 339
>>>>>>> MGResid Level 1 9 1.0 1.1826e-01 1.4 1.49e+06 2.4 2.0e+02 2.7e+03 9.0e+00 0 0 2 0 0 0 0 2 0 0 70
>>>>>>> MGInterp Level 1 18 1.0 1.2317e-01 1.3 7.74e+05 2.2 3.8e+02 1.1e+03 1.8e+01 0 0 4 0 1 0 0 4 0 1 37
>>>>>>>
>>>>>>> What do the MGSmooth, MGResid, and MGInterp events represent?
>>>>>>>
>>>>>>> Best
>>>>>>>
>>>>>>> Yuqi
>>>>>>>
>>>>>>> ---- Original message ----
>>>>>>>> Date: Tue, 3 Apr 2012 19:19:23 -0500
>>>>>>>> From: petsc-users-bounces at mcs.anl.gov (on behalf of Barry Smith <bsmith at mcs.anl.gov>)
>>>>>>>> Subject: Re: [petsc-users] Questions about PCMG
>>>>>>>> To: PETSc users list <petsc-users at mcs.anl.gov>
>>>>>>>>
>>>>>>>> -pc_mg_log doesn't have anything to do with DA or DMMG; it is part of
>>>>>>>> the basic PCMG. Are you sure you are calling SNESSetFromOptions()?
>>>>>>>>
>>>>>>>> Barry
>>>>>>>>
>>>>>>>> On Apr 3, 2012, at 6:56 PM, Yuqi Wu wrote:
>>>>>>>>
>>>>>>>>> Hi, Mark,
>>>>>>>>>
>>>>>>>>> Thank you so much for your suggestion.
>>>>>>>>>
>>>>>>>>> Problem 1 is resolved by avoiding the call to PCMGSetNumberSmoothUp.
>>>>>>>>>
>>>>>>>>> But since I am using an unstructured grid in my application, I didn't
>>>>>>>>> use DA or DMMG, so -pc_mg_log didn't give any level information. I
>>>>>>>>> tried running my code with -info on 1 processor, and I found some
>>>>>>>>> interesting issues.
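Barry's question about SNESSetFromOptions() refers to a setup step along the
lines of the sketch below (again the 2012-era interface; SetupSolver and the
omitted SNESSetFunction/SNESSetJacobian calls are placeholders). Without this
call, command-line options such as -pc_mg_log or -snes_monitor are never
applied to the solver.

  #include <petscsnes.h>

  /* Sketch: create the SNES, select PCMG, and let the options database
     (-pc_mg_log, -pc_mg_levels, -snes_monitor, ...) configure it. */
  static PetscErrorCode SetupSolver(MPI_Comm comm,SNES *snes)
  {
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = SNESCreate(comm,snes);CHKERRQ(ierr);
    /* ... SNESSetFunction(), SNESSetJacobian(), PCMGSetLevels(), etc. ... */
    ierr = SNESGetKSP(*snes,&ksp);CHKERRQ(ierr);
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = PCSetType(pc,PCMG);CHKERRQ(ierr);
    /* This is the call that actually picks up -pc_mg_log and friends. */
    ierr = SNESSetFromOptions(*snes);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }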
