> On Aug 4, 2016, at 9:52 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>
>    The magic handling of _1_ etc is all done in PetscOptionsFindPair_Private() so you need to put a break point in that routine and see why the requested value is not located.
I haven’t tracked down the source of the problem with using _1_ etc, but I have checked to see what happens if I switch between basic/restrict/interpolate/none “manually” on each level, and I still see the same results for all choices. I’ve checked the IS’es and am reasonably confident that they are being generated correctly for the “overlap” and “non-overlap” regions. It is definitely the case that the overlap region contains the non-overlap region, and that the overlap region is bigger (by the proper amount) than the non-overlap region. (Sketches of the kind of subdomain setup we are doing, and of how the per-level solver prefixes are assigned, are appended at the very bottom of this message, below the quoted thread; both are illustrative rather than verbatim code.)

It looks like ksp/ksp/examples/tutorials/ex8.c uses PCASMSetLocalSubdomains to set up the subdomains for ASM. If I run this example using, e.g.,

./ex8 -m 100 -n 100 -Mdomains 8 -Ndomains 8 -user_set_subdomains -ksp_rtol 1.0e-3 -ksp_monitor -pc_asm_type XXXX

I get exactly the same results for all of the different ASM types. I checked (using -ksp_view) that the ASM type settings were being honored. Are these subdomains not being set up to include overlaps (in which case I guess all ASM versions would yield the same results)?

Thanks,

— Boyce

>
>    Barry
>
>
>> On Aug 4, 2016, at 9:46 PM, Boyce Griffith <griff...@cims.nyu.edu> wrote:
>>
>>
>>> On Aug 4, 2016, at 9:41 PM, Boyce Griffith <griff...@cims.nyu.edu> wrote:
>>>
>>>
>>>> On Aug 4, 2016, at 9:26 PM, Boyce Griffith <griff...@cims.nyu.edu> wrote:
>>>>
>>>>>
>>>>> On Aug 4, 2016, at 9:01 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>>>>>
>>>>>>
>>>>>> On Aug 4, 2016, at 8:51 PM, Boyce Griffith <griff...@cims.nyu.edu> wrote:
>>>>>>
>>>>>>>
>>>>>>> On Aug 4, 2016, at 8:42 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>
>>>>>>>    History:
>>>>>>>
>>>>>>>    1) I originally implemented the ASM with one subdomain per process.
>>>>>>>    2) Easily extended to support multiple subdomains per process.
>>>>>>>    3) Added -pc_asm_type restrict etc., but it only worked for one subdomain per process because it took advantage of the fact that restrict etc. could be achieved by simply dropping the parallel communication in the vector scatters.
>>>>>>>    4) Matt didn't like the restriction to one subdomain per process, so he added an additional argument to PCASMSetLocalSubdomains() that allowed passing in the overlapping and non-overlapping regions of each domain (foolishly calling the non-overlapping index set is_local, even though it has nothing to do with locality), so that restrict etc. could be handled.
>>>>>>>
>>>>>>>    Unfortunately, IMHO Matt made a mess of things, because if you use things like -pc_asm_blocks n or -pc_asm_overlap 1 it does not handle -pc_asm_type restrict, since it cannot track is vs. is_local. The code needs to be refactored so that things like -pc_asm_blocks and -pc_asm_overlap 1 can track the is vs. is_local index sets properly when -pc_asm_type is set. Also, the name is_local needs to be changed to something meaningful like is_nonoverlapping. This refactoring would also result in easier, cleaner code than is currently there.
>>>>>>>
>>>>>>>    So basically, until PCASM is refactored properly to handle restrict etc., you are stuck with being able to use restrict etc. ONLY if you specifically supply the overlapping and non-overlapping subdomains yourself with PCASMSetLocalSubdomains, and curse at Matt every day like we all do.
>>>>>>
>>>>>> OK, got it.
>>>>>> The reason I’m asking is that we are using PCASM in a custom smoother, and I noticed that basic/restrict/interpolate/none all give identical results. We are using PCASMSetLocalSubdomains to set up the subdomains.
>>>>>
>>>>>    But are you setting different is and is_local (stupid name) and not having PETSc compute the overlap in your custom code? If you are setting them differently and not having PETSc compute the overlap, but are getting identical convergence, then something is wrong, and you likely have to run in the debugger to ensure that restrict etc. is properly being set and used.
>>>>
>>>> Yes, we are computing overlapping and non-overlapping IS’es.
>>>>
>>>> I just double-checked, and somehow the ASMType setting is not making it from the command line into the solver configuration — sorry, I should have checked this more carefully before emailing the list. (I thought that the command-line options were being captured correctly, since I am able to control the PC type and all of the sub-KSP/sub-PC settings.)
>>>
>>> OK, so here is what appears to be happening. These solvers are named things like “stokes_pc_level_0_”, “stokes_pc_level_1_”, … If I use the command-line argument
>>>
>>> -stokes_ib_pc_level_0_pc_asm_type basic
>>>
>>> then the ASM settings are used, but if I do:
>>>
>>> -stokes_ib_pc_level_pc_asm_type basic
>>>
>>> they are ignored. Any ideas? :-)
>>
>> I should have said: we are playing around with a lot of different command-line options that are being collectively applied to all of the level solvers, and these options for ASM are the only ones I’ve encountered so far that have to include the level number to have an effect.
>>
>> Thanks,
>>
>> — Boyce
>>
>>>
>>> Thanks,
>>>
>>> — Boyce
>>>
>>>>>> BTW, there is also this bit (which was easy to overlook in all of the repetitive convergence histories):
>>>>>
>>>>>    Yeah, better one question per email or we will miss them.
>>>>>
>>>>>    There is nothing that says that multiplicative will ALWAYS beat additive, though intuitively you expect it to.
>>>>
>>>> OK, so a similar story to the above: we have a custom MSM that, when used as an MG smoother, gives convergence rates that are about 2x those of PCASM, whereas when we use PCASM with MULTIPLICATIVE, it doesn’t seem to help.
>>>>
>>>> However, now I am questioning whether the settings are getting propagated into PCASM… I’ll need to take another look.
>>>>
>>>> Thanks,
>>>>
>>>> — Boyce
>>>>
>>>>>
>>>>>    Barry
>>>>>
>>>>>>
>>>>>>>> Also, the MULTIPLICATIVE variant does not seem to behave as I would expect --- for this same example, if you switch from ADDITIVE to MULTIPLICATIVE, the solver converges slightly more slowly:
>>>>>>>>
>>>>>>>> $ ./ex2 -m 32 -n 32 -pc_type asm -pc_asm_blocks 8 -ksp_view -ksp_monitor_true_residual -pc_asm_local_type MULTIPLICATIVE
>>>>>>>>   0 KSP preconditioned resid norm 7.467363913958e+00 true resid norm 1.166190378969e+01 ||r(i)||/||b|| 1.000000000000e+00
>>>>>>>>   1 KSP preconditioned resid norm 2.878371937592e+00 true resid norm 3.646367718253e+00 ||r(i)||/||b|| 3.126734522949e-01
>>>>>>>>   2 KSP preconditioned resid norm 1.666575161021e+00 true resid norm 1.940699059619e+00 ||r(i)||/||b|| 1.664135714560e-01
>>>>>>>>   3 KSP preconditioned resid norm 1.086140238220e+00 true resid norm 1.191473615464e+00 ||r(i)||/||b|| 1.021680196433e-01
>>>>>>>>   4 KSP preconditioned resid norm 7.939217314942e-01 true resid norm 8.059317628307e-01 ||r(i)||/||b|| 6.910807852344e-02
>>>>>>>>   5 KSP preconditioned resid norm 6.265169154675e-01 true resid norm 5.942294290555e-01 ||r(i)||/||b|| 5.095475316653e-02
>>>>>>>>   6 KSP preconditioned resid norm 5.164999302721e-01 true resid norm 4.585844476718e-01 ||r(i)||/||b|| 3.932329197203e-02
>>>>>>>>   7 KSP preconditioned resid norm 4.472399844370e-01 true resid norm 3.884049472908e-01 ||r(i)||/||b|| 3.330544946136e-02
>>>>>>>>   8 KSP preconditioned resid norm 3.445446366213e-01 true resid norm 4.008290378967e-01 ||r(i)||/||b|| 3.437080644166e-02
>>>>>>>>   9 KSP preconditioned resid norm 1.987509894375e-01 true resid norm 2.619628925380e-01 ||r(i)||/||b|| 2.246313271505e-02
>>>>>>>>  10 KSP preconditioned resid norm 1.084551743751e-01 true resid norm 1.354891040098e-01 ||r(i)||/||b|| 1.161809481995e-02
>>>>>>>>  11 KSP preconditioned resid norm 6.108303419460e-02 true resid norm 7.252267103275e-02 ||r(i)||/||b|| 6.218767736436e-03
>>>>>>>>  12 KSP preconditioned resid norm 3.641579250431e-02 true resid norm 4.069996187932e-02 ||r(i)||/||b|| 3.489992938829e-03
>>>>>>>>  13 KSP preconditioned resid norm 2.424898818735e-02 true resid norm 2.469590201945e-02 ||r(i)||/||b|| 2.117656127577e-03
>>>>>>>>  14 KSP preconditioned resid norm 1.792399391125e-02 true resid norm 1.622090905110e-02 ||r(i)||/||b|| 1.390931475995e-03
>>>>>>>>  15 KSP preconditioned resid norm 1.320657155648e-02 true resid norm 1.336753101147e-02 ||r(i)||/||b|| 1.146256327657e-03
>>>>>>>>  16 KSP preconditioned resid norm 7.398524571182e-03 true resid norm 9.747691680405e-03 ||r(i)||/||b|| 8.358576657974e-04
>>>>>>>>  17 KSP preconditioned resid norm 3.043993613039e-03 true resid norm 3.848714422908e-03 ||r(i)||/||b|| 3.300245390731e-04
>>>>>>>>  18 KSP preconditioned resid norm 1.767867968946e-03 true resid norm 1.736586340170e-03 ||r(i)||/||b|| 1.489110501585e-04
>>>>>>>>  19 KSP preconditioned resid norm 1.088792656005e-03 true resid norm 1.307506936484e-03 ||r(i)||/||b|| 1.121177948355e-04
>>>>>>>>  20 KSP preconditioned resid norm 4.622653682144e-04 true resid norm 5.718427718734e-04 ||r(i)||/||b|| 4.903511315013e-05
>>>>>>>>  21 KSP preconditioned resid norm 2.591703287585e-04 true resid norm 2.690982547548e-04 ||r(i)||/||b|| 2.307498497738e-05
>>>>>>>>  22 KSP preconditioned resid norm 1.596527181997e-04 true resid norm 1.715846687846e-04 ||r(i)||/||b|| 1.471326396435e-05
>>>>>>>>  23 KSP preconditioned resid norm 1.006766623019e-04 true resid norm 1.044525361282e-04 ||r(i)||/||b|| 8.956731080268e-06
>>>>>>>>  24 KSP preconditioned resid norm 5.349814270060e-05 true resid norm 6.598682341705e-05 ||r(i)||/||b|| 5.658323427037e-06
>>>>>>>> KSP Object: 1 MPI processes
>>>>>>>>   type: gmres
>>>>>>>>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>>>>     GMRES: happy breakdown tolerance 1e-30
>>>>>>>>   maximum iterations=10000, initial guess is zero
>>>>>>>>   tolerances:  relative=9.18274e-06, absolute=1e-50, divergence=10000.
>>>>>>>>   left preconditioning
>>>>>>>>   using PRECONDITIONED norm type for convergence test
>>>>>>>> PC Object: 1 MPI processes
>>>>>>>>   type: asm
>>>>>>>>     Additive Schwarz: total subdomain blocks = 8, amount of overlap = 1
>>>>>>>>     Additive Schwarz: restriction/interpolation type - BASIC
>>>>>>>>     Additive Schwarz: local solve composition type - MULTIPLICATIVE
>>>>>>>>     Local solve is same for all blocks, in the following KSP and PC objects:
>>>>>>>>   KSP Object: (sub_) 1 MPI processes
>>>>>>>>     type: preonly
>>>>>>>>     maximum iterations=10000, initial guess is zero
>>>>>>>>     tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>>>>     left preconditioning
>>>>>>>>     using NONE norm type for convergence test
>>>>>>>>   PC Object: (sub_) 1 MPI processes
>>>>>>>>     type: icc
>>>>>>>>       0 levels of fill
>>>>>>>>       tolerance for zero pivot 2.22045e-14
>>>>>>>>       using Manteuffel shift [POSITIVE_DEFINITE]
>>>>>>>>       matrix ordering: natural
>>>>>>>>       factor fill ratio given 1., needed 1.
>>>>>>>>         Factored matrix follows:
>>>>>>>>           Mat Object: 1 MPI processes
>>>>>>>>             type: seqsbaij
>>>>>>>>             rows=160, cols=160
>>>>>>>>             package used to perform factorization: petsc
>>>>>>>>             total: nonzeros=443, allocated nonzeros=443
>>>>>>>>             total number of mallocs used during MatSetValues calls =0
>>>>>>>>               block size is 1
>>>>>>>>     linear system matrix = precond matrix:
>>>>>>>>     Mat Object: 1 MPI processes
>>>>>>>>       type: seqaij
>>>>>>>>       rows=160, cols=160
>>>>>>>>       total: nonzeros=726, allocated nonzeros=726
>>>>>>>>       total number of mallocs used during MatSetValues calls =0
>>>>>>>>         not using I-node routines
>>>>>>>>   linear system matrix = precond matrix:
>>>>>>>>   Mat Object: 1 MPI processes
>>>>>>>>     type: seqaij
>>>>>>>>     rows=1024, cols=1024
>>>>>>>>     total: nonzeros=4992, allocated nonzeros=5120
>>>>>>>>     total number of mallocs used during MatSetValues calls =0
>>>>>>>>       not using I-node routines
>>>>>>>> Norm of error 0.000292304 iterations 24
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -- Boyce
>>>
>>
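P.S. For concreteness, here is a minimal sketch of the kind of subdomain setup being discussed above: both the overlapping index sets ("is") and the non-overlapping index sets ("is_local") are built explicitly and handed to PCASMSetLocalSubdomains, so that -pc_asm_type basic/restrict/interpolate/none can actually differ. The problem (a small 1-D Laplacian on one process, two subdomains, one grid point of overlap) and all of the sizes are made up purely for illustration; this is not our actual code.

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PC             pc;
  IS             is[2], is_local[2];
  PetscInt       i, n = 16;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Assemble a 1-D Laplacian (illustrative problem only). */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  for (i = 0; i < n; ++i) {
    if (i > 0)     {ierr = MatSetValue(A, i, i-1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    if (i < n - 1) {ierr = MatSetValue(A, i, i+1, -1.0, INSERT_VALUES);CHKERRQ(ierr);}
    ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Non-overlapping subdomains: [0,8) and [8,16).
     Overlapping subdomains:     [0,9) and [7,16), i.e. one extra point each. */
  ierr = ISCreateStride(PETSC_COMM_SELF, 9, 0, 1, &is[0]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF, 9, 7, 1, &is[1]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF, 8, 0, 1, &is_local[0]);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_SELF, 8, 8, 1, &is_local[1]);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCASM);CHKERRQ(ierr);
  /* Hand PCASM both the overlapping ("is") and non-overlapping ("is_local")
     index sets, and do not ask PETSc to extend the overlap any further. */
  ierr = PCASMSetLocalSubdomains(pc, 2, is, is_local);CHKERRQ(ierr);
  ierr = PCASMSetOverlap(pc, 0);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

  ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  ierr = ISDestroy(&is[0]);CHKERRQ(ierr);
  ierr = ISDestroy(&is[1]);CHKERRQ(ierr);
  ierr = ISDestroy(&is_local[0]);CHKERRQ(ierr);
  ierr = ISDestroy(&is_local[1]);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Running this with -pc_asm_type basic versus -pc_asm_type restrict (plus -ksp_monitor) should give different iteration histories, since is and is_local actually differ.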
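P.P.S. Regarding the per-level option prefixes mentioned above: the level solvers get their prefixes along the following lines. The function name, the level_ksp array, and nlevels are made up for this sketch (they are not the names in our actual code); only the prefix pattern "stokes_ib_pc_level_<k>_" corresponds to the options discussed above.

#include <petscksp.h>

/* Illustrative only: give each level's smoother KSP a numbered prefix such as
   "stokes_ib_pc_level_0_", "stokes_ib_pc_level_1_", ... so that, e.g.,
   -stokes_ib_pc_level_0_pc_asm_type basic reaches the level-0 solver. */
PetscErrorCode SetLevelSolverPrefixes(KSP level_ksp[], PetscInt nlevels)
{
  char           prefix[64];
  PetscInt       l;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  for (l = 0; l < nlevels; ++l) {
    ierr = PetscSNPrintf(prefix, sizeof(prefix), "stokes_ib_pc_level_%D_", l);CHKERRQ(ierr);
    ierr = KSPSetOptionsPrefix(level_ksp[l], prefix);CHKERRQ(ierr);
    /* Pick up any -stokes_ib_pc_level_<l>_* options from the options database. */
    ierr = KSPSetFromOptions(level_ksp[l]);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}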