Re: [petsc-users] strange convergence

Hoang Giang Bui Sat, 29 Apr 2017 14:59:56 -0700

Thanks Barry

Running with that option gives the output for the first solve:


BoomerAMG SETUP PARAMETERS:

 Max levels = 25
 Num levels = 7

 Strength Threshold = 0.100000
 Interpolation Truncation Factor = 0.000000
 Maximum Row Sum Threshold for Dependency Weakening = 0.900000

 Coarsening Type = PMIS
 measures are determined locally


 No global partition option chosen.

 Interpolation = modified classical interpolation

Operator Matrix Information:

            nonzero         entries per row        row sums
lev   rows  entries  sparse  min  max   avg       min         max
===================================================================
 0 1056957 109424691  0.000    30 1617  103.5  -2.075e+11   3.561e+11
 1  185483 33504881  0.001    17  713  180.6  -3.493e+11   1.323e+13
 2   26295  4691629  0.007    17  513  178.4  -3.367e+10   6.960e+12
 3    3438   432138  0.037    24  295  125.7  -2.194e+10   2.154e+11
 4     476    34182  0.151     8  192  71.8  -6.435e+09   2.306e+11
 5      84     2410  0.342     8   70  28.7  -1.052e+07   6.640e+10
 6      18      252  0.778    10   18  14.0   9.038e+06   8.828e+10


Interpolation Matrix Information:
                 entries/row    min     max         row sums
lev  rows cols    min max     weight   weight     min       max
=================================================================
 0 1056957 x 185483   0  18  -1.143e+02 7.741e+01 -1.143e+02 7.741e+01
 1 185483 x 26295   0  15  -1.053e+01 2.918e+00 -1.053e+01 2.918e+00
 2 26295 x 3438    0   9   1.308e-02 1.036e+00 0.000e+00 1.058e+00
 3  3438 x 476     0   7   1.782e-02 1.015e+00 0.000e+00 1.015e+00
 4   476 x 84      0   5   1.378e-02 1.000e+00 0.000e+00 1.000e+00
 5    84 x 18      0   3   1.330e-02 1.000e+00 0.000e+00 1.000e+00


     Complexity:    grid = 1.204165
                operator = 1.353353
                memory = 1.381360

BoomerAMG SOLVER PARAMETERS:

  Maximum number of cycles:         1
  Stopping Tolerance:               0.000000e+00
  Cycle type (1 = V, 2 = W, etc.):  1

  Relaxation Parameters:
   Visiting Grid:                     down   up  coarse
            Number of sweeps:            1    1     1
   Type 0=Jac, 3=hGS, 6=hSGS, 9=GE:      6    6     6
   Point types, partial sweeps (1=C, -1=F):
                  Pre-CG relaxation (down):   1  -1
                   Post-CG relaxation (up):  -1   1
                             Coarsest grid:   0

Output flag (print_level): 3
                                            relative
               residual        factor       residual
               --------        ------       --------
    Initial    9.006493e+06                 1.000000e+00
    Cycle  1   7.994266e+06    0.887611     8.876114e-01


 Average Convergence Factor = 0.887611

     Complexity:    grid = 1.204165
                operator = 1.353353
                   cycle = 2.706703

KSP Object:(fieldsplit_u_) 8 MPI processes
  type: preonly
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
  left preconditioning
  using NONE norm type for convergence test
PC Object:(fieldsplit_u_) 8 MPI processes
  type: hypre
    HYPRE BoomerAMG preconditioning
    HYPRE BoomerAMG: Cycle type V
    HYPRE BoomerAMG: Maximum number of levels 25
    HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
    HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
    HYPRE BoomerAMG: Threshold for strong coupling 0.1
    HYPRE BoomerAMG: Interpolation truncation factor 0
    HYPRE BoomerAMG: Interpolation: max elements per row 0
    HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
    HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
    HYPRE BoomerAMG: Maximum row sums 0.9
    HYPRE BoomerAMG: Sweeps down         1
    HYPRE BoomerAMG: Sweeps up           1
    HYPRE BoomerAMG: Sweeps on coarse    1
    HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
    HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
    HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
    HYPRE BoomerAMG: Relax weight  (all)      1
    HYPRE BoomerAMG: Outer relax weight (all) 1
    HYPRE BoomerAMG: Using CF-relaxation
    HYPRE BoomerAMG: Measure type        local
    HYPRE BoomerAMG: Coarsen type        PMIS
    HYPRE BoomerAMG: Interpolation type  classical
  linear system matrix = precond matrix:
  Mat Object:  (fieldsplit_u_)   8 MPI processes
    type: mpiaij
    rows=1056957, cols=1056957, bs=3
    total: nonzeros=1.09425e+08, allocated nonzeros=1.09425e+08
    total number of mallocs used during MatSetValues calls =0
      using I-node (on process 0) routines: found 43537 nodes, limit used is 5
  0 KSP preconditioned resid norm 4.076033642262e+00 true resid norm 9.006493083033e+06 ||r(i)||/||b|| 1.000000000000e+00
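
For reference, the complexity and convergence figures in the BoomerAMG output above can be re-derived from the printed tables. This is only a minimal sketch in Python with the numbers copied from the output. The variable names are mine, and the definitions (total over all levels divided by the finest-level value) are the usual AMG ones, which I am assuming hypre uses here.

```python
# Rows and nonzero entries per level, copied from "Operator Matrix Information".
rows = [1056957, 185483, 26295, 3438, 476, 84, 18]
nnz = [109424691, 33504881, 4691629, 432138, 34182, 2410, 252]

# Assumed definitions: totals over all levels divided by the finest-level value.
grid_complexity = sum(rows) / rows[0]
operator_complexity = sum(nnz) / nnz[0]

# Residual norms before and after the single V-cycle, from the solver output.
r_initial = 9.006493e+06
r_cycle1 = 7.994266e+06
convergence_factor = r_cycle1 / r_initial

print(f"grid complexity     = {grid_complexity:.6f}")
print(f"operator complexity = {operator_complexity:.6f}")
print(f"convergence factor  = {convergence_factor:.6f}")
```

A factor of about 0.89 means the single V-cycle removes only about 11% of the residual for the u block.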



Giang

On Sat, Apr 29, 2017 at 8:06 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:

>
> > On Apr 29, 2017, at 8:34 AM, Jed Brown <j...@jedbrown.org> wrote:
> >
> > Hoang Giang Bui <hgbk2...@gmail.com> writes:
> >
> >> Hi Barry
> >>
> >> The first block is from a standard solid mechanics discretization based
> >> on the balance of momentum equation. There is some material involved but in
> >> principle it's a well-posed elasticity equation with a positive definite
> >> tangent operator. The "gluing business" uses the mortar method to keep
> >> the continuity of displacement. Instead of using Lagrange multipliers to
> >> treat the constraint, I used a penalty method to penalize the energy. The
> >> discretized form of the mortar term is quite simple:
> >>
> >> \int_{\Gamma_1} \rho (\delta u_1 - \delta u_2) \cdot (u_1 - u_2) dA
> >>
> >> where \rho is the penalty parameter. In the simulation I initially set it
> >> low (~E) to preserve the conditioning of the system.
> >
> > There are two things that can go wrong here with AMG:
> >
> > * The penalty term can mess up the strength of connection heuristics
> >  such that you get poor choice of C-points (classical AMG like
> >  BoomerAMG) or poor choice of aggregates (smoothed aggregation).
> >
> > * The penalty term can prevent Jacobi smoothing from being effective; in
> >  this case, it can lead to poor coarse basis functions (higher energy
> >  than they should be) and poor smoothing in an MG cycle.  You can fix
> >  the poor smoothing in the MG cycle by using a stronger smoother, like
> >  ASM with some overlap.
> >
> > I'm generally not a fan of penalty methods due to the irritating
> > tradeoffs and often poor solver performance.
>
>    So, let's first see what hypre BoomerAMG is doing with the system. Run
> for just one BoomerAMG solve with the additional options
>
> -fieldsplit_u_ksp_view -fieldsplit_u_pc_hypre_boomeramg_print_statistics
>
> this should print a good amount of information of what BoomerAMG has
> decided to do based on the input matrix.
>
> I'm bringing the hypre team into the conversation since they obviously
> know far more about BoomerAMG tuning options that may help your case.
>
>    Barry
>
>
>
> >
> >> In the figure below, the colorful blocks are u_1 and the base is u_2. Both
> >> u_1 and u_2 use isoparametric quadratic approximation.
> >>
> >> 
> >> Snapshot.png
> >> <https://drive.google.com/file/d/0Bw8Hmu0-YGQXc2hKQ1BhQ1I4OEU/view?usp=drive_web>
> >> 
> >>
> >> Giang
> >>
> >> On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >>
> >>>
> >>>  Ok, so boomerAMG algebraic multigrid is not good for the first block.
> >>> You mentioned the first block has two things glued together? AMG is
> >>> fantastic for certain problems but doesn't work for everything.
> >>>
> >>>   Tell us more about the first block, what PDE it comes from, what
> >>> discretization, and what the "gluing business" is and maybe we'll have
> >>> suggestions for how to precondition it.
> >>>
> >>>   Barry
> >>>
> >>>> On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui <hgbk2...@gmail.com>
> wrote:
> >>>>
> >>>> It's in fact quite good
> >>>>
> >>>>    Residual norms for fieldsplit_u_ solve.
> >>>>    0 KSP Residual norm 4.014715925568e+00
> >>>>    1 KSP Residual norm 2.160497019264e-10
> >>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>  0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm
> >>> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>    Residual norms for fieldsplit_u_ solve.
> >>>>    0 KSP Residual norm 9.999999999416e-01
> >>>>    1 KSP Residual norm 7.118380416383e-11
> >>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>  1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm
> >>> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11
> >>>> Linear solve converged due to CONVERGED_ATOL iterations 1
> >>>>
> >>>> Giang
> >>>>
> >>>> On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith <bsm...@mcs.anl.gov>
> wrote:
> >>>>
> >>>>  Run again using LU on both blocks to see what happens.
> >>>>
> >>>>
> >>>>> On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui <hgbk2...@gmail.com>
> >>> wrote:
> >>>>>
> >>>>> I have changed the way the nonconforming meshes are tied. The matrix
> >>>>> now seems to be better.
> >>>>>
> >>>>> with -pc_type lu  the output is
> >>>>>  0 KSP preconditioned resid norm 3.308678584240e-01 true resid norm
> >>> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>  1 KSP preconditioned resid norm 2.004313395301e-12 true resid norm
> >>> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12
> >>>>> Linear solve converged due to CONVERGED_ATOL iterations 1
> >>>>>
> >>>>>
> >>>>> with -pc_type fieldsplit -fieldsplit_u_pc_type hypre
> >>>>> -fieldsplit_wp_pc_type lu the convergence is slow
> >>>>>  0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm
> >>> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>  1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm
> >>> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00
> >>>>> ...
> >>>>> 824 KSP preconditioned resid norm 1.018542387738e-09 true resid norm
> >>> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05
> >>>>> 825 KSP preconditioned resid norm 9.743727947637e-10 true resid norm
> >>> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05
> >>>>> Linear solve converged due to CONVERGED_ATOL iterations 825
> >>>>>
> >>>>> checking with additional -fieldsplit_u_ksp_type richardson
> >>>>> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1
> >>>>> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor
> >>>>> -fieldsplit_wp_ksp_max_it 1 gives
> >>>>>
> >>>>>  0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm
> >>> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>    0 KSP Residual norm 5.803507549280e-01
> >>>>>    1 KSP Residual norm 2.069538175950e-01
> >>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>  1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm
> >>> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00
> >>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>    0 KSP Residual norm 7.831796195225e-01
> >>>>>    1 KSP Residual norm 1.734608520110e-01
> >>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>> ....
> >>>>> 823 KSP preconditioned resid norm 1.065070135605e-09 true resid norm
> >>> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05
> >>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>    0 KSP Residual norm 6.113806394327e-01
> >>>>>    1 KSP Residual norm 1.535465290944e-01
> >>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>> 824 KSP preconditioned resid norm 1.018542387746e-09 true resid norm
> >>> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05
> >>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>    0 KSP Residual norm 6.123437055586e-01
> >>>>>    1 KSP Residual norm 1.524661826133e-01
> >>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>> 825 KSP preconditioned resid norm 9.743727947718e-10 true resid norm
> >>> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05
> >>>>> Linear solve converged due to CONVERGED_ATOL iterations 825
> >>>>>
> >>>>>
> >>>>> The residual for the wp block is zero since the rhs is zero in this
> >>>>> first step. As can be seen in the output, the multigrid does not reduce
> >>>>> the residual well in the sub-solve. Is my observation right? What can
> >>>>> be done to improve this?
> >>>>>
> >>>>>
> >>>>> Giang
> >>>>>
> >>>>> On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith <bsm...@mcs.anl.gov>
> >>> wrote:
> >>>>>
> >>>>>   This can happen if the matrix is singular or nearly singular, or if
> >>>>> the factorization generates small pivots, which can occur even for
> >>>>> nonsingular problems if the matrix is poorly scaled or just plain nasty.
> >>>>>
> >>>>>
> >>>>>> On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui <hgbk2...@gmail.com>
> >>> wrote:
> >>>>>>
> >>>>>> It took a while, here I send you the output
> >>>>>>
> >>>>>>  0 KSP preconditioned resid norm 3.129073545457e+05 true resid norm
> >>> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>>  1 KSP preconditioned resid norm 7.442444222843e-01 true resid norm
> >>> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05
> >>>>>>  2 KSP preconditioned resid norm 3.267453132529e-07 true resid norm
> >>> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06
> >>>>>>  3 KSP preconditioned resid norm 1.155046883816e-11 true resid norm
> >>> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06
> >>>>>> Linear solve converged due to CONVERGED_ATOL iterations 3
> >>>>>> KSP Object: 4 MPI processes
> >>>>>>  type: gmres
> >>>>>>    GMRES: restart=1000, using Modified Gram-Schmidt
> >>> Orthogonalization
> >>>>>>    GMRES: happy breakdown tolerance 1e-30
> >>>>>>  maximum iterations=1000, initial guess is zero
> >>>>>>  tolerances:  relative=1e-20, absolute=1e-09, divergence=10000
> >>>>>>  left preconditioning
> >>>>>>  using PRECONDITIONED norm type for convergence test
> >>>>>> PC Object: 4 MPI processes
> >>>>>>  type: lu
> >>>>>>    LU: out-of-place factorization
> >>>>>>    tolerance for zero pivot 2.22045e-14
> >>>>>>    matrix ordering: natural
> >>>>>>    factor fill ratio given 0, needed 0
> >>>>>>      Factored matrix follows:
> >>>>>>        Mat Object:         4 MPI processes
> >>>>>>          type: mpiaij
> >>>>>>          rows=973051, cols=973051
> >>>>>>          package used to perform factorization: pastix
> >>>>>>  Error :                        3.24786e-14
> >>>>>>          total: nonzeros=0, allocated nonzeros=0
> >>>>>>          total number of mallocs used during MatSetValues calls =0
> >>>>>>            PaStiX run parameters:
> >>>>>>              Matrix type :                      Unsymmetric
> >>>>>>              Level of printing (0,1,2):         0
> >>>>>>              Number of refinements iterations : 3
> >>>>>>  Error :                        3.24786e-14
> >>>>>>  linear system matrix = precond matrix:
> >>>>>>  Mat Object:   4 MPI processes
> >>>>>>    type: mpiaij
> >>>>>>    rows=973051, cols=973051
> >>>>>>  Error :                        3.24786e-14
> >>>>>>    total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07
> >>>>>>    total number of mallocs used during MatSetValues calls =0
> >>>>>>      using I-node (on process 0) routines: found 78749 nodes, limit
> >>> used is 5
> >>>>>>  Error :                        3.24786e-14
> >>>>>>
> >>>>>> It doesn't do as you said. Something is not right here. I will look
> >>>>>> into it in depth.
> >>>>>>
> >>>>>> Giang
> >>>>>>
> >>>>>> On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith <bsm...@mcs.anl.gov>
> >>> wrote:
> >>>>>>
> >>>>>>> On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui <hgbk2...@gmail.com>
> >>> wrote:
> >>>>>>>
> >>>>>>> Good catch. I get this for the very first step, maybe at that time
> >>> the rhs_w is zero.
> >>>>>>
> >>>>>>    With the multiplicative composition the right hand side of the
> >>>>>> second solve is the initial right hand side of the second solve minus
> >>>>>> A_10*x, where x is the solution to the first sub solve and A_10 is the
> >>>>>> lower left block of the outer matrix. So unless both the initial right
> >>>>>> hand side has a zero for the second block and A_10 is identically zero,
> >>>>>> the right hand side for the second sub solve should not be zero. Is
> >>>>>> A_10 == 0?
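
The multiplicative update described here can be illustrated on a tiny dense system. This is only a sketch in plain Python, not PETSc code. The block values, `solve2`, `matvec` and `second_rhs` are hypothetical and chosen for illustration, with an exact 2x2 solve standing in for LU on each block.

```python
def matvec(A, x):
    """Dense matrix-vector product for small lists of lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def solve2(A, b):
    """Solve a 2x2 system exactly by Cramer's rule (stand-in for LU)."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def second_rhs(A_10, b_1, x_0):
    """Right hand side seen by the second sub solve: b_1 - A_10 * x_0."""
    return [bi - ci for bi, ci in zip(b_1, matvec(A_10, x_0))]

# Hypothetical blocks of the outer matrix [[A_00, A_01], [A_10, A_11]].
A_00 = [[4.0, 1.0], [1.0, 3.0]]
A_10 = [[1.0, 0.0], [0.0, 2.0]]
A_11 = [[2.0, 0.0], [0.0, 5.0]]

b_0 = [1.0, 2.0]   # right hand side for the first split
b_1 = [0.0, 0.0]   # zero right hand side for the second split

# First sub solve: A_00 x_0 = b_0.
x_0 = solve2(A_00, b_0)

# Second sub solve uses the updated right hand side.
rhs_1 = second_rhs(A_10, b_1, x_0)
x_1 = solve2(A_11, rhs_1)

# With A_10 identically zero the second right hand side stays zero.
rhs_1_decoupled = second_rhs([[0.0, 0.0], [0.0, 0.0]], b_1, x_0)

print("rhs of second sub solve:", rhs_1)
print("same with A_10 == 0:    ", rhs_1_decoupled)
```

Even though b_1 is zero, the second sub solve starts from a nonzero right hand side as long as A_10 couples the two blocks. A zero initial residual in that sub solve therefore points to A_10 being zero, or to a zero first solution.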
> >>>>>>
> >>>>>>
> >>>>>>> In the later step, it shows 2 step convergence
> >>>>>>>
> >>>>>>> Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 3.165886479830e+04
> >>>>>>>    1 KSP Residual norm 2.905922877684e-01
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 2.397669419027e-01
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  0 KSP preconditioned resid norm 3.165886479920e+04 true resid
> >>> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 9.999891813771e-01
> >>>>>>>    1 KSP Residual norm 1.512000395579e-05
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 8.192702188243e-06
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  1 KSP preconditioned resid norm 5.252183822848e-02 true resid
> >>> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02
> >>>>>>
> >>>>>>    The outer residual norms are still wonky: the preconditioned
> >>>>>> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02, which
> >>>>>> is a huge drop, but the true residual norm drops far less, from
> >>>>>> 7.963616922323e+05 to 7.135927677844e+04. This is not normal.
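
To put numbers on that mismatch, the two reductions can be computed directly from the quoted log. This is a small sketch in Python, and the variable names are mine.

```python
# Outer residual norms at iterations 0 and 1, copied from the quoted log.
prec_0, prec_1 = 3.165886479920e+04, 5.252183822848e-02
true_0, true_1 = 7.963616922323e+05, 7.135927677844e+04

prec_reduction = prec_1 / prec_0   # reduction of the preconditioned residual
true_reduction = true_1 / true_0   # reduction of the true residual

print(f"preconditioned reduction: {prec_reduction:.3e}")
print(f"true reduction:           {true_reduction:.3e}")
```

The preconditioned norm falls by nearly six orders of magnitude in one iteration, while the true norm falls by only about one. The second ratio agrees with the ||r(i)||/||b|| column of the log.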
> >>>>>>
> >>>>>>   What if you just use -pc_type lu for the entire system (no
> >>>>>> fieldsplit)? Does the true residual drop to almost zero in the first
> >>>>>> iteration (as it should)? Send the output.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 6.946213936597e-01
> >>>>>>>    1 KSP Residual norm 1.195514007343e-05
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 1.025694497535e+00
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  2 KSP preconditioned resid norm 8.785709535405e-03 true resid
> >>> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 7.255149996405e-01
> >>>>>>>    1 KSP Residual norm 6.583512434218e-06
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 1.015229700337e+00
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  3 KSP preconditioned resid norm 7.110407712709e-04 true resid
> >>> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 3.512243341400e-01
> >>>>>>>    1 KSP Residual norm 2.032490351200e-06
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 1.282327290982e+00
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  4 KSP preconditioned resid norm 3.482036620521e-05 true resid
> >>> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 3.423609338053e-01
> >>>>>>>    1 KSP Residual norm 4.213703301972e-07
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 1.157384757538e+00
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  5 KSP preconditioned resid norm 1.203470314534e-06 true resid
> >>> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 3.838596289995e-01
> >>>>>>>    1 KSP Residual norm 9.927864176103e-08
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 1.066298905618e+00
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  6 KSP preconditioned resid norm 3.331619244266e-08 true resid
> >>> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06
> >>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>    0 KSP Residual norm 4.624964188094e-01
> >>>>>>>    1 KSP Residual norm 6.418229775372e-08
> >>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>    0 KSP Residual norm 9.800784311614e-01
> >>>>>>>    1 KSP Residual norm 0.000000000000e+00
> >>>>>>>  7 KSP preconditioned resid norm 8.788046233297e-10 true resid
> >>> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06
> >>>>>>> Linear solve converged due to CONVERGED_ATOL iterations 7
> >>>>>>>
> >>>>>>> The outer operator is an explicit matrix.
> >>>>>>>
> >>>>>>> Giang
> >>>>>>>
> >>>>>>> On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith <bsm...@mcs.anl.gov>
> >>> wrote:
> >>>>>>>
> >>>>>>>> On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui <hgbk2...@gmail.com>
> >>> wrote:
> >>>>>>>>
> >>>>>>>> Thanks Barry, trying with -fieldsplit_u_pc_type lu gives better
> >>>>>>>> convergence. I still used 4 procs though; with 1 proc it should
> >>>>>>>> probably be the same.
> >>>>>>>>
> >>>>>>>> The u block uses a Nitsche-type operator to connect two
> >>>>>>>> non-matching domains. I don't think it leaves any rigid body motion
> >>>>>>>> that would lead to insufficient constraints. Maybe you have another
> >>>>>>>> idea?
> >>>>>>>>
> >>>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>>    0 KSP Residual norm 3.129067184300e+05
> >>>>>>>>    1 KSP Residual norm 5.906261468196e-01
> >>>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>>>
> >>>>>>>    ^^^^ something is wrong here. The sub solve should not be
> >>> starting with a 0 residual (this means the right hand side for this sub
> >>> solve is zero which it should not be).
> >>>>>>>
> >>>>>>>> FieldSplit with MULTIPLICATIVE composition: total splits = 2
> >>>>>>>
> >>>>>>>
> >>>>>>>   How are you providing the outer operator? As an explicit matrix
> >>> or with some shell matrix?
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>  0 KSP preconditioned resid norm 3.129067184300e+05 true resid
> >>> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>>    0 KSP Residual norm 9.999955993437e-01
> >>>>>>>>    1 KSP Residual norm 4.019774691831e-06
> >>>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>>>>  1 KSP preconditioned resid norm 5.003913641475e-01 true resid
> >>> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06
> >>>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>>    0 KSP Residual norm 1.000012180204e+00
> >>>>>>>>    1 KSP Residual norm 1.017367950422e-05
> >>>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>>>>  2 KSP preconditioned resid norm 2.330910333756e-07 true resid
> >>> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06
> >>>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>>    0 KSP Residual norm 1.000004200085e+00
> >>>>>>>>    1 KSP Residual norm 6.231613102458e-06
> >>>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>>>>  3 KSP preconditioned resid norm 8.671259838389e-11 true resid
> >>> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06
> >>>>>>>> Linear solve converged due to CONVERGED_ATOL iterations 3
> >>>>>>>> KSP Object: 4 MPI processes
> >>>>>>>>  type: gmres
> >>>>>>>>    GMRES: restart=1000, using Modified Gram-Schmidt
> >>> Orthogonalization
> >>>>>>>>    GMRES: happy breakdown tolerance 1e-30
> >>>>>>>>  maximum iterations=1000, initial guess is zero
> >>>>>>>>  tolerances:  relative=1e-20, absolute=1e-09, divergence=10000
> >>>>>>>>  left preconditioning
> >>>>>>>>  using PRECONDITIONED norm type for convergence test
> >>>>>>>> PC Object: 4 MPI processes
> >>>>>>>>  type: fieldsplit
> >>>>>>>>    FieldSplit with MULTIPLICATIVE composition: total splits = 2
> >>>>>>>>    Solver info for each split is in the following KSP objects:
> >>>>>>>>    Split number 0 Defined by IS
> >>>>>>>>    KSP Object:    (fieldsplit_u_)     4 MPI processes
> >>>>>>>>      type: richardson
> >>>>>>>>        Richardson: damping factor=1
> >>>>>>>>      maximum iterations=1, initial guess is zero
> >>>>>>>>      tolerances:  relative=1e-05, absolute=1e-50,
> >>> divergence=10000
> >>>>>>>>      left preconditioning
> >>>>>>>>      using PRECONDITIONED norm type for convergence test
> >>>>>>>>    PC Object:    (fieldsplit_u_)     4 MPI processes
> >>>>>>>>      type: lu
> >>>>>>>>        LU: out-of-place factorization
> >>>>>>>>        tolerance for zero pivot 2.22045e-14
> >>>>>>>>        matrix ordering: natural
> >>>>>>>>        factor fill ratio given 0, needed 0
> >>>>>>>>          Factored matrix follows:
> >>>>>>>>            Mat Object:             4 MPI processes
> >>>>>>>>              type: mpiaij
> >>>>>>>>              rows=938910, cols=938910
> >>>>>>>>              package used to perform factorization: pastix
> >>>>>>>>              total: nonzeros=0, allocated nonzeros=0
> >>>>>>>>      Error :                        3.36878e-14
> >>>>>>>>          total number of mallocs used during MatSetValues calls
> >>> =0
> >>>>>>>>                PaStiX run parameters:
> >>>>>>>>                  Matrix type :                      Unsymmetric
> >>>>>>>>                  Level of printing (0,1,2):         0
> >>>>>>>>                  Number of refinements iterations : 3
> >>>>>>>>  Error :                        3.36878e-14
> >>>>>>>>      linear system matrix = precond matrix:
> >>>>>>>>      Mat Object:      (fieldsplit_u_)       4 MPI processes
> >>>>>>>>        type: mpiaij
> >>>>>>>>        rows=938910, cols=938910, bs=3
> >>>>>>>>  Error :                        3.36878e-14
> >>>>>>>>  Error :                        3.36878e-14
> >>>>>>>>        total: nonzeros=8.60906e+07, allocated
> >>> nonzeros=8.60906e+07
> >>>>>>>>        total number of mallocs used during MatSetValues calls =0
> >>>>>>>>          using I-node (on process 0) routines: found 78749
> >>> nodes, limit used is 5
> >>>>>>>>    Split number 1 Defined by IS
> >>>>>>>>    KSP Object:    (fieldsplit_wp_)     4 MPI processes
> >>>>>>>>      type: richardson
> >>>>>>>>        Richardson: damping factor=1
> >>>>>>>>      maximum iterations=1, initial guess is zero
> >>>>>>>>      tolerances:  relative=1e-05, absolute=1e-50,
> >>> divergence=10000
> >>>>>>>>      left preconditioning
> >>>>>>>>      using PRECONDITIONED norm type for convergence test
> >>>>>>>>    PC Object:    (fieldsplit_wp_)     4 MPI processes
> >>>>>>>>      type: lu
> >>>>>>>>        LU: out-of-place factorization
> >>>>>>>>        tolerance for zero pivot 2.22045e-14
> >>>>>>>>        matrix ordering: natural
> >>>>>>>>        factor fill ratio given 0, needed 0
> >>>>>>>>          Factored matrix follows:
> >>>>>>>>            Mat Object:             4 MPI processes
> >>>>>>>>              type: mpiaij
> >>>>>>>>              rows=34141, cols=34141
> >>>>>>>>              package used to perform factorization: pastix
> >>>>>>>>                Error :                        -nan
> >>>>>>>>  Error :                        -nan
> >>>>>>>>  Error :                        -nan
> >>>>>>>> total: nonzeros=0, allocated nonzeros=0
> >>>>>>>>              total number of mallocs used during MatSetValues
> >>> calls =0
> >>>>>>>>                PaStiX run parameters:
> >>>>>>>>                  Matrix type :                      Symmetric
> >>>>>>>>                  Level of printing (0,1,2):         0
> >>>>>>>>                  Number of refinements iterations : 0
> >>>>>>>>  Error :                        -nan
> >>>>>>>>      linear system matrix = precond matrix:
> >>>>>>>>      Mat Object:      (fieldsplit_wp_)       4 MPI processes
> >>>>>>>>        type: mpiaij
> >>>>>>>>        rows=34141, cols=34141
> >>>>>>>>        total: nonzeros=485655, allocated nonzeros=485655
> >>>>>>>>        total number of mallocs used during MatSetValues calls =0
> >>>>>>>>          not using I-node (on process 0) routines
> >>>>>>>>  linear system matrix = precond matrix:
> >>>>>>>>  Mat Object:   4 MPI processes
> >>>>>>>>    type: mpiaij
> >>>>>>>>    rows=973051, cols=973051
> >>>>>>>>    total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07
> >>>>>>>>    total number of mallocs used during MatSetValues calls =0
> >>>>>>>>      using I-node (on process 0) routines: found 78749 nodes,
> >>> limit used is 5
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Giang
> >>>>>>>>
> >>>>>>>> On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith <
> >>> bsm...@mcs.anl.gov> wrote:
> >>>>>>>>
> >>>>>>>>> On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui <
> >>> hgbk2...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Dear Matt/Barry
> >>>>>>>>>
> >>>>>>>>> With your options, it results in
> >>>>>>>>>
> >>>>>>>>>  0 KSP preconditioned resid norm 1.106709687386e+31 true
> >>> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>>>    0 KSP Residual norm 2.407308987203e+36
> >>>>>>>>>    1 KSP Residual norm 5.797185652683e+72
> >>>>>>>>
> >>>>>>>> It looks like Matt is right, hypre is seemingly producing useless
> >>>>>>>> garbage.
> >>>>>>>>
> >>>>>>>> First, how do things run on one process? If you have similar
> >>>>>>>> problems then debug on one process (debugging any kind of problem is
> >>>>>>>> always far easier on one process).
> >>>>>>>>
> >>>>>>>> Then run with -fieldsplit_u_pc_type lu (instead of using hypre) to
> >>>>>>>> see if that works or also produces something bad.
> >>>>>>>>
> >>>>>>>> What are the operator and the boundary conditions for u? It could
> >>>>>>>> be singular.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>>>>> ...
> >>>>>>>>> 999 KSP preconditioned resid norm 2.920157329174e+12 true
> >>> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00
> >>>>>>>>>    Residual norms for fieldsplit_u_ solve.
> >>>>>>>>>    0 KSP Residual norm 1.533726746719e+36
> >>>>>>>>>    1 KSP Residual norm 3.692757392261e+72
> >>>>>>>>>    Residual norms for fieldsplit_wp_ solve.
> >>>>>>>>>    0 KSP Residual norm 0.000000000000e+00
> >>>>>>>>>
> >>>>>>>>> Do you suggest that the pastix solver for the "wp" block
> >>>>>>>>> encounters a small pivot? In addition, it seems like the "u" block
> >>>>>>>>> is also singular.
> >>>>>>>>>
> >>>>>>>>> Giang
> >>>>>>>>>
> >>>>>>>>> On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith <
> >>> bsm...@mcs.anl.gov> wrote:
> >>>>>>>>>
> >>>>>>>>>   Huge preconditioned norms but normal unpreconditioned norms
> >>>>>>>>> almost always come from a very small pivot in an LU or ILU
> >>>>>>>>> factorization.
> >>>>>>>>>
> >>>>>>>>>   The first thing to do is monitor the two sub solves. Run
> >>> with the additional options -fieldsplit_u_ksp_type richardson
> >>> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1
> >>> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor
> >>> -fieldsplit_wp_ksp_max_it 1
> >>>>>>>>>
> >>>>>>>>>> On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui <
> >>> hgbk2...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hello
> >>>>>>>>>>
> >>>>>>>>>> I encountered a strange convergence behavior that I have
> >>>>>>>>>> trouble understanding
> >>>>>>>>>>
> >>>>>>>>>> KSPSetFromOptions completed
> >>>>>>>>>>  0 KSP preconditioned resid norm 1.106709687386e+31 true resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00
> >>>>>>>>>>  1 KSP preconditioned resid norm 2.933141742664e+29 true resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00
> >>>>>>>>>>  2 KSP preconditioned resid norm 9.686409637174e+16 true resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00
> >>>>>>>>>>  3 KSP preconditioned resid norm 4.219243615809e+15 true resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00
> >>>>>>>>>> .....
> >>>>>>>>>> 999 KSP preconditioned resid norm 3.043754298076e+12 true resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00
> >>>>>>>>>> 1000 KSP preconditioned resid norm 3.043000287819e+12 true resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00
> >>>>>>>>>> Linear solve did not converge due to DIVERGED_ITS iterations 1000
> >>>>>>>>>> KSP Object: 4 MPI processes
> >>>>>>>>>>  type: gmres
> >>>>>>>>>>    GMRES: restart=1000, using Modified Gram-Schmidt Orthogonalization
> >>>>>>>>>>    GMRES: happy breakdown tolerance 1e-30
> >>>>>>>>>>  maximum iterations=1000, initial guess is zero
> >>>>>>>>>>  tolerances:  relative=1e-20, absolute=1e-09, divergence=10000
> >>>>>>>>>>  left preconditioning
> >>>>>>>>>>  using PRECONDITIONED norm type for convergence test
> >>>>>>>>>> PC Object: 4 MPI processes
> >>>>>>>>>>  type: fieldsplit
> >>>>>>>>>>    FieldSplit with MULTIPLICATIVE composition: total splits = 2
> >>>>>>>>>>    Solver info for each split is in the following KSP objects:
> >>>>>>>>>>    Split number 0 Defined by IS
> >>>>>>>>>>    KSP Object:    (fieldsplit_u_)     4 MPI processes
> >>>>>>>>>>      type: preonly
> >>>>>>>>>>      maximum iterations=10000, initial guess is zero
> >>>>>>>>>>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
> >>>>>>>>>>      left preconditioning
> >>>>>>>>>>      using NONE norm type for convergence test
> >>>>>>>>>>    PC Object:    (fieldsplit_u_)     4 MPI processes
> >>>>>>>>>>      type: hypre
> >>>>>>>>>>        HYPRE BoomerAMG preconditioning
> >>>>>>>>>>        HYPRE BoomerAMG: Cycle type V
> >>>>>>>>>>        HYPRE BoomerAMG: Maximum number of levels 25
> >>>>>>>>>>        HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
> >>>>>>>>>>        HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
> >>>>>>>>>>        HYPRE BoomerAMG: Threshold for strong coupling 0.6
> >>>>>>>>>>        HYPRE BoomerAMG: Interpolation truncation factor 0
> >>>>>>>>>>        HYPRE BoomerAMG: Interpolation: max elements per row 0
> >>>>>>>>>>        HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
> >>>>>>>>>>        HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
> >>>>>>>>>>        HYPRE BoomerAMG: Maximum row sums 0.9
> >>>>>>>>>>        HYPRE BoomerAMG: Sweeps down         1
> >>>>>>>>>>        HYPRE BoomerAMG: Sweeps up           1
> >>>>>>>>>>        HYPRE BoomerAMG: Sweeps on coarse    1
> >>>>>>>>>>        HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
> >>>>>>>>>>        HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
> >>>>>>>>>>        HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
> >>>>>>>>>>        HYPRE BoomerAMG: Relax weight  (all)      1
> >>>>>>>>>>        HYPRE BoomerAMG: Outer relax weight (all) 1
> >>>>>>>>>>        HYPRE BoomerAMG: Using CF-relaxation
> >>>>>>>>>>        HYPRE BoomerAMG: Measure type        local
> >>>>>>>>>>        HYPRE BoomerAMG: Coarsen type        PMIS
> >>>>>>>>>>        HYPRE BoomerAMG: Interpolation type  classical
> >>>>>>>>>>      linear system matrix = precond matrix:
> >>>>>>>>>>      Mat Object:      (fieldsplit_u_)       4 MPI processes
> >>>>>>>>>>        type: mpiaij
> >>>>>>>>>>        rows=938910, cols=938910, bs=3
> >>>>>>>>>>        total: nonzeros=8.60906e+07, allocated nonzeros=8.60906e+07
> >>>>>>>>>>        total number of mallocs used during MatSetValues calls =0
> >>>>>>>>>>          using I-node (on process 0) routines: found 78749 nodes, limit used is 5
> >>>>>>>>>>    Split number 1 Defined by IS
> >>>>>>>>>>    KSP Object:    (fieldsplit_wp_)     4 MPI processes
> >>>>>>>>>>      type: preonly
> >>>>>>>>>>      maximum iterations=10000, initial guess is zero
> >>>>>>>>>>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
> >>>>>>>>>>      left preconditioning
> >>>>>>>>>>      using NONE norm type for convergence test
> >>>>>>>>>>    PC Object:    (fieldsplit_wp_)     4 MPI processes
> >>>>>>>>>>      type: lu
> >>>>>>>>>>        LU: out-of-place factorization
> >>>>>>>>>>        tolerance for zero pivot 2.22045e-14
> >>>>>>>>>>        matrix ordering: natural
> >>>>>>>>>>        factor fill ratio given 0, needed 0
> >>>>>>>>>>          Factored matrix follows:
> >>>>>>>>>>            Mat Object:             4 MPI processes
> >>>>>>>>>>              type: mpiaij
> >>>>>>>>>>              rows=34141, cols=34141
> >>>>>>>>>>              package used to perform factorization: pastix
> >>>>>>>>>>            Error :                        -nan
> >>>>>>>>>>  Error :                        -nan
> >>>>>>>>>>    total: nonzeros=0, allocated nonzeros=0
> >>>>>>>>>>            Error :                        -nan
> >>>>>>>>>>    total number of mallocs used during MatSetValues calls =0
> >>>>>>>>>>                PaStiX run parameters:
> >>>>>>>>>>                  Matrix type :                      Symmetric
> >>>>>>>>>>                  Level of printing (0,1,2):         0
> >>>>>>>>>>                  Number of refinements iterations : 0
> >>>>>>>>>>  Error :                        -nan
> >>>>>>>>>>      linear system matrix = precond matrix:
> >>>>>>>>>>      Mat Object:      (fieldsplit_wp_)       4 MPI processes
> >>>>>>>>>>        type: mpiaij
> >>>>>>>>>>        rows=34141, cols=34141
> >>>>>>>>>>        total: nonzeros=485655, allocated nonzeros=485655
> >>>>>>>>>>        total number of mallocs used during MatSetValues calls =0
> >>>>>>>>>>          not using I-node (on process 0) routines
> >>>>>>>>>>  linear system matrix = precond matrix:
> >>>>>>>>>>  Mat Object:   4 MPI processes
> >>>>>>>>>>    type: mpiaij
> >>>>>>>>>>    rows=973051, cols=973051
> >>>>>>>>>>    total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07
> >>>>>>>>>>    total number of mallocs used during MatSetValues calls =0
> >>>>>>>>>>      using I-node (on process 0) routines: found 78749 nodes, limit used is 5
> >>>>>>>>>>
> >>>>>>>>>> The convergence pattern hints that this system is somehow
> >>>>>>>>>> bad/singular, but I don't know why the preconditioned residual
> >>>>>>>>>> norm gets so large. Does anyone have an idea?
> >>>>>>>>>>
> >>>>>>>>>> Best regards
> >>>>>>>>>> Giang Bui
> >>>>>>>>>>
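
One way to quantify the gap between the two norms in the monitor output quoted in the message above is to parse the lines and print the ratio of preconditioned to true residual norm per iteration. The sketch below is only an illustration: the regular expression, the `parse_monitor` helper, and the assumption that the quote prefixes have been stripped and the wrapped lines rejoined are mine, not part of PETSc. The three sample lines are copied from the log above.

```python
import re

# Matches one line of the monitor output shown in the thread, after the
# mail quote prefix has been removed and wrapped lines have been rejoined.
MONITOR_RE = re.compile(
    r"^\s*(\d+) KSP preconditioned resid norm (\S+) "
    r"true resid norm (\S+) \|\|r\(i\)\|\|/\|\|b\|\| (\S+)\s*$"
)

def parse_monitor(lines):
    """Return (iteration, preconditioned norm, true norm, relative norm) tuples."""
    rows = []
    for line in lines:
        m = MONITOR_RE.match(line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2)),
                         float(m.group(3)), float(m.group(4))))
    return rows

# Sample lines taken from the log in this thread.
log = [
    "  0 KSP preconditioned resid norm 1.106709687386e+31 true resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00",
    "  1 KSP preconditioned resid norm 2.933141742664e+29 true resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00",
    "1000 KSP preconditioned resid norm 3.043000287819e+12 true resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00",
]

rows = parse_monitor(log)
for it, prec_norm, true_norm, rel in rows:
    print(f"{it:5d}  prec/true = {prec_norm / true_norm:.3e}  ||r||/||b|| = {rel:.9f}")
```

A ratio that starts many orders of magnitude above one while ||r(i)||/||b|| stays near 1 shows that the preconditioner application, rather than the outer Krylov iteration, is where the scaling goes wrong.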