Thanks Barry Running with that option gives the output for the first solve:
BoomerAMG SETUP PARAMETERS: Max levels = 25 Num levels = 7 Strength Threshold = 0.100000 Interpolation Truncation Factor = 0.000000 Maximum Row Sum Threshold for Dependency Weakening = 0.900000 Coarsening Type = PMIS measures are determined locally No global partition option chosen. Interpolation = modified classical interpolation Operator Matrix Information: nonzero entries per row row sums lev rows entries sparse min max avg min max =================================================================== 0 1056957 109424691 0.000 30 1617 103.5 -2.075e+11 3.561e+11 1 185483 33504881 0.001 17 713 180.6 -3.493e+11 1.323e+13 2 26295 4691629 0.007 17 513 178.4 -3.367e+10 6.960e+12 3 3438 432138 0.037 24 295 125.7 -2.194e+10 2.154e+11 4 476 34182 0.151 8 192 71.8 -6.435e+09 2.306e+11 5 84 2410 0.342 8 70 28.7 -1.052e+07 6.640e+10 6 18 252 0.778 10 18 14.0 9.038e+06 8.828e+10 Interpolation Matrix Information: entries/row min max row sums lev rows cols min max weight weight min max ================================================================= 0 1056957 x 185483 0 18 -1.143e+02 7.741e+01 -1.143e+02 7.741e+01 1 185483 x 26295 0 15 -1.053e+01 2.918e+00 -1.053e+01 2.918e+00 2 26295 x 3438 0 9 1.308e-02 1.036e+00 0.000e+00 1.058e+00 3 3438 x 476 0 7 1.782e-02 1.015e+00 0.000e+00 1.015e+00 4 476 x 84 0 5 1.378e-02 1.000e+00 0.000e+00 1.000e+00 5 84 x 18 0 3 1.330e-02 1.000e+00 0.000e+00 1.000e+00 Complexity: grid = 1.204165 operator = 1.353353 memory = 1.381360 BoomerAMG SOLVER PARAMETERS: Maximum number of cycles: 1 Stopping Tolerance: 0.000000e+00 Cycle type (1 = V, 2 = W, etc.): 1 Relaxation Parameters: Visiting Grid: down up coarse Number of sweeps: 1 1 1 Type 0=Jac, 3=hGS, 6=hSGS, 9=GE: 6 6 6 Point types, partial sweeps (1=C, -1=F): Pre-CG relaxation (down): 1 -1 Post-CG relaxation (up): -1 1 Coarsest grid: 0 Output flag (print_level): 3 relative residual factor residual -------- ------ -------- Initial 9.006493e+06 1.000000e+00 Cycle 1 7.994266e+06 0.887611 <08876%2011> 8.876114e-01 Average Convergence Factor = 0.887611 <08876%2011> Complexity: grid = 1.204165 operator = 1.353353 cycle = 2.706703 KSP Object:(fieldsplit_u_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object:(fieldsplit_u_) 8 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.1 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type PMIS HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_u_) 8 MPI processes type: mpiaij rows=1056957, cols=1056957, bs=3 total: nonzeros=1.09425e+08, allocated nonzeros=1.09425e+08 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 43537 nodes, limit used is 5 0 KSP preconditioned resid norm 4.076033642262e+00 true resid norm 9.006493083033e+06 ||r(i)||/||b|| 1.000000000000e+00 Giang On Sat, Apr 29, 2017 at 8:06 PM, Barry Smith <bsm...@mcs.anl.gov> wrote: > > > On Apr 29, 2017, at 8:34 AM, Jed Brown <j...@jedbrown.org> wrote: > > > > Hoang Giang Bui <hgbk2...@gmail.com> writes: > > > >> Hi Barry > >> > >> The first block is from a standard solid mechanics discretization based > on > >> balance of momentum equation. There is some material involved but in > >> principal it's well-posed elasticity equation with positive definite > >> tangent operator. The "gluing business" uses the mortar method to keep > the > >> continuity of displacement. Instead of using Lagrange multiplier to > treat > >> the constraint I used penalty method to penalize the energy. The > >> discretization form of mortar is quite simple > >> > >> \int_{\Gamma_1} { rho * (\delta u_1 - \delta u_2) * (u_1 - u_2) dA } > >> > >> rho is penalty parameter. In the simulation I initially set it low (~E) > to > >> preserve the conditioning of the system. > > > > There are two things that can go wrong here with AMG: > > > > * The penalty term can mess up the strength of connection heuristics > > such that you get poor choice of C-points (classical AMG like > > BoomerAMG) or poor choice of aggregates (smoothed aggregation). > > > > * The penalty term can prevent Jacobi smoothing from being effective; in > > this case, it can lead to poor coarse basis functions (higher energy > > than they should be) and poor smoothing in an MG cycle. You can fix > > the poor smoothing in the MG cycle by using a stronger smoother, like > > ASM with some overlap. > > > > I'm generally not a fan of penalty methods due to the irritating > > tradeoffs and often poor solver performance. > > So, let's first see what hypre BoomerAMG is doing with the system. Run > for just one BoomerAMG solve with the additional options > > -fieldsplit_u_ksp_view -fieldsplit_u_pc_hypre_boomeramg_print_statistics > > this should print a good amount of information of what BoomerAMG has > decided to do based on the input matrix. > > I'm bringing the hypre team into the conversation since they obviously > know far more about BoomerAMG tuning options that may help your case. > > Barry > > > > > > >> In the figure below, the colorful blocks are u_1 and the base is u_2. > Both > >> u_1 and u_2 use isoparametric quadratic approximation. > >> > >> > >> Snapshot.png > >> <https://drive.google.com/file/d/0Bw8Hmu0-YGQXc2hKQ1BhQ1I4OE > U/view?usp=drive_web> > >> > >> > >> Giang > >> > >> On Fri, Apr 28, 2017 at 6:21 PM, Barry Smith <bsm...@mcs.anl.gov> > wrote: > >> > >>> > >>> Ok, so boomerAMG algebraic multigrid is not good for the first block. > >>> You mentioned the first block has two things glued together? AMG is > >>> fantastic for certain problems but doesn't work for everything. > >>> > >>> Tell us more about the first block, what PDE it comes from, what > >>> discretization, and what the "gluing business" is and maybe we'll have > >>> suggestions for how to precondition it. > >>> > >>> Barry > >>> > >>>> On Apr 28, 2017, at 3:56 AM, Hoang Giang Bui <hgbk2...@gmail.com> > wrote: > >>>> > >>>> It's in fact quite good > >>>> > >>>> Residual norms for fieldsplit_u_ solve. > >>>> 0 KSP Residual norm 4.014715925568e+00 > >>>> 1 KSP Residual norm 2.160497019264e-10 > >>>> Residual norms for fieldsplit_wp_ solve. > >>>> 0 KSP Residual norm 0.000000000000e+00 > >>>> 0 KSP preconditioned resid norm 4.014715925568e+00 true resid norm > >>> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>> Residual norms for fieldsplit_u_ solve. > >>>> 0 KSP Residual norm 9.999999999416e-01 > >>>> 1 KSP Residual norm 7.118380416383e-11 > >>>> Residual norms for fieldsplit_wp_ solve. > >>>> 0 KSP Residual norm 0.000000000000e+00 > >>>> 1 KSP preconditioned resid norm 1.701150951035e-10 true resid norm > >>> 5.494262251846e-04 ||r(i)||/||b|| 6.100334726599e-11 > >>>> Linear solve converged due to CONVERGED_ATOL iterations 1 > >>>> > >>>> Giang > >>>> > >>>> On Thu, Apr 27, 2017 at 5:25 PM, Barry Smith <bsm...@mcs.anl.gov> > wrote: > >>>> > >>>> Run again using LU on both blocks to see what happens. > >>>> > >>>> > >>>>> On Apr 27, 2017, at 2:14 AM, Hoang Giang Bui <hgbk2...@gmail.com> > >>> wrote: > >>>>> > >>>>> I have changed the way to tie the nonconforming mesh. It seems the > >>> matrix now is better > >>>>> > >>>>> with -pc_type lu the output is > >>>>> 0 KSP preconditioned resid norm 3.308678584240e-01 true resid norm > >>> 9.006493082896e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>> 1 KSP preconditioned resid norm 2.004313395301e-12 true resid norm > >>> 2.549872332830e-05 ||r(i)||/||b|| 2.831148938173e-12 > >>>>> Linear solve converged due to CONVERGED_ATOL iterations 1 > >>>>> > >>>>> > >>>>> with -pc_type fieldsplit -fieldsplit_u_pc_type hypre > >>> -fieldsplit_wp_pc_type lu the convergence is slow > >>>>> 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm > >>> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>> 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm > >>> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 > >>>>> ... > >>>>> 824 KSP preconditioned resid norm 1.018542387738e-09 true resid norm > >>> 2.906608839310e+02 ||r(i)||/||b|| 3.227237074804e-05 > >>>>> 825 KSP preconditioned resid norm 9.743727947637e-10 true resid norm > >>> 2.820369993061e+02 ||r(i)||/||b|| 3.131485215062e-05 > >>>>> Linear solve converged due to CONVERGED_ATOL iterations 825 > >>>>> > >>>>> checking with additional -fieldsplit_u_ksp_type richardson > >>> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 > >>> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor > >>> -fieldsplit_wp_ksp_max_it 1 gives > >>>>> > >>>>> 0 KSP preconditioned resid norm 1.116302362553e-01 true resid norm > >>> 9.006493083520e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>> Residual norms for fieldsplit_u_ solve. > >>>>> 0 KSP Residual norm 5.803507549280e-01 > >>>>> 1 KSP Residual norm 2.069538175950e-01 > >>>>> Residual norms for fieldsplit_wp_ solve. > >>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>> 1 KSP preconditioned resid norm 2.582134825666e-02 true resid norm > >>> 9.268347719866e+06 ||r(i)||/||b|| 1.029073984060e+00 > >>>>> Residual norms for fieldsplit_u_ solve. > >>>>> 0 KSP Residual norm 7.831796195225e-01 > >>>>> 1 KSP Residual norm 1.734608520110e-01 > >>>>> Residual norms for fieldsplit_wp_ solve. > >>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>> .... > >>>>> 823 KSP preconditioned resid norm 1.065070135605e-09 true resid norm > >>> 3.081881356833e+02 ||r(i)||/||b|| 3.421843916665e-05 > >>>>> Residual norms for fieldsplit_u_ solve. > >>>>> 0 KSP Residual norm 6.113806394327e-01 > >>>>> 1 KSP Residual norm 1.535465290944e-01 > >>>>> Residual norms for fieldsplit_wp_ solve. > >>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>> 824 KSP preconditioned resid norm 1.018542387746e-09 true resid norm > >>> 2.906608839353e+02 ||r(i)||/||b|| 3.227237074851e-05 > >>>>> Residual norms for fieldsplit_u_ solve. > >>>>> 0 KSP Residual norm 6.123437055586e-01 > >>>>> 1 KSP Residual norm 1.524661826133e-01 > >>>>> Residual norms for fieldsplit_wp_ solve. > >>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>> 825 KSP preconditioned resid norm 9.743727947718e-10 true resid norm > >>> 2.820369990571e+02 ||r(i)||/||b|| 3.131485212298e-05 > >>>>> Linear solve converged due to CONVERGED_ATOL iterations 825 > >>>>> > >>>>> > >>>>> The residual for wp block is zero since in this first step the rhs is > >>> zero. As can see in the output, the multigrid does not perform well to > >>> reduce the residual in the sub-solve. Is my observation right? what > can be > >>> done to improve this? > >>>>> > >>>>> > >>>>> Giang > >>>>> > >>>>> On Tue, Apr 25, 2017 at 12:17 AM, Barry Smith <bsm...@mcs.anl.gov> > >>> wrote: > >>>>> > >>>>> This can happen in the matrix is singular or nearly singular or if > >>> the factorization generates small pivots, which can occur for even > >>> nonsingular problems if the matrix is poorly scaled or just plain > nasty. > >>>>> > >>>>> > >>>>>> On Apr 24, 2017, at 5:10 PM, Hoang Giang Bui <hgbk2...@gmail.com> > >>> wrote: > >>>>>> > >>>>>> It took a while, here I send you the output > >>>>>> > >>>>>> 0 KSP preconditioned resid norm 3.129073545457e+05 true resid norm > >>> 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>> 1 KSP preconditioned resid norm 7.442444222843e-01 true resid norm > >>> 1.003356247696e+02 ||r(i)||/||b|| 1.112966720375e-05 > >>>>>> 2 KSP preconditioned resid norm 3.267453132529e-07 true resid norm > >>> 3.216722968300e+01 ||r(i)||/||b|| 3.568130084011e-06 > >>>>>> 3 KSP preconditioned resid norm 1.155046883816e-11 true resid norm > >>> 3.234460376820e+01 ||r(i)||/||b|| 3.587805194854e-06 > >>>>>> Linear solve converged due to CONVERGED_ATOL iterations 3 > >>>>>> KSP Object: 4 MPI processes > >>>>>> type: gmres > >>>>>> GMRES: restart=1000, using Modified Gram-Schmidt > >>> Orthogonalization > >>>>>> GMRES: happy breakdown tolerance 1e-30 > >>>>>> maximum iterations=1000, initial guess is zero > >>>>>> tolerances: relative=1e-20, absolute=1e-09, divergence=10000 > >>>>>> left preconditioning > >>>>>> using PRECONDITIONED norm type for convergence test > >>>>>> PC Object: 4 MPI processes > >>>>>> type: lu > >>>>>> LU: out-of-place factorization > >>>>>> tolerance for zero pivot 2.22045e-14 > >>>>>> matrix ordering: natural > >>>>>> factor fill ratio given 0, needed 0 > >>>>>> Factored matrix follows: > >>>>>> Mat Object: 4 MPI processes > >>>>>> type: mpiaij > >>>>>> rows=973051, cols=973051 > >>>>>> package used to perform factorization: pastix > >>>>>> Error : 3.24786e-14 > >>>>>> total: nonzeros=0, allocated nonzeros=0 > >>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>> PaStiX run parameters: > >>>>>> Matrix type : Unsymmetric > >>>>>> Level of printing (0,1,2): 0 > >>>>>> Number of refinements iterations : 3 > >>>>>> Error : 3.24786e-14 > >>>>>> linear system matrix = precond matrix: > >>>>>> Mat Object: 4 MPI processes > >>>>>> type: mpiaij > >>>>>> rows=973051, cols=973051 > >>>>>> Error : 3.24786e-14 > >>>>>> total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 > >>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>> using I-node (on process 0) routines: found 78749 nodes, limit > >>> used is 5 > >>>>>> Error : 3.24786e-14 > >>>>>> > >>>>>> It doesn't do as you said. Something is not right here. I will look > >>> in depth. > >>>>>> > >>>>>> Giang > >>>>>> > >>>>>> On Mon, Apr 24, 2017 at 8:21 PM, Barry Smith <bsm...@mcs.anl.gov> > >>> wrote: > >>>>>> > >>>>>>> On Apr 24, 2017, at 12:47 PM, Hoang Giang Bui <hgbk2...@gmail.com> > >>> wrote: > >>>>>>> > >>>>>>> Good catch. I get this for the very first step, maybe at that time > >>> the rhs_w is zero. > >>>>>> > >>>>>> With the multiplicative composition the right hand side of the > >>> second solve is the initial right hand side of the second solve minus > >>> A_10*x where x is the solution to the first sub solve and A_10 is the > lower > >>> left block of the outer matrix. So unless both the initial right hand > side > >>> has a zero for the second block and A_10 is identically zero the right > hand > >>> side for the second sub solve should not be zero. Is A_10 == 0? > >>>>>> > >>>>>> > >>>>>>> In the later step, it shows 2 step convergence > >>>>>>> > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 3.165886479830e+04 > >>>>>>> 1 KSP Residual norm 2.905922877684e-01 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 2.397669419027e-01 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 0 KSP preconditioned resid norm 3.165886479920e+04 true resid > >>> norm 7.963616922323e+05 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 9.999891813771e-01 > >>>>>>> 1 KSP Residual norm 1.512000395579e-05 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 8.192702188243e-06 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 1 KSP preconditioned resid norm 5.252183822848e-02 true resid > >>> norm 7.135927677844e+04 ||r(i)||/||b|| 8.960661653427e-02 > >>>>>> > >>>>>> The outer residual norms are still wonky, the preconditioned > >>> residual norm goes from 3.165886479920e+04 to 5.252183822848e-02 which > is a > >>> huge drop but the 7.963616922323e+05 drops very much less > >>> 7.135927677844e+04. This is not normal. > >>>>>> > >>>>>> What if you just use -pc_type lu for the entire system (no > >>> fieldsplit), does the true residual drop to almost zero in the first > >>> iteration (as it should?). Send the output. > >>>>>> > >>>>>> > >>>>>> > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 6.946213936597e-01 > >>>>>>> 1 KSP Residual norm 1.195514007343e-05 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 1.025694497535e+00 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 2 KSP preconditioned resid norm 8.785709535405e-03 true resid > >>> norm 1.419341799277e+04 ||r(i)||/||b|| 1.782282866091e-02 > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 7.255149996405e-01 > >>>>>>> 1 KSP Residual norm 6.583512434218e-06 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 1.015229700337e+00 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 3 KSP preconditioned resid norm 7.110407712709e-04 true resid > >>> norm 5.284940654154e+02 ||r(i)||/||b|| 6.636357205153e-04 > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 3.512243341400e-01 > >>>>>>> 1 KSP Residual norm 2.032490351200e-06 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 1.282327290982e+00 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 4 KSP preconditioned resid norm 3.482036620521e-05 true resid > >>> norm 4.291231924307e+01 ||r(i)||/||b|| 5.388546393133e-05 > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 3.423609338053e-01 > >>>>>>> 1 KSP Residual norm 4.213703301972e-07 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 1.157384757538e+00 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 5 KSP preconditioned resid norm 1.203470314534e-06 true resid > >>> norm 4.544956156267e+00 ||r(i)||/||b|| 5.707150658550e-06 > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 3.838596289995e-01 > >>>>>>> 1 KSP Residual norm 9.927864176103e-08 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 1.066298905618e+00 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 6 KSP preconditioned resid norm 3.331619244266e-08 true resid > >>> norm 2.821511729024e+00 ||r(i)||/||b|| 3.543002829675e-06 > >>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>> 0 KSP Residual norm 4.624964188094e-01 > >>>>>>> 1 KSP Residual norm 6.418229775372e-08 > >>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>> 0 KSP Residual norm 9.800784311614e-01 > >>>>>>> 1 KSP Residual norm 0.000000000000e+00 > >>>>>>> 7 KSP preconditioned resid norm 8.788046233297e-10 true resid > >>> norm 2.849209671705e+00 ||r(i)||/||b|| 3.577783436215e-06 > >>>>>>> Linear solve converged due to CONVERGED_ATOL iterations 7 > >>>>>>> > >>>>>>> The outer operator is an explicit matrix. > >>>>>>> > >>>>>>> Giang > >>>>>>> > >>>>>>> On Mon, Apr 24, 2017 at 7:32 PM, Barry Smith <bsm...@mcs.anl.gov> > >>> wrote: > >>>>>>> > >>>>>>>> On Apr 24, 2017, at 3:16 AM, Hoang Giang Bui <hgbk2...@gmail.com> > >>> wrote: > >>>>>>>> > >>>>>>>> Thanks Barry, trying with -fieldsplit_u_type lu gives better > >>> convergence. I still used 4 procs though, probably with 1 proc it > should > >>> also be the same. > >>>>>>>> > >>>>>>>> The u block used a Nitsche-type operator to connect two > >>> non-matching domains. I don't think it will leave some rigid body > motion > >>> leads to not sufficient constraints. Maybe you have other idea? > >>>>>>>> > >>>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>>> 0 KSP Residual norm 3.129067184300e+05 > >>>>>>>> 1 KSP Residual norm 5.906261468196e-01 > >>>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>>>> > >>>>>>> ^^^^ something is wrong here. The sub solve should not be > >>> starting with a 0 residual (this means the right hand side for this sub > >>> solve is zero which it should not be). > >>>>>>> > >>>>>>>> FieldSplit with MULTIPLICATIVE composition: total splits = 2 > >>>>>>> > >>>>>>> > >>>>>>> How are you providing the outer operator? As an explicit matrix > >>> or with some shell matrix? > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>>> 0 KSP preconditioned resid norm 3.129067184300e+05 true resid > >>> norm 9.015150492169e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>>> 0 KSP Residual norm 9.999955993437e-01 > >>>>>>>> 1 KSP Residual norm 4.019774691831e-06 > >>>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>>>>> 1 KSP preconditioned resid norm 5.003913641475e-01 true resid > >>> norm 4.692996324114e+01 ||r(i)||/||b|| 5.205677185522e-06 > >>>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>>> 0 KSP Residual norm 1.000012180204e+00 > >>>>>>>> 1 KSP Residual norm 1.017367950422e-05 > >>>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>>>>> 2 KSP preconditioned resid norm 2.330910333756e-07 true resid > >>> norm 3.474855463983e+01 ||r(i)||/||b|| 3.854461960453e-06 > >>>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>>> 0 KSP Residual norm 1.000004200085e+00 > >>>>>>>> 1 KSP Residual norm 6.231613102458e-06 > >>>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>>>>> 3 KSP preconditioned resid norm 8.671259838389e-11 true resid > >>> norm 3.545103468011e+01 ||r(i)||/||b|| 3.932384125024e-06 > >>>>>>>> Linear solve converged due to CONVERGED_ATOL iterations 3 > >>>>>>>> KSP Object: 4 MPI processes > >>>>>>>> type: gmres > >>>>>>>> GMRES: restart=1000, using Modified Gram-Schmidt > >>> Orthogonalization > >>>>>>>> GMRES: happy breakdown tolerance 1e-30 > >>>>>>>> maximum iterations=1000, initial guess is zero > >>>>>>>> tolerances: relative=1e-20, absolute=1e-09, divergence=10000 > >>>>>>>> left preconditioning > >>>>>>>> using PRECONDITIONED norm type for convergence test > >>>>>>>> PC Object: 4 MPI processes > >>>>>>>> type: fieldsplit > >>>>>>>> FieldSplit with MULTIPLICATIVE composition: total splits = 2 > >>>>>>>> Solver info for each split is in the following KSP objects: > >>>>>>>> Split number 0 Defined by IS > >>>>>>>> KSP Object: (fieldsplit_u_) 4 MPI processes > >>>>>>>> type: richardson > >>>>>>>> Richardson: damping factor=1 > >>>>>>>> maximum iterations=1, initial guess is zero > >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, > >>> divergence=10000 > >>>>>>>> left preconditioning > >>>>>>>> using PRECONDITIONED norm type for convergence test > >>>>>>>> PC Object: (fieldsplit_u_) 4 MPI processes > >>>>>>>> type: lu > >>>>>>>> LU: out-of-place factorization > >>>>>>>> tolerance for zero pivot 2.22045e-14 > >>>>>>>> matrix ordering: natural > >>>>>>>> factor fill ratio given 0, needed 0 > >>>>>>>> Factored matrix follows: > >>>>>>>> Mat Object: 4 MPI processes > >>>>>>>> type: mpiaij > >>>>>>>> rows=938910, cols=938910 > >>>>>>>> package used to perform factorization: pastix > >>>>>>>> total: nonzeros=0, allocated nonzeros=0 > >>>>>>>> Error : 3.36878e-14 > >>>>>>>> total number of mallocs used during MatSetValues calls > >>> =0 > >>>>>>>> PaStiX run parameters: > >>>>>>>> Matrix type : Unsymmetric > >>>>>>>> Level of printing (0,1,2): 0 > >>>>>>>> Number of refinements iterations : 3 > >>>>>>>> Error : 3.36878e-14 > >>>>>>>> linear system matrix = precond matrix: > >>>>>>>> Mat Object: (fieldsplit_u_) 4 MPI processes > >>>>>>>> type: mpiaij > >>>>>>>> rows=938910, cols=938910, bs=3 > >>>>>>>> Error : 3.36878e-14 > >>>>>>>> Error : 3.36878e-14 > >>>>>>>> total: nonzeros=8.60906e+07, allocated > >>> nonzeros=8.60906e+07 > >>>>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>>>> using I-node (on process 0) routines: found 78749 > >>> nodes, limit used is 5 > >>>>>>>> Split number 1 Defined by IS > >>>>>>>> KSP Object: (fieldsplit_wp_) 4 MPI processes > >>>>>>>> type: richardson > >>>>>>>> Richardson: damping factor=1 > >>>>>>>> maximum iterations=1, initial guess is zero > >>>>>>>> tolerances: relative=1e-05, absolute=1e-50, > >>> divergence=10000 > >>>>>>>> left preconditioning > >>>>>>>> using PRECONDITIONED norm type for convergence test > >>>>>>>> PC Object: (fieldsplit_wp_) 4 MPI processes > >>>>>>>> type: lu > >>>>>>>> LU: out-of-place factorization > >>>>>>>> tolerance for zero pivot 2.22045e-14 > >>>>>>>> matrix ordering: natural > >>>>>>>> factor fill ratio given 0, needed 0 > >>>>>>>> Factored matrix follows: > >>>>>>>> Mat Object: 4 MPI processes > >>>>>>>> type: mpiaij > >>>>>>>> rows=34141, cols=34141 > >>>>>>>> package used to perform factorization: pastix > >>>>>>>> Error : -nan > >>>>>>>> Error : -nan > >>>>>>>> Error : -nan > >>>>>>>> total: nonzeros=0, allocated nonzeros=0 > >>>>>>>> total number of mallocs used during MatSetValues > >>> calls =0 > >>>>>>>> PaStiX run parameters: > >>>>>>>> Matrix type : Symmetric > >>>>>>>> Level of printing (0,1,2): 0 > >>>>>>>> Number of refinements iterations : 0 > >>>>>>>> Error : -nan > >>>>>>>> linear system matrix = precond matrix: > >>>>>>>> Mat Object: (fieldsplit_wp_) 4 MPI processes > >>>>>>>> type: mpiaij > >>>>>>>> rows=34141, cols=34141 > >>>>>>>> total: nonzeros=485655, allocated nonzeros=485655 > >>>>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>>>> not using I-node (on process 0) routines > >>>>>>>> linear system matrix = precond matrix: > >>>>>>>> Mat Object: 4 MPI processes > >>>>>>>> type: mpiaij > >>>>>>>> rows=973051, cols=973051 > >>>>>>>> total: nonzeros=9.90037e+07, allocated nonzeros=9.90037e+07 > >>>>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>>>> using I-node (on process 0) routines: found 78749 nodes, > >>> limit used is 5 > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> Giang > >>>>>>>> > >>>>>>>> On Sun, Apr 23, 2017 at 10:19 PM, Barry Smith < > >>> bsm...@mcs.anl.gov> wrote: > >>>>>>>> > >>>>>>>>> On Apr 23, 2017, at 2:42 PM, Hoang Giang Bui < > >>> hgbk2...@gmail.com> wrote: > >>>>>>>>> > >>>>>>>>> Dear Matt/Barry > >>>>>>>>> > >>>>>>>>> With your options, it results in > >>>>>>>>> > >>>>>>>>> 0 KSP preconditioned resid norm 1.106709687386e+31 true > >>> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>>>> 0 KSP Residual norm 2.407308987203e+36 > >>>>>>>>> 1 KSP Residual norm 5.797185652683e+72 > >>>>>>>> > >>>>>>>> It looks like Matt is right, hypre is seemly producing useless > >>> garbage. > >>>>>>>> > >>>>>>>> First how do things run on one process. If you have similar > >>> problems then debug on one process (debugging any kind of problem is > always > >>> far easy on one process). > >>>>>>>> > >>>>>>>> First run with -fieldsplit_u_type lu (instead of using hypre) to > >>> see if that works or also produces something bad. > >>>>>>>> > >>>>>>>> What is the operator and the boundary conditions for u? It could > >>> be singular. > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>>>>>> ... > >>>>>>>>> 999 KSP preconditioned resid norm 2.920157329174e+12 true > >>> resid norm 9.015683504616e+06 ||r(i)||/||b|| 1.000059124102e+00 > >>>>>>>>> Residual norms for fieldsplit_u_ solve. > >>>>>>>>> 0 KSP Residual norm 1.533726746719e+36 > >>>>>>>>> 1 KSP Residual norm 3.692757392261e+72 > >>>>>>>>> Residual norms for fieldsplit_wp_ solve. > >>>>>>>>> 0 KSP Residual norm 0.000000000000e+00 > >>>>>>>>> > >>>>>>>>> Do you suggest that the pastix solver for the "wp" block > >>> encounters small pivot? In addition, seem like the "u" block is also > >>> singular. > >>>>>>>>> > >>>>>>>>> Giang > >>>>>>>>> > >>>>>>>>> On Sun, Apr 23, 2017 at 7:39 PM, Barry Smith < > >>> bsm...@mcs.anl.gov> wrote: > >>>>>>>>> > >>>>>>>>> Huge preconditioned norms but normal unpreconditioned norms > >>> almost always come from a very small pivot in an LU or ILU > factorization. > >>>>>>>>> > >>>>>>>>> The first thing to do is monitor the two sub solves. Run > >>> with the additional options -fieldsplit_u_ksp_type richardson > >>> -fieldsplit_u_ksp_monitor -fieldsplit_u_ksp_max_it 1 > >>> -fieldsplit_wp_ksp_type richardson -fieldsplit_wp_ksp_monitor > >>> -fieldsplit_wp_ksp_max_it 1 > >>>>>>>>> > >>>>>>>>>> On Apr 23, 2017, at 12:22 PM, Hoang Giang Bui < > >>> hgbk2...@gmail.com> wrote: > >>>>>>>>>> > >>>>>>>>>> Hello > >>>>>>>>>> > >>>>>>>>>> I encountered a strange convergence behavior that I have > >>> trouble to understand > >>>>>>>>>> > >>>>>>>>>> KSPSetFromOptions completed > >>>>>>>>>> 0 KSP preconditioned resid norm 1.106709687386e+31 true > >>> resid norm 9.015150491938e+06 ||r(i)||/||b|| 1.000000000000e+00 > >>>>>>>>>> 1 KSP preconditioned resid norm 2.933141742664e+29 true > >>> resid norm 9.015152282123e+06 ||r(i)||/||b|| 1.000000198575e+00 > >>>>>>>>>> 2 KSP preconditioned resid norm 9.686409637174e+16 true > >>> resid norm 9.015354521944e+06 ||r(i)||/||b|| 1.000022631902e+00 > >>>>>>>>>> 3 KSP preconditioned resid norm 4.219243615809e+15 true > >>> resid norm 9.017157702420e+06 ||r(i)||/||b|| 1.000222648583e+00 > >>>>>>>>>> ..... > >>>>>>>>>> 999 KSP preconditioned resid norm 3.043754298076e+12 true > >>> resid norm 9.015425041089e+06 ||r(i)||/||b|| 1.000030454195e+00 > >>>>>>>>>> 1000 KSP preconditioned resid norm 3.043000287819e+12 true > >>> resid norm 9.015424313455e+06 ||r(i)||/||b|| 1.000030373483e+00 > >>>>>>>>>> Linear solve did not converge due to DIVERGED_ITS iterations > >>> 1000 > >>>>>>>>>> KSP Object: 4 MPI processes > >>>>>>>>>> type: gmres > >>>>>>>>>> GMRES: restart=1000, using Modified Gram-Schmidt > >>> Orthogonalization > >>>>>>>>>> GMRES: happy breakdown tolerance 1e-30 > >>>>>>>>>> maximum iterations=1000, initial guess is zero > >>>>>>>>>> tolerances: relative=1e-20, absolute=1e-09, > >>> divergence=10000 > >>>>>>>>>> left preconditioning > >>>>>>>>>> using PRECONDITIONED norm type for convergence test > >>>>>>>>>> PC Object: 4 MPI processes > >>>>>>>>>> type: fieldsplit > >>>>>>>>>> FieldSplit with MULTIPLICATIVE composition: total splits > >>> = 2 > >>>>>>>>>> Solver info for each split is in the following KSP > >>> objects: > >>>>>>>>>> Split number 0 Defined by IS > >>>>>>>>>> KSP Object: (fieldsplit_u_) 4 MPI processes > >>>>>>>>>> type: preonly > >>>>>>>>>> maximum iterations=10000, initial guess is zero > >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, > >>> divergence=10000 > >>>>>>>>>> left preconditioning > >>>>>>>>>> using NONE norm type for convergence test > >>>>>>>>>> PC Object: (fieldsplit_u_) 4 MPI processes > >>>>>>>>>> type: hypre > >>>>>>>>>> HYPRE BoomerAMG preconditioning > >>>>>>>>>> HYPRE BoomerAMG: Cycle type V > >>>>>>>>>> HYPRE BoomerAMG: Maximum number of levels 25 > >>>>>>>>>> HYPRE BoomerAMG: Maximum number of iterations PER > >>> hypre call 1 > >>>>>>>>>> HYPRE BoomerAMG: Convergence tolerance PER hypre > >>> call 0 > >>>>>>>>>> HYPRE BoomerAMG: Threshold for strong coupling 0.6 > >>>>>>>>>> HYPRE BoomerAMG: Interpolation truncation factor 0 > >>>>>>>>>> HYPRE BoomerAMG: Interpolation: max elements per row > >>> 0 > >>>>>>>>>> HYPRE BoomerAMG: Number of levels of aggressive > >>> coarsening 0 > >>>>>>>>>> HYPRE BoomerAMG: Number of paths for aggressive > >>> coarsening 1 > >>>>>>>>>> HYPRE BoomerAMG: Maximum row sums 0.9 > >>>>>>>>>> HYPRE BoomerAMG: Sweeps down 1 > >>>>>>>>>> HYPRE BoomerAMG: Sweeps up 1 > >>>>>>>>>> HYPRE BoomerAMG: Sweeps on coarse 1 > >>>>>>>>>> HYPRE BoomerAMG: Relax down > >>> symmetric-SOR/Jacobi > >>>>>>>>>> HYPRE BoomerAMG: Relax up > >>> symmetric-SOR/Jacobi > >>>>>>>>>> HYPRE BoomerAMG: Relax on coarse > >>> Gaussian-elimination > >>>>>>>>>> HYPRE BoomerAMG: Relax weight (all) 1 > >>>>>>>>>> HYPRE BoomerAMG: Outer relax weight (all) 1 > >>>>>>>>>> HYPRE BoomerAMG: Using CF-relaxation > >>>>>>>>>> HYPRE BoomerAMG: Measure type local > >>>>>>>>>> HYPRE BoomerAMG: Coarsen type PMIS > >>>>>>>>>> HYPRE BoomerAMG: Interpolation type classical > >>>>>>>>>> linear system matrix = precond matrix: > >>>>>>>>>> Mat Object: (fieldsplit_u_) 4 MPI processes > >>>>>>>>>> type: mpiaij > >>>>>>>>>> rows=938910, cols=938910, bs=3 > >>>>>>>>>> total: nonzeros=8.60906e+07, allocated > >>> nonzeros=8.60906e+07 > >>>>>>>>>> total number of mallocs used during MatSetValues > >>> calls =0 > >>>>>>>>>> using I-node (on process 0) routines: found 78749 > >>> nodes, limit used is 5 > >>>>>>>>>> Split number 1 Defined by IS > >>>>>>>>>> KSP Object: (fieldsplit_wp_) 4 MPI processes > >>>>>>>>>> type: preonly > >>>>>>>>>> maximum iterations=10000, initial guess is zero > >>>>>>>>>> tolerances: relative=1e-05, absolute=1e-50, > >>> divergence=10000 > >>>>>>>>>> left preconditioning > >>>>>>>>>> using NONE norm type for convergence test > >>>>>>>>>> PC Object: (fieldsplit_wp_) 4 MPI processes > >>>>>>>>>> type: lu > >>>>>>>>>> LU: out-of-place factorization > >>>>>>>>>> tolerance for zero pivot 2.22045e-14 > >>>>>>>>>> matrix ordering: natural > >>>>>>>>>> factor fill ratio given 0, needed 0 > >>>>>>>>>> Factored matrix follows: > >>>>>>>>>> Mat Object: 4 MPI processes > >>>>>>>>>> type: mpiaij > >>>>>>>>>> rows=34141, cols=34141 > >>>>>>>>>> package used to perform factorization: pastix > >>>>>>>>>> Error : -nan > >>>>>>>>>> Error : -nan > >>>>>>>>>> total: nonzeros=0, allocated nonzeros=0 > >>>>>>>>>> Error : -nan > >>>>>>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>>>>>> PaStiX run parameters: > >>>>>>>>>> Matrix type : > >>> Symmetric > >>>>>>>>>> Level of printing (0,1,2): 0 > >>>>>>>>>> Number of refinements iterations : 0 > >>>>>>>>>> Error : -nan > >>>>>>>>>> linear system matrix = precond matrix: > >>>>>>>>>> Mat Object: (fieldsplit_wp_) 4 MPI processes > >>>>>>>>>> type: mpiaij > >>>>>>>>>> rows=34141, cols=34141 > >>>>>>>>>> total: nonzeros=485655, allocated nonzeros=485655 > >>>>>>>>>> total number of mallocs used during MatSetValues > >>> calls =0 > >>>>>>>>>> not using I-node (on process 0) routines > >>>>>>>>>> linear system matrix = precond matrix: > >>>>>>>>>> Mat Object: 4 MPI processes > >>>>>>>>>> type: mpiaij > >>>>>>>>>> rows=973051, cols=973051 > >>>>>>>>>> total: nonzeros=9.90037e+07, allocated > >>> nonzeros=9.90037e+07 > >>>>>>>>>> total number of mallocs used during MatSetValues calls =0 > >>>>>>>>>> using I-node (on process 0) routines: found 78749 > >>> nodes, limit used is 5 > >>>>>>>>>> > >>>>>>>>>> The pattern of convergence gives a hint that this system is > >>> somehow bad/singular. But I don't know why the preconditioned error > goes up > >>> too high. Anyone has an idea? > >>>>>>>>>> > >>>>>>>>>> Best regards > >>>>>>>>>> Giang Bui > >>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>> > >>>>> > >>>> > >>>> > >>> > >>> > >