Hi again,

OK, I just saw that some matrices have rows of zeros for the 3D Hermite DOFs (e.g. the du/dz derivatives) when they are used on a 2D planar mesh...

So my previous problem with the hypre smoother is "normal".

However, just to play with one of these matrices, I tried an "LU" with the MUMPS icntl_24 option activated on the global system: fine, it works.

Then I tried to switch to GAMG with MUMPS for the coarse_sub level, but it seems my icntl_24 option is then ignored, and I don't know why...
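
For reference, here is a minimal sketch of how the prefixed option names in play are composed. PETSc forms an option name by concatenating the object's options prefix with the bare option name. Only the icntl_24 entry is taken verbatim from the "Option left" report further down; the other three entries are an assumed reconstruction inferred from the KSP view, and `prefixed` is a hypothetical helper, not a PETSc call.

```python
# Sketch only: shows how the prefixed option names are put together.
# Nothing here calls PETSc; it just builds the strings.

BASE_PREFIX = "Options_ProjectionL2_0"               # prefix shown in the KSP view
COARSE_SUB_PREFIX = BASE_PREFIX + "mg_coarse_sub_"   # prefix of the coarse sub-solver


def prefixed(prefix: str, name: str) -> str:
    """Hypothetical helper: command-line form of an option under a prefix."""
    return "-" + prefix + name


# Assumed reconstruction of the options (inferred from the view output),
# except the icntl_24 entry, which matches the reported "Option left" name.
options = {
    prefixed(BASE_PREFIX, "pc_type"): "gamg",
    prefixed(COARSE_SUB_PREFIX, "pc_type"): "lu",
    prefixed(COARSE_SUB_PREFIX, "pc_factor_mat_solver_type"): "mumps",
    prefixed(COARSE_SUB_PREFIX, "mat_mumps_icntl_24"): "1",
}

for name, value in options.items():
    print(name, value)
```

In the view, the KSP and PC objects at the coarse_sub level carry the prefix in parentheses, while the 8x8 coarse Mat objects are printed without one.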

See my KSP:

KSP Object: (Options_ProjectionL2_0) 1 MPI processes
  type: bcgs
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-15, absolute=1e-15, divergence=1e+12
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: (Options_ProjectionL2_0) 1 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=2 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 1
          Number smoothing steps 1
        Complexity:    grid = 1.09756
  Coarse grid solver -- level -------------------------------
    KSP Object: (Options_ProjectionL2_0mg_coarse_) 1 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (Options_ProjectionL2_0mg_coarse_) 1 MPI processes
      type: bjacobi
        number of blocks = 1
        Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0:
      KSP Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 0., needed 0.
            Factored matrix follows:
              Mat Object: 1 MPI processes
                type: mumps
                rows=8, cols=8
                package used to perform factorization: mumps
                total: nonzeros=64, allocated nonzeros=64
                  MUMPS run parameters:
                    SYM (matrix type):                   0
                    PAR (host participation):            1
                    ICNTL(1) (output for error):         6
                    ICNTL(2) (output of diagnostic msg): 0
                    ICNTL(3) (output for global info):   0
                    ICNTL(4) (level of printing):        0
                    ICNTL(5) (input mat struct):         0
                    ICNTL(6) (matrix prescaling):        7
                    ICNTL(7) (sequential matrix ordering):7
                    ICNTL(8) (scaling strategy):        77
                    ICNTL(10) (max num of refinements):  0
                    ICNTL(11) (error analysis):          0
                    ICNTL(12) (efficiency control):                         1
                    ICNTL(13) (sequential factorization of the root node):  0
                    ICNTL(14) (percentage of estimated workspace increase): 20
                    ICNTL(18) (input mat struct):                           0
                    ICNTL(19) (Schur complement info):                      0
                    ICNTL(20) (RHS sparse pattern):                         0
                    ICNTL(21) (solution struct):                            0
                    ICNTL(22) (in-core/out-of-core facility):               0
                    ICNTL(23) (max size of memory can be allocated locally):0
                    ICNTL(24) (detection of null pivot rows):               0
                    ICNTL(25) (computation of a null space basis):          0
                    ICNTL(26) (Schur options for RHS or solution):          0
                    ICNTL(27) (blocking size for multiple RHS):             -32
                    ICNTL(28) (use parallel or sequential ordering):        1
                    ICNTL(29) (parallel ordering):                          0
                    ICNTL(30) (user-specified set of entries in inv(A)):    0
                    ICNTL(31) (factors is discarded in the solve phase):    0
                    ICNTL(33) (compute determinant):                        0
                    ICNTL(35) (activate BLR based factorization):           0
                    ICNTL(36) (choice of BLR factorization variant):        0
                    ICNTL(38) (estimated compression rate of LU factors):   333
                    CNTL(1) (relative pivoting threshold): 0.01
                    CNTL(2) (stopping criterion of refinement): 1.49012e-08
                    CNTL(3) (absolute pivoting threshold):      0.
                    CNTL(4) (value of static pivoting): -1.
                    CNTL(5) (fixation for null pivots):         0.
                    CNTL(7) (dropping parameter for BLR):       0.
                    RINFO(1) (local estimated flops for the elimination after analysis):
                      [0] 308.
                    RINFO(2) (local estimated flops for the assembly after factorization):
                      [0]  0.
                    RINFO(3) (local estimated flops for the elimination after factorization):
                      [0]  0.
                    INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
                    [0] 0
                    INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
                      [0] 0
                    INFO(23) (num of pivots eliminated on this processor after factorization):
                      [0] 6
                    RINFOG(1) (global estimated flops for the elimination after analysis): 308.
                    RINFOG(2) (global estimated flops for the assembly after factorization): 0.
                    RINFOG(3) (global estimated flops for the elimination after factorization): 0.
                    (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
                    INFOG(3) (estimated real workspace for factors on all processors after analysis): 64
                    INFOG(4) (estimated integer workspace for factors on all processors after analysis): 35
                    INFOG(5) (estimated maximum front size in the complete tree): 8
                    INFOG(6) (number of nodes in the complete tree): 1
                    INFOG(7) (ordering option effectively use after analysis): 2
                    INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
                    INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 64
                    INFOG(10) (total integer space store the matrix factors after factorization): 35
                    INFOG(11) (order of largest frontal matrix after factorization): 8
                    INFOG(12) (number of off-diagonal pivots): 0
                    INFOG(13) (number of delayed pivots after factorization): 0
                    INFOG(14) (number of memory compress after factorization): 0
                    INFOG(15) (number of steps of iterative refinement after solution): 0
                    INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0
                    INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0
                    INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0
                    INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0
                    INFOG(20) (estimated number of entries in the factors): 64
                    INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0
                    INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0
                    INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
                    INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
                    INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
                    INFOG(28) (after factorization: number of null pivots encountered): 0
                    INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 0
                    INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0
                    INFOG(32) (after analysis: type of analysis done): 1
                    INFOG(33) (value used for ICNTL(8)): 7
                    INFOG(34) (exponent of the determinant if determinant is requested): 0
                    INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 0
                    INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
                    INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
                    INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
                    INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=8, cols=8, bs=4
          total: nonzeros=64, allocated nonzeros=64
          total number of mallocs used during MatSetValues calls=0
            using I-node routines: found 2 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 1 MPI processes
        type: seqaij
        rows=8, cols=8, bs=4
        total: nonzeros=64, allocated nonzeros=64
        total number of mallocs used during MatSetValues calls=0
          using I-node routines: found 2 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0., max = 0.
        eigenvalues estimate via gmres min 0., max 0.
        eigenvalues estimated using gmres with translations  [0. 0.1; 0. 1.1]
        KSP Object: (Options_ProjectionL2_0mg_levels_1_esteig_) 1 MPI processes
          type: gmres
            restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            happy breakdown tolerance 1e-30
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
        PC Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
          type: sor
            type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
          linear system matrix = precond matrix:
          Mat Object: (Options_ProjectionL2_0) 1 MPI processes
            type: seqaij
            rows=36, cols=36, bs=4
            total: nonzeros=656, allocated nonzeros=656
            total number of mallocs used during MatSetValues calls=0
              using I-node routines: found 9 nodes, limit used is 5
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: (Options_ProjectionL2_0) 1 MPI processes
        type: seqaij
        rows=36, cols=36, bs=4
        total: nonzeros=656, allocated nonzeros=656
        total number of mallocs used during MatSetValues calls=0
          using I-node routines: found 9 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: (Options_ProjectionL2_0) 1 MPI processes
    type: seqaij
    rows=36, cols=36, bs=4
    total: nonzeros=656, allocated nonzeros=656
    total number of mallocs used during MatSetValues calls=0
      using I-node routines: found 9 nodes, limit used is 5

But PETSc reports this option as left (unused):

Option left: name:-Options_ProjectionL2_0mg_coarse_sub_mat_mumps_icntl_24 value: 1

and, as you can see above, I end up with:

                    ICNTL(24) (detection of null pivot rows):               0

which is fatal in my case...

Can you see what I did wrong?

Thanks,

Eric

