Re: [petsc-dev] Petsc "make test" have more failures for --with-openmp=1

Eric Chamberland Thu, 18 Mar 2021 18:46:49 -0700

Hi again,

OK, I just saw that some matrices have rows of zeros for 3D Hermite DOFs (e.g. the du/dz derivatives) when they are used on a 2D planar mesh...


So my last problem with the hypre smoother is "normal" (expected).

However, just to play with one of these matrices, I tried an "LU" solve with the MUMPS icntl_24 option activated on the global system: fine, it works.

Then I tried to switch to GAMG with MUMPS for the coarse_sub level, but it seems my icntl_24 option is then ignored and I don't know why...
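
To make the two setups concrete, here is a minimal sketch of how such option names are composed from solver prefixes. The helper `prefixed` is hypothetical (it is not part of PETSc), and the option lists are assumptions for illustration; only the `mg_coarse_sub_` icntl_24 option name is taken verbatim from this run. The outer `pc_type gamg` option is omitted for brevity.

```python
# Sketch only: compose PETSc command-line option names from solver prefixes.
# `prefixed` is a hypothetical helper, not part of PETSc.

BASE = "Options_ProjectionL2_0"  # prefix of the outer KSP/PC in this run


def prefixed(prefix, options):
    """Return ['-<prefix><name>', '<value>', ...] for each (name, value) pair."""
    argv = []
    for name, value in options:
        argv += ["-" + prefix + name, str(value)]
    return argv


# Assumed options for the direct solve on the global system.
global_lu = prefixed(BASE, [
    ("pc_type", "lu"),
    ("pc_factor_mat_solver_type", "mumps"),
    ("mat_mumps_icntl_24", 1),
])

# Assumed options for MUMPS on the GAMG coarse sub-block.
coarse_sub = prefixed(BASE + "mg_coarse_sub_", [
    ("pc_type", "lu"),
    ("pc_factor_mat_solver_type", "mumps"),
    ("mat_mumps_icntl_24", 1),
])

print(" ".join(global_lu))
print(" ".join(coarse_sub))
```

The last name in `coarse_sub` is the one that PETSc reports as left over further down.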


See my KSP:

KSP Object: (Options_ProjectionL2_0) 1 MPI processes
  type: bcgs
  maximum iterations=10000, initial guess is zero
  tolerances:  relative=1e-15, absolute=1e-15, divergence=1e+12
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: (Options_ProjectionL2_0) 1 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=2 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level =
        Threshold scaling factor for each level not specified = 1.
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 1
          Number smoothing steps 1
        Complexity:    grid = 1.09756
  Coarse grid solver -- level -------------------------------
    KSP Object: (Options_ProjectionL2_0mg_coarse_) 1 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (Options_ProjectionL2_0mg_coarse_) 1 MPI processes
      type: bjacobi
        number of blocks = 1
        Local solver is the same for all blocks, as in the following KSP and PC objects on rank 0:
      KSP Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (Options_ProjectionL2_0mg_coarse_sub_) 1 MPI processes
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 0., needed 0.
            Factored matrix follows:
              Mat Object: 1 MPI processes
                type: mumps
                rows=8, cols=8
                package used to perform factorization: mumps
                total: nonzeros=64, allocated nonzeros=64
                  MUMPS run parameters:
                    SYM (matrix type):                   0
                    PAR (host participation):            1
                    ICNTL(1) (output for error):         6
                    ICNTL(2) (output of diagnostic msg): 0
                    ICNTL(3) (output for global info):   0
                    ICNTL(4) (level of printing):        0
                    ICNTL(5) (input mat struct):         0
                    ICNTL(6) (matrix prescaling):        7
                    ICNTL(7) (sequential matrix ordering):7
                    ICNTL(8) (scaling strategy):        77
                    ICNTL(10) (max num of refinements):  0
                    ICNTL(11) (error analysis):          0
                    ICNTL(12) (efficiency control): 1
                    ICNTL(13) (sequential factorization of the root node): 0
                    ICNTL(14) (percentage of estimated workspace increase): 20
                    ICNTL(18) (input mat struct): 0
                    ICNTL(19) (Schur complement info): 0
                    ICNTL(20) (RHS sparse pattern): 0
                    ICNTL(21) (solution struct): 0
                    ICNTL(22) (in-core/out-of-core facility): 0
                    ICNTL(23) (max size of memory can be allocated locally):0
                    ICNTL(24) (detection of null pivot rows): 0
                    ICNTL(25) (computation of a null space basis): 0
                    ICNTL(26) (Schur options for RHS or solution): 0
                    ICNTL(27) (blocking size for multiple RHS): -32
                    ICNTL(28) (use parallel or sequential ordering): 1
                    ICNTL(29) (parallel ordering): 0
                    ICNTL(30) (user-specified set of entries in inv(A)): 0
                    ICNTL(31) (factors is discarded in the solve phase): 0
                    ICNTL(33) (compute determinant): 0
                    ICNTL(35) (activate BLR based factorization): 0
                    ICNTL(36) (choice of BLR factorization variant): 0
                    ICNTL(38) (estimated compression rate of LU factors): 333
                    CNTL(1) (relative pivoting threshold): 0.01
                    CNTL(2) (stopping criterion of refinement):1.49012e-08
                    CNTL(3) (absolute pivoting threshold):      0.
                    CNTL(4) (value of static pivoting): -1.
                    CNTL(5) (fixation for null pivots):         0.
                    CNTL(7) (dropping parameter for BLR):       0.
                    RINFO(1) (local estimated flops for the elimination after analysis):
                      [0] 308.
                    RINFO(2) (local estimated flops for the assembly after factorization):
                      [0]  0.
                    RINFO(3) (local estimated flops for the elimination after factorization):
                      [0]  0.
                    INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization):
                    [0] 0
                    INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization):
                      [0] 0
                    INFO(23) (num of pivots eliminated on this processor after factorization):
                      [0] 6
                    RINFOG(1) (global estimated flops for the elimination after analysis): 308.
                    RINFOG(2) (global estimated flops for the assembly after factorization): 0.
                    RINFOG(3) (global estimated flops for the elimination after factorization): 0.
                    (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
                    INFOG(3) (estimated real workspace for factors on all processors after analysis): 64
                    INFOG(4) (estimated integer workspace for factors on all processors after analysis): 35
                    INFOG(5) (estimated maximum front size in the complete tree): 8
                    INFOG(6) (number of nodes in the complete tree): 1
                    INFOG(7) (ordering option effectively use after analysis): 2
                    INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
                    INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 64
                    INFOG(10) (total integer space store the matrix factors after factorization): 35
                    INFOG(11) (order of largest frontal matrix after factorization): 8
                    INFOG(12) (number of off-diagonal pivots): 0
                    INFOG(13) (number of delayed pivots after factorization): 0
                    INFOG(14) (number of memory compress after factorization): 0
                    INFOG(15) (number of steps of iterative refinement after solution): 0
                    INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 0
                    INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 0
                    INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 0
                    INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 0
                    INFOG(20) (estimated number of entries in the factors): 64
                    INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 0
                    INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 0
                    INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
                    INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
                    INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
                    INFOG(28) (after factorization: number of null pivots encountered): 0
                    INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 0
                    INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 0, 0
                    INFOG(32) (after analysis: type of analysis done): 1
                    INFOG(33) (value used for ICNTL(8)): 7
                    INFOG(34) (exponent of the determinant if determinant is requested): 0
                    INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 0
                    INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
                    INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
                    INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
                    INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          type: seqaij
          rows=8, cols=8, bs=4
          total: nonzeros=64, allocated nonzeros=64
          total number of mallocs used during MatSetValues calls=0
            using I-node routines: found 2 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 1 MPI processes
        type: seqaij
        rows=8, cols=8, bs=4
        total: nonzeros=64, allocated nonzeros=64
        total number of mallocs used during MatSetValues calls=0
          using I-node routines: found 2 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
    KSP Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
      type: chebyshev
        eigenvalue estimates used:  min = 0., max = 0.
        eigenvalues estimate via gmres min 0., max 0.
        eigenvalues estimated using gmres with translations [0. 0.1;0. 1.1]
        KSP Object: (Options_ProjectionL2_0mg_levels_1_esteig_) 1 MPI processes
          type: gmres
            restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
            happy breakdown tolerance 1e-30
          maximum iterations=10, initial guess is zero
          tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
          left preconditioning
          using PRECONDITIONED norm type for convergence test
        PC Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
          type: sor
            type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
          linear system matrix = precond matrix:
          Mat Object: (Options_ProjectionL2_0) 1 MPI processes
            type: seqaij
            rows=36, cols=36, bs=4
            total: nonzeros=656, allocated nonzeros=656
            total number of mallocs used during MatSetValues calls=0
              using I-node routines: found 9 nodes, limit used is 5
        estimating eigenvalues using noisy right hand side
      maximum iterations=2, nonzero initial guess
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (Options_ProjectionL2_0mg_levels_1_) 1 MPI processes
      type: sor
        type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
      linear system matrix = precond matrix:
      Mat Object: (Options_ProjectionL2_0) 1 MPI processes
        type: seqaij
        rows=36, cols=36, bs=4
        total: nonzeros=656, allocated nonzeros=656
        total number of mallocs used during MatSetValues calls=0
          using I-node routines: found 9 nodes, limit used is 5
  Up solver (post-smoother) same as down solver (pre-smoother)
  linear system matrix = precond matrix:
  Mat Object: (Options_ProjectionL2_0) 1 MPI processes
    type: seqaij
    rows=36, cols=36, bs=4
    total: nonzeros=656, allocated nonzeros=656
    total number of mallocs used during MatSetValues calls=0
      using I-node routines: found 9 nodes, limit used is 5

but I have this option left:

Option left: name:-Options_ProjectionL2_0mg_coarse_sub_mat_mumps_icntl_24 value: 1


and as you can see above, I end up with:

ICNTL(24) (detection of null pivot rows): 0


which is fatal in my case...
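
The mismatch between the left-over option and the reported ICNTL value can also be checked mechanically. The following is a minimal sketch, assuming the view and the "Option left" text have been captured as strings in the format shown in this message; the function name `ignored_icntl` and both regular expressions are illustrative assumptions, and other PETSc versions may print these lines differently.

```python
import re

# Sketch only: compare left-over -...mat_mumps_icntl_N options with the
# ICNTL(N) values that the solver view reports. Assumes a single MUMPS
# factorization in the view; with several, the last reported value wins.

ICNTL_IN_VIEW = re.compile(r"ICNTL\((\d+)\)\s*\(.*?\):\s*(-?\d+)")
ICNTL_LEFT = re.compile(r"name:-\S*mat_mumps_icntl_(\d+)\s+value:\s*(-?\d+)")


def ignored_icntl(view_text, options_left_text):
    """Return {N: (requested, reported)} where the two values differ."""
    reported = {int(n): int(v) for n, v in ICNTL_IN_VIEW.findall(view_text)}
    ignored = {}
    for n, v in ICNTL_LEFT.findall(options_left_text):
        n, v = int(n), int(v)
        if n in reported and reported[n] != v:
            ignored[n] = (v, reported[n])
    return ignored


view = "ICNTL(24) (detection of null pivot rows): 0"
left = ("Option left: name:-Options_ProjectionL2_0mg_coarse_sub_"
        "mat_mumps_icntl_24 value: 1")
print(ignored_icntl(view, left))  # prints {24: (1, 0)}
```

An empty result would mean that every left-over icntl option already matches the value in the view; here the requested 1 against the reported 0 is exactly the symptom described above.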

Can you see what I did wrong?

Thanks,

Eric
