Barry,

Option left: name:-mg_coarse_mat_solver_type value: cusparse
I tried this too: Option left: name:-mg_coarse_sub_mat_solver_type value: cusparse

Here is the view. cuda did not get into the factor type:

PC Object: 24 MPI processes
  type: gamg
    type is MULTIPLICATIVE, levels=5 cycles=v
      Cycles per PCApply=1
      Using externally compute Galerkin coarse grid matrices
      GAMG specific options
        Threshold for dropping small values in graph on each level = 0.05 0.025 0.0125
        Threshold scaling factor for each level not specified = 0.5
        AGG specific options
          Symmetric graph false
          Number of levels to square graph 10
          Number smoothing steps 1
        Complexity: grid = 1.14213
  Coarse grid solver -- level -------------------------------
    KSP Object: (mg_coarse_) 24 MPI processes
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
      left preconditioning
      using NONE norm type for convergence test
    PC Object: (mg_coarse_) 24 MPI processes
      type: bjacobi
        number of blocks = 24
        Local solve is same for all blocks, in the following KSP and PC objects:
      KSP Object: (mg_coarse_sub_) 1 MPI processes
        type: preonly
        maximum iterations=1, initial guess is zero
        tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
        left preconditioning
        using NONE norm type for convergence test
      PC Object: (mg_coarse_sub_) 1 MPI processes
        type: lu
          out-of-place factorization
          tolerance for zero pivot 2.22045e-14
          using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
          matrix ordering: nd
          factor fill ratio given 5., needed 1.
            Factored matrix follows:
              Mat Object: 1 MPI processes
                *type: seqaij*
                rows=6, cols=6
                package used to perform factorization: petsc
                total: nonzeros=36, allocated nonzeros=36
                total number of mallocs used during MatSetValues calls =0
                  using I-node routines: found 2 nodes, limit used is 5
        linear system matrix = precond matrix:
        Mat Object: 1 MPI processes
          *type: seqaijcusparse*
          rows=6, cols=6
          total: nonzeros=36, allocated nonzeros=36
          total number of mallocs used during MatSetValues calls =0
            using I-node routines: found 2 nodes, limit used is 5
      linear system matrix = precond matrix:
      Mat Object: 24 MPI processes
        *type: mpiaijcusparse*
        rows=6, cols=6, bs=6
        total: nonzeros=36, allocated nonzeros=36
        total number of mallocs used during MatSetValues calls =0
          using scalable MatPtAP() implementation
          using I-node (on process 0) routines: found 2 nodes, limit used is 5
  Down solver (pre-smoother) on level 1 -------------------------------
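For reference, forcing this programmatically instead of through options would presumably look something like the sketch below (untested; it assumes the outer solver is a KSP named ksp, that this runs after KSPSetUp() so the bjacobi sub-KSPs exist, and that there is one local block):

  /* Sketch only: walk from the outer GAMG PC down to the coarse-level
     block-Jacobi sub-PC (prefix mg_coarse_sub_) and request the cusparse
     LU factorization.  Ordering relative to PCSetUp() is the delicate part
     and is not verified here. */
  KSP            coarse, *subksp;
  PC             pc, coarsepc, subpc;
  PetscInt       nlocal, first;
  PetscErrorCode ierr;

  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);                 /* gamg */
  ierr = PCMGGetCoarseSolve(pc, &coarse);CHKERRQ(ierr);    /* (mg_coarse_) KSP */
  ierr = KSPGetPC(coarse, &coarsepc);CHKERRQ(ierr);        /* bjacobi */
  ierr = PCBJacobiGetSubKSP(coarsepc, &nlocal, &first, &subksp);CHKERRQ(ierr);
  ierr = KSPGetPC(subksp[0], &subpc);CHKERRQ(ierr);        /* (mg_coarse_sub_) lu */
  ierr = PCFactorSetMatSolverType(subpc, MATSOLVERCUSPARSE);CHKERRQ(ierr);

On the options route, the factor package is normally selected through the factor PC's own prefix, i.e. something like -mg_coarse_sub_pc_factor_mat_solver_type cusparse, though that should be verified against the PETSc version in use.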
On Sun, Jul 21, 2019 at 3:58 PM Mark Adams <mfad...@lbl.gov> wrote:

> Barry, I do NOT see communication. This is what made me think it was not running on the GPU. I added print statements and found that MatSolverTypeRegister_CUSPARSE IS called but (what it registers) MatGetFactor_seqaijcusparse_cusparse does NOT get called.
>
> I have a job waiting on the queue. I'll send ksp_view when it runs. I will try -mg_coarse_mat_solver_type cusparse. That is probably the problem. Maybe I should set the coarse grid solver in a more robust way in GAMG, like use the matrix somehow? I currently use PCSetType(pc, PCLU).
>
> I can't get an interactive shell now to run DDT, but I can try stepping through from MatGetFactor to see what it's doing.
>
> Thanks,
> Mark
>
> On Sun, Jul 21, 2019 at 11:14 AM Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>
>> > On Jul 21, 2019, at 8:55 AM, Mark Adams via petsc-dev <petsc-dev@mcs.anl.gov> wrote:
>> >
>> > I am running ex56 with -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse and I see no GPU communication in MatSolve (the serial LU coarse grid solver).
>>
>>    Do you mean to say, you DO see communication?
>>
>>    What does -ksp_view show you? It should show the factor type in the information about the coarse grid solve.
>>
>>    You might need something like -mg_coarse_mat_solver_type cusparse (because it may default to the PETSc one; it may be possible to have it default to cusparse if it exists and the matrix is of type MATSEQAIJCUSPARSE).
>>
>>    The determination of the MatGetFactor() is a bit involved, including pasting together strings and string compares, and it could be finding a CPU factorization.
>>
>>    I could run on one MPI_Rank() in the debugger and put a break point in MatGetFactor() and track along to see what it picks and why. You could do this debugging without GAMG first, just -pc_type lu.
>>
>> > GAMG does set the coarse grid solver to LU manually like this: ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
>>
>>    For parallel runs this won't work using the GPU code and only sequential direct solvers, so it must be using BJACOBI in that case?
>>
>>    Barry
>>
>> > I am thinking the dispatch of the CUDA version of this got dropped somehow.
>> >
>> > I see that this is getting called:
>> >
>> > PETSC_EXTERN PetscErrorCode MatSolverTypeRegister_CUSPARSE(void)
>> > {
>> >   PetscErrorCode ierr;
>> >
>> >   PetscFunctionBegin;
>> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_LU,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
>> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_CHOLESKY,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
>> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_ILU,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
>> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_ICC,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
>> >   PetscFunctionReturn(0);
>> > }
>> >
>> > but MatGetFactor_seqaijcusparse_cusparse is not getting called.
>> >
>> > GAMG does set the coarse grid solver to LU manually like this: ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
>> >
>> > Any ideas?
>> >
>> > Thanks,
>> > Mark
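A quick way to isolate the dispatch question above, independent of GAMG, would be something like the following (a sketch, assuming an assembled matrix A of type MATSEQAIJCUSPARSE on a single rank; variable names are illustrative):

  /* Sketch: request the cusparse LU factorization directly.  If the
     registration done in MatSolverTypeRegister_CUSPARSE is being found,
     this should end up in MatGetFactor_seqaijcusparse_cusparse; letting
     the solver type default to "petsc" would instead return the CPU
     factorization seen in the -ksp_view output above. */
  Mat            F;
  PetscErrorCode ierr;

  ierr = MatGetFactor(A, MATSOLVERCUSPARSE, MAT_FACTOR_LU, &F);CHKERRQ(ierr);

A breakpoint in MatGetFactor(), as suggested above, on that call (or on the one issued by the mg_coarse_sub_ PC) would show which solver-type string is actually being looked up.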