My previous message may have been misleading. To be clear: the problem occurs even though the old matrices are destroyed.

Feimi

On 8/18/21 4:31 PM, Feimi Yu wrote:

Hi Junchao,

Thank you for the suggestion! I'm using the deal.II wrapper dealii::PETScWrappers::PreconditionBase to handle the PETSc preconditioners, and the wrapper destroys the preconditioner when it is reinitialized or goes out of scope. I just double-checked: the following function is called to make sure the old matrices are destroyed:

   void
   PreconditionBase::clear()
   {
     matrix = nullptr;

     if (pc != nullptr)
       {
         PetscErrorCode ierr = PCDestroy(&pc);
         pc                  = nullptr;
         AssertThrow(ierr == 0, ExcPETScError(ierr));
       }
   }

Thanks!

Feimi

On 8/18/21 4:23 PM, Junchao Zhang wrote:



On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu <[email protected]> wrote:

    Hi,

    I was trying to run a simulation with a PETSc-wrapped Hypre
    preconditioner, and encountered this problem:

    [dcs122:133012] Out of resources: all 4095 communicator IDs have
    been used.
    [19]PETSC ERROR: --------------------- Error Message
    --------------------------------------------------------------
    [19]PETSC ERROR: General MPI error
    [19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
    [19]PETSC ERROR: See
    https://www.mcs.anl.gov/petsc/documentation/faq.html for
    trouble shooting.
    [19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
    [19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by
    CFSIfmyu Wed Aug 11 19:51:47 2021
    [19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095
    communicator IDs have been used.
    [18]PETSC ERROR: --------------------- Error Message
    --------------------------------------------------------------
    [18]PETSC ERROR: General MPI error
    [18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
    [18]PETSC ERROR: See
    https://www.mcs.anl.gov/petsc/documentation/faq.html for
    trouble shooting.
    [18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
    [18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by
    CFSIfmyu Wed Aug 11 19:51:47 2021
    [18]PETSC ERROR: Configure options --download-scalapack
    --download-mumps --download-hypre --with-cc=mpicc
    --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0
    --with-debugging=0
    --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
    [18]PETSC ERROR: #1 MatCreate_HYPRE() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
    [18]PETSC ERROR: #2 MatSetType() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
    [18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
    [18]PETSC ERROR: #4 MatConvert() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
    [18]PETSC ERROR: #5 PCSetUp_HYPRE() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
    [18]PETSC ERROR: #6 PCSetUp() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
    Configure options --download-scalapack --download-mumps
    --download-hypre --with-cc=mpicc --with-cxx=mpicxx
    --with-fc=mpif90 --with-cudac=0 --with-debugging=0
    --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
    [19]PETSC ERROR: #1 MatCreate_HYPRE() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
    [19]PETSC ERROR: #2 MatSetType() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
    [19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
    [19]PETSC ERROR: #4 MatConvert() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
    [19]PETSC ERROR: #5 PCSetUp_HYPRE() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
    [19]PETSC ERROR: #6 PCSetUp() at
    /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015

    It seems that the MPI_Comm_dup() call at
    petsc/src/mat/impls/hypre/mhypre.c:2120 caused the problem. Since
    mine is a time-dependent problem, MatCreate_HYPRE() is called
    every time a new system matrix is assembled. The above error
    message is reported after ~4095 calls to MatCreate_HYPRE(), which
    is around 455 time steps in my code. Here is some basic compiler
    information:

Can you destroy the old matrices to free their MPI communicators? Otherwise, you will run into a limitation we already know about.

    IBM Spectrum MPI 10.4.0

    GCC 8.4.1

    I've never had this problem before with the OpenMPI or MPICH
    implementations, so I was wondering whether this can be resolved
    from my end, or whether it is an implementation-specific problem.

    Thanks!

    Feimi
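
A rough sanity check of the numbers in the report above: about 4095 calls over about 455 time steps works out to 9 communicator duplications per step (4095 / 455 = 9). The sketch below is a toy C++ model of a finite pool of communicator IDs. It is not MPI or PETSc code, and it does not explain why IDs are not returned in the real run. The names CommIdPool and steps_until_exhausted, and the figure of 9 duplications per step, are assumptions made only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an MPI library's finite table of communicator IDs.
// Hypothetical class: it only mimics "dup takes an ID, free returns it".
class CommIdPool {
public:
  explicit CommIdPool(std::size_t capacity) : capacity_(capacity) {}

  // Plays the role of MPI_Comm_dup in this model.
  // Returns false when every ID is in use.
  bool dup(std::size_t &id) {
    if (!free_ids_.empty()) {
      id = free_ids_.back();
      free_ids_.pop_back();
      return true;
    }
    if (next_unused_ < capacity_) {
      id = next_unused_++;
      return true;
    }
    return false;
  }

  // Plays the role of MPI_Comm_free in this model.
  void release(std::size_t id) {
    assert(id < capacity_);
    free_ids_.push_back(id);
  }

private:
  std::size_t capacity_;
  std::size_t next_unused_ = 0;
  std::vector<std::size_t> free_ids_;
};

// Runs up to max_steps time steps, each taking dups_per_step IDs.
// Returns the number of steps completed before the pool ran dry,
// or max_steps if it never did.
std::size_t steps_until_exhausted(std::size_t capacity,
                                  std::size_t dups_per_step,
                                  bool free_after_step,
                                  std::size_t max_steps) {
  CommIdPool pool(capacity);
  for (std::size_t step = 0; step < max_steps; ++step) {
    std::vector<std::size_t> held;
    for (std::size_t k = 0; k < dups_per_step; ++k) {
      std::size_t id = 0;
      if (!pool.dup(id))
        return step; // pool exhausted during this step
      held.push_back(id);
    }
    if (free_after_step) {
      for (std::size_t id : held)
        pool.release(id);
    }
  }
  return max_steps;
}
```

In this model, steps_until_exhausted(4095, 9, false, 10000) returns 455, which matches the step count in the report, while the same call with free_after_step set to true runs all 10000 steps. If the real run behaves like the leaking case, some communicator is presumably not being freed even though PCDestroy() is called.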
