Hi Satish and Junchao,

I just tried replacing every MPI_COMM_WORLD with PETSC_COMM_WORLD, but that did not fix the problem. One thing I find interesting: I ran with 40 ranks, but only 2 of them reported the communicator error. I think this means that at least the other 38 ranks freed their communicators properly.

Thanks!

Feimi
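
A toy model may help make the failure mode in the quoted report below concrete. It is only a sketch under stated assumptions: CommIdPool, dup(), release() and successful_dups() are hypothetical names, not MPI or PETSc calls, and a real MPI library manages its communicator IDs differently. The only number taken from the thread is the limit of 4095 IDs from the error message. With about 455 time steps before the failure, that limit corresponds to roughly 4095 / 455 = 9 unreleased duplications per step.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for an MPI library's table of communicator IDs. Not real MPI.
class CommIdPool {
public:
  explicit CommIdPool(int capacity) : capacity_(capacity) {}

  // Models a communicator duplication: returns an ID, or -1 once the
  // table is exhausted (the "Out of resources" case).
  int dup() {
    if (!recycled_.empty()) {
      int id = recycled_.back();
      recycled_.pop_back();
      return id;
    }
    if (next_id_ >= capacity_) return -1;
    return next_id_++;
  }

  // Models freeing a communicator: the ID becomes available again.
  void release(int id) {
    assert(id >= 0 && id < next_id_);
    recycled_.push_back(id);
  }

private:
  int capacity_;
  int next_id_ = 0;
  std::vector<int> recycled_;
};

// Counts how many of `calls` duplications succeed, with or without a
// matching release after each one.
int successful_dups(int capacity, int calls, bool release_after_use) {
  CommIdPool pool(capacity);
  int ok = 0;
  for (int i = 0; i < calls; ++i) {
    int id = pool.dup();
    if (id < 0) break;  // table exhausted
    ++ok;
    if (release_after_use) pool.release(id);
  }
  return ok;
}
```

In this model, successful_dups(4095, 10000, false) stops at 4095, while the same call with release_after_use set to true completes all 10000 duplications, which is the behavior one would hope for if every old matrix is destroyed.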

On 8/18/21 4:53 PM, Junchao Zhang wrote:
Hi, Feimi,
  I need to consult Jed (cc'ed).
  Jed, is this an example of https://lists.mcs.anl.gov/mailman/htdig/petsc-dev/2018-April/thread.html#22663? If Feimi really cannot free the matrices, then we just need to attach a hypre-comm to a PETSc inner comm and pass that to hypre.

--Junchao Zhang
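
The idea of attaching one hypre communicator to the inner communicator and reusing it can be sketched with a small reference-counted cache. This is an illustrative model only: InnerCommCache, SharedInner and dups_for_matrices() are hypothetical names, communicators are represented by plain integers, and nothing here is PETSc's or hypre's actual code. In real MPI code such a cache would typically be stored as a communicator attribute (MPI_Comm_create_keyval, MPI_Comm_set_attr, MPI_Comm_get_attr).

```cpp
#include <cassert>
#include <map>

// One shared duplicate of an outer communicator, plus a count of its users.
struct SharedInner {
  int inner_id;  // ID of the single duplicate shared by all users
  int refcount;  // number of live objects currently using it
};

class InnerCommCache {
public:
  // Returns the inner ID for `outer`, duplicating only on first use.
  int acquire(int outer) {
    auto it = cache_.find(outer);
    if (it == cache_.end()) {
      it = cache_.emplace(outer, SharedInner{next_id_++, 0}).first;
      ++dups_;
    }
    ++it->second.refcount;
    return it->second.inner_id;
  }

  // Drops one reference; the duplicate is discarded with its last user.
  void release(int outer) {
    auto it = cache_.find(outer);
    assert(it != cache_.end() && it->second.refcount > 0);
    if (--it->second.refcount == 0) cache_.erase(it);
  }

  int dups() const { return dups_; }  // duplications performed so far
  int live() const { return static_cast<int>(cache_.size()); }

private:
  std::map<int, SharedInner> cache_;
  int next_id_ = 0;
  int dups_ = 0;
};

// Creates `n` matrices on the same outer communicator without destroying
// any of them, and reports how many duplications that cost.
int dups_for_matrices(int n) {
  InnerCommCache cache;
  for (int i = 0; i < n; ++i) cache.acquire(/*outer=*/0);
  return cache.dups();
}
```

Under this scheme the number of duplications depends on the number of distinct outer communicators rather than on the number of matrices, so even matrices that are never destroyed would not exhaust the communicator IDs.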


On Wed, Aug 18, 2021 at 3:38 PM Satish Balay wrote:

    Is the communicator used to create PETSc objects MPI_COMM_WORLD?

    If so - try changing it to PETSC_COMM_WORLD

    Satish

     On Wed, 18 Aug 2021, Feimi Yu wrote:

    > Hi Junchao,
    >
    > Thank you for the suggestion! I'm using the deal.ii wrapper
    > dealii::PETScWrappers::PreconditionBase to handle the PETSc
    > preconditioners, and the wrapper destroys the preconditioner when it is
    > reinitialized or goes out of scope. I just double-checked that this is
    > called to make sure the old matrices are destroyed:
    >
    >    void
    >    PreconditionBase::clear()
    >    {
    >      matrix = nullptr;
    >
    >      if (pc != nullptr)
    >        {
    >          PetscErrorCode ierr = PCDestroy(&pc);
    >          pc                  = nullptr;
    >          AssertThrow(ierr == 0, ExcPETScError(ierr));
    >        }
    >    }
    >
    > Thanks!
    >
    > Feimi
    >
    > On 8/18/21 4:23 PM, Junchao Zhang wrote:
    > >
    > >
    > >
    > > On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu wrote:
    > >
    > >     Hi,
    > >
    > >     I was trying to run a simulation with a PETSc-wrapped Hypre
    > >     preconditioner, and encountered this problem:
    > >
    > >     [dcs122:133012] Out of resources: all 4095 communicator IDs have been used.
    > >     [19]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
    > >     [19]PETSC ERROR: General MPI error
    > >     [19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
    > >     [19]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
    > >     [19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
    > >     [19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed Aug 11 19:51:47 2021
    > >     [19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095 communicator IDs have been used.
    > >     [18]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
    > >     [18]PETSC ERROR: General MPI error
    > >     [18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
    > >     [18]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
    > >     [18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
    > >     [18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed Aug 11 19:51:47 2021
    > >     [18]PETSC ERROR: Configure options --download-scalapack --download-mumps --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 --with-debugging=0 --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
    > >     [18]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
    > >     [18]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
    > >     [18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
    > >     [18]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
    > >     [18]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
    > >     [18]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
    > >     Configure options --download-scalapack --download-mumps --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 --with-debugging=0 --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
    > >     [19]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
    > >     [19]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
    > >     [19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
    > >     [19]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
    > >     [19]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
    > >     [19]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
    > >
    > >     It seems that the MPI_Comm_dup() call at
    > >     petsc/src/mat/impls/hypre/mhypre.c:2120 caused the problem. Since
    > >     mine is a time-dependent problem, MatCreate_HYPRE() is called
    > >     every time a new system matrix is assembled. The above error
    > >     message is reported after ~4095 calls to MatCreate_HYPRE(), which
    > >     is around 455 time steps in my code. Here is some basic compiler
    > >     information:
    > >
    > > Can you destroy the old matrices to free their MPI communicators?
    > > Otherwise, you run into a limitation we already know about.
    > >
    > >     IBM Spectrum MPI 10.4.0
    > >
    > >     GCC 8.4.1
    > >
    > >     I've never had this problem with the OpenMPI or MPICH
    > >     implementations, so I was wondering whether this can be resolved
    > >     on my end or whether it is an implementation-specific problem.
    > >
    > >     Thanks!
    > >
    > >     Feimi
    > >
    >
>
