Hi Feimi, I need to consult Jed (cc'ed). Jed, is this an example of https://lists.mcs.anl.gov/mailman/htdig/petsc-dev/2018-April/thread.html#22663? If Feimi really cannot free the matrices, then we just need to attach a hypre comm to a PETSc inner comm and pass that to hypre.
--Junchao Zhang

On Wed, Aug 18, 2021 at 3:38 PM Satish Balay <[email protected]> wrote:

> Is the communicator used to create PETSc objects MPI_COMM_WORLD?
>
> If so - try changing it to PETSC_COMM_WORLD
>
> Satish
>
> On Wed, 18 Aug 2021, Feimi Yu wrote:
>
> > Hi Junchao,
> >
> > Thank you for the suggestion! I'm using the deal.ii wrapper
> > dealii::PETScWrappers::PreconditionBase to handle the PETSc preconditioners,
> > and the wrappers does the destroy when the preconditioner is reinitialized or
> > gets out of scope. I just double-checked, this is called to make sure the old
> > matrices are destroyed:
> >
> >     void
> >     PreconditionBase::clear()
> >     {
> >       matrix = nullptr;
> >
> >       if (pc != nullptr)
> >         {
> >           PetscErrorCode ierr = PCDestroy(&pc);
> >           pc = nullptr;
> >           AssertThrow(ierr == 0, ExcPETScError(ierr));
> >         }
> >     }
> >
> > Thanks!
> >
> > Feimi
> >
> > On 8/18/21 4:23 PM, Junchao Zhang wrote:
> > >
> > > On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu <[email protected]> wrote:
> > >
> > >     Hi,
> > >
> > >     I was trying to run a simulation with a PETSc-wrapped Hypre
> > >     preconditioner, and encountered this problem:
> > >
> > >     [dcs122:133012] Out of resources: all 4095 communicator IDs have been used.
> > >     [19]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > >     [19]PETSC ERROR: General MPI error
> > >     [19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
> > >     [19]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > >     [19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
> > >     [19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed Aug 11 19:51:47 2021
> > >     [19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095 communicator IDs have been used.
> > >     [18]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> > >     [18]PETSC ERROR: General MPI error
> > >     [18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
> > >     [18]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > >     [18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
> > >     [18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed Aug 11 19:51:47 2021
> > >     [18]PETSC ERROR: Configure options --download-scalapack --download-mumps --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 --with-debugging=0 --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
> > >     [18]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
> > >     [18]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
> > >     [18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
> > >     [18]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
> > >     [18]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
> > >     [18]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
> > >     Configure options --download-scalapack --download-mumps --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0 --with-debugging=0 --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
> > >     [19]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
> > >     [19]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
> > >     [19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
> > >     [19]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
> > >     [19]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
> > >     [19]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
> > >
> > >     It seems that MPI_Comm_dup() at
> > >     petsc/src/mat/impls/hypre/mhypre.c:2120 caused the problem. Since
> > >     mine is a time-dependent problem, MatCreate_HYPRE() is called
> > >     every time the new system matrix is assembled. The above error
> > >     message is reported after ~4095 calls of MatCreate_HYPRE(), which
> > >     is around 455 time steps in my code. Here is some basic compiler
> > >     information:
> > >
> > > Can you destroy old matrices to free MPI communicators? Otherwise, you run
> > > into a limitation we knew before.
> > >
> > >     IBM Spectrum MPI 10.4.0
> > >
> > >     GCC 8.4.1
> > >
> > >     I've never had this problem before with OpenMPI or MPICH
> > >     implementation, so I was wondering if this can be resolved from my
> > >     end, or it's an implementation specific problem.
> > >
> > >     Thanks!
> > >
> > >     Feimi
> > >
> >
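
For anyone following the thread, the numbers in the report can be sanity-checked with a small toy model. This is only a sketch under stated assumptions: it does not call MPI, PETSc, or hypre; `CommIdPool`, `Strategy`, and `steps_until_exhaustion` are hypothetical names made up for illustration; and the figure of 9 duplications per time step is inferred from the report (about 4095 calls over about 455 steps), not measured. It also ignores any IDs already held by predefined communicators. The three strategies roughly correspond to leaking every duplicate, freeing it when the matrix is destroyed, and reusing one cached duplicate, which is the spirit of attaching a hypre comm to the PETSc inner comm.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Toy stand-in for an MPI implementation's finite pool of communicator IDs.
// This is NOT MPI: the class and its methods are hypothetical.
class CommIdPool {
public:
  explicit CommIdPool(std::size_t capacity) : capacity_(capacity) {}

  // Models a communicator duplication: consumes one ID, fails if none remain.
  void dup() {
    if (in_use_ == capacity_)
      throw std::runtime_error("Out of resources: all communicator IDs used");
    ++in_use_;
  }

  // Models freeing a communicator: returns one ID to the pool.
  void free_id() {
    assert(in_use_ > 0);
    --in_use_;
  }

  std::size_t in_use() const { return in_use_; }

private:
  std::size_t capacity_;
  std::size_t in_use_ = 0;
};

enum class Strategy { LeakEveryDup, FreeAfterUse, ReuseCachedDup };

// Simulates up to max_steps time steps with dups_per_step matrix creations
// each, and returns how many steps complete before the pool is exhausted
// (max_steps if it never is).
std::size_t steps_until_exhaustion(Strategy strategy, std::size_t capacity,
                                   std::size_t dups_per_step,
                                   std::size_t max_steps) {
  CommIdPool pool(capacity);
  bool have_cached_dup = false;
  for (std::size_t step = 0; step < max_steps; ++step) {
    for (std::size_t i = 0; i < dups_per_step; ++i) {
      try {
        switch (strategy) {
        case Strategy::LeakEveryDup:   // duplicate, never free
          pool.dup();
          break;
        case Strategy::FreeAfterUse:   // duplicate, free on destroy
          pool.dup();
          pool.free_id();
          break;
        case Strategy::ReuseCachedDup: // duplicate once, then reuse
          if (!have_cached_dup) {
            pool.dup();
            have_cached_dup = true;
          }
          break;
        }
      } catch (const std::runtime_error &) {
        return step; // this step could not be completed
      }
    }
  }
  return max_steps;
}
```

Calling `steps_until_exhaustion(Strategy::LeakEveryDup, 4095, 9, 1000)` returns 455 in this model, consistent with the reported failure point, while the other two strategies run all 1000 steps without exhausting the pool.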
