On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu <[email protected]> wrote:

> Hi,
>
> I was trying to run a simulation with a PETSc-wrapped Hypre
> preconditioner, and encountered this problem:
>
> [dcs122:133012] Out of resources: all 4095 communicator IDs have been used.
> [19]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [19]PETSC ERROR: General MPI error
> [19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
> [19]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
> [19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed
> Aug 11 19:51:47 2021
> [19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095 communicator
> IDs have been used.
> [18]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [18]PETSC ERROR: General MPI error
> [18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
> [18]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
> [18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by CFSIfmyu Wed
> Aug 11 19:51:47 2021
> [18]PETSC ERROR: Configure options --download-scalapack --download-mumps
> --download-hypre --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90
> --with-cudac=0 --with-debugging=0
> --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
> [18]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
> [18]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
> [18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
> [18]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
> [18]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
> [18]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
> Configure options --download-scalapack --download-mumps --download-hypre
> --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0
> --with-debugging=0
> --with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
> [19]PETSC ERROR: #1 MatCreate_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
> [19]PETSC ERROR: #2 MatSetType() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
> [19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
> [19]PETSC ERROR: #4 MatConvert() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
> [19]PETSC ERROR: #5 PCSetUp_HYPRE() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
> [19]PETSC ERROR: #6 PCSetUp() at /gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
>
> It seems that MPI_Comm_dup() at petsc/src/mat/impls/hypre/mhypre.c:2120
> caused the problem. Since mine is a time-dependent problem,
> MatCreate_HYPRE() is called every time the new system matrix is assembled.
> The above error message is reported after ~4095 calls of MatCreate_HYPRE(),
> which is around 455 time steps in my code. Here is some basic compiler
> information:
>

Can you destroy the old matrices so that their MPI communicators are freed? Otherwise, you will run into this limitation, which we already know about.

> IBM Spectrum MPI 10.4.0
>
> GCC 8.4.1
>
> I've never had this problem before with OpenMPI or MPICH implementation,
> so I was wondering if this can be resolved from my end, or it's an
> implementation specific problem.
>
> Thanks!
>
> Feimi
>
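To illustrate why destroying the old matrices matters, here is a minimal toy model in plain C. It is not MPI or PETSc code: `CommPool`, `pool_dup()`, `pool_free()` and `simulate()` are hypothetical names that only stand in for a fixed pool of communicator IDs, for MPI_Comm_dup(), and for the MPI_Comm_free() that destroying a matrix would trigger. The figure of 9 duplications per time step is an assumption inferred from the numbers quoted above (4095 calls over roughly 455 steps); a real MPI library also reserves a few IDs for its own communicators, so the exact step would differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: a fixed pool of communicator IDs, NOT real MPI. */
enum { MAX_COMM_IDS = 4095 };   /* limit quoted in the error message */
enum { MAX_DUPS_PER_STEP = 64 };

typedef struct {
    bool in_use[MAX_COMM_IDS];
    int  used;
} CommPool;

void pool_init(CommPool *p) {
    for (int id = 0; id < MAX_COMM_IDS; ++id) p->in_use[id] = false;
    p->used = 0;
}

/* Hypothetical stand-in for MPI_Comm_dup(): returns a free ID,
 * or -1 when every ID in the pool is taken. */
int pool_dup(CommPool *p) {
    for (int id = 0; id < MAX_COMM_IDS; ++id) {
        if (!p->in_use[id]) {
            p->in_use[id] = true;
            p->used++;
            return id;
        }
    }
    return -1;
}

/* Hypothetical stand-in for MPI_Comm_free(): gives the ID back. */
void pool_free(CommPool *p, int id) {
    assert(id >= 0 && id < MAX_COMM_IDS && p->in_use[id]);
    p->in_use[id] = false;
    p->used--;
}

/* Run `steps` time steps, duplicating `dups_per_step` communicators in each.
 * With destroy_old set, the IDs taken in a step are released at its end.
 * Returns the 1-based step at which duplication first fails, or 0 if none. */
int simulate(int steps, int dups_per_step, bool destroy_old) {
    CommPool pool;
    int ids[MAX_DUPS_PER_STEP];

    assert(dups_per_step > 0 && dups_per_step <= MAX_DUPS_PER_STEP);
    pool_init(&pool);

    for (int step = 1; step <= steps; ++step) {
        for (int k = 0; k < dups_per_step; ++k) {
            ids[k] = pool_dup(&pool);
            if (ids[k] < 0) return step;   /* "all 4095 communicator IDs have been used" */
        }
        if (destroy_old) {
            for (int k = 0; k < dups_per_step; ++k) pool_free(&pool, ids[k]);
        }
    }
    return 0;
}
```

In this model, `simulate(1000, 9, false)` fills the pool during the first 455 steps and fails on the next one, while `simulate(1000, 9, true)` never fails because each step returns its IDs before the next one starts. Whether the real code can release its communicators this way depends on where the matrices are created and who owns them.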
