My previous message may have been misleading. To be clear, this problem
occurs even though the old matrices are destroyed.
Feimi
On 8/18/21 4:31 PM, Feimi Yu wrote:
Hi Junchao,
Thank you for the suggestion! I'm using the deal.II wrapper
dealii::PETScWrappers::PreconditionBase to handle the PETSc
preconditioners, and the wrapper destroys the preconditioner when it
is reinitialized or goes out of scope. I just double-checked: this
function is called to make sure the old matrices are destroyed:
void
PreconditionBase::clear()
{
  matrix = nullptr;
  if (pc != nullptr)
    {
      PetscErrorCode ierr = PCDestroy(&pc);
      pc = nullptr;
      AssertThrow(ierr == 0, ExcPETScError(ierr));
    }
}
Thanks!
Feimi
On 8/18/21 4:23 PM, Junchao Zhang wrote:
On Wed, Aug 18, 2021 at 12:52 PM Feimi Yu <[email protected]> wrote:
Hi,
I was trying to run a simulation with a PETSc-wrapped Hypre
preconditioner, and encountered this problem:
[dcs122:133012] Out of resources: all 4095 communicator IDs have
been used.
[19]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[19]PETSC ERROR: General MPI error
[19]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
[19]PETSC ERROR: See
https://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[19]PETSC ERROR: Petsc Release Version 3.15.2, unknown
[19]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by
CFSIfmyu Wed Aug 11 19:51:47 2021
[19]PETSC ERROR: [dcs122:133010] Out of resources: all 4095
communicator IDs have been used.
[18]PETSC ERROR: --------------------- Error Message
--------------------------------------------------------------
[18]PETSC ERROR: General MPI error
[18]PETSC ERROR: MPI error 17 MPI_ERR_INTERN: internal error
[18]PETSC ERROR: See
https://www.mcs.anl.gov/petsc/documentation/faq.html for
trouble shooting.
[18]PETSC ERROR: Petsc Release Version 3.15.2, unknown
[18]PETSC ERROR: ./main on a arch-linux-c-opt named dcs122 by
CFSIfmyu Wed Aug 11 19:51:47 2021
[18]PETSC ERROR: Configure options --download-scalapack
--download-mumps --download-hypre --with-cc=mpicc
--with-cxx=mpicxx --with-fc=mpif90 --with-cudac=0
--with-debugging=0
--with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
[18]PETSC ERROR: #1 MatCreate_HYPRE() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
[18]PETSC ERROR: #2 MatSetType() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
[18]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
[18]PETSC ERROR: #4 MatConvert() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
[18]PETSC ERROR: #5 PCSetUp_HYPRE() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
[18]PETSC ERROR: #6 PCSetUp() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
Configure options --download-scalapack --download-mumps
--download-hypre --with-cc=mpicc --with-cxx=mpicxx
--with-fc=mpif90 --with-cudac=0 --with-debugging=0
--with-blaslapack-dir=/gpfs/u/home/CFSI/CFSIfmyu/barn-shared/dcs-rh8/lapack-build/
[19]PETSC ERROR: #1 MatCreate_HYPRE() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:2120
[19]PETSC ERROR: #2 MatSetType() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matreg.c:91
[19]PETSC ERROR: #3 MatConvert_AIJ_HYPRE() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/impls/hypre/mhypre.c:392
[19]PETSC ERROR: #4 MatConvert() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/mat/interface/matrix.c:4439
[19]PETSC ERROR: #5 PCSetUp_HYPRE() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/impls/hypre/hypre.c:240
[19]PETSC ERROR: #6 PCSetUp() at
/gpfs/u/barn/CFSI/shared/dcs-rh8/petsc/src/ksp/pc/interface/precon.c:1015
It seems that the MPI_Comm_dup() call at
petsc/src/mat/impls/hypre/mhypre.c:2120 caused the problem. Since
mine is a time-dependent problem, MatCreate_HYPRE() is called
every time a new system matrix is assembled. The above error
message is reported after ~4095 calls to MatCreate_HYPRE() (about 9
per time step), which is around 455 time steps in my code. Here is
some basic compiler information:
Can you destroy old matrices to free MPI communicators? Otherwise,
you will run into a limitation we already know about.
IBM Spectrum MPI 10.4.0
GCC 8.4.1
I've never had this problem before with the OpenMPI or MPICH
implementations, so I was wondering whether this can be resolved from
my end, or whether it's an implementation-specific problem.
Thanks!
Feimi