Alexander, I reproduced the error with Intel MPI 2019.3.199 and can confirm that the cause is Intel MPI's MPI_Type_get_envelope(), which behaves incorrectly.
--Junchao Zhang

On Tue, May 12, 2020 at 3:45 PM Alexander Lindsay <[email protected]> wrote:

> Ok, this is good to know. Yea we'll probably just roll back then. Thanks!
>
> On Tue, May 12, 2020 at 12:45 PM Satish Balay <[email protected]> wrote:
>
>> On Tue, 12 May 2020, Matthew Knepley wrote:
>>
>> > On Tue, May 12, 2020 at 3:13 PM Alexander Lindsay <[email protected]>
>> > wrote:
>> >
>> > > The parallel make check target (ex19) fails with the error below after
>> > > configuring/building with intel 2019 mpi compilers
>> > > (mpiicc,mpiicpc,mpiifort). Any attempt to run valgrind or to attach to a
>> > > debugger fails with `mpiexec: Error: unknown option "-pmi_args"`. I've
>> > > attached configure.log. Does anyone have any ideas off the top of their
>> > > head? We're trying to link MOOSE with a project that refuses to use a
>> > > toolchain other than intel's. I'm currently trying to figure out whether
>> > > the MPI implementation matters (e.g. can I use mpich/openmpi), but for now
>> > > I'm operating under the assumption that I need to use the intel MPI
>> > > implementation.
>> > >
>> >
>> > There have been a _lot_ of bugs in the 2019 MPI for some reason. Is it at
>> > all possible to rollback?
>> >
>> > If not, is this somewhere we can run?
>>
>> We have this compiler/mpi [19u3] on our KNL box. I've had weird issues
>> with it - so we still use 18u2 on it.
>>
>> Satish
>>
>> >
>> > Thanks,
>> >
>> > Matt
>> >
>> >
>> > > lindad@lemhi2:/scratch/lindad/moose/petsc/src/snes/examples/tutorials((detached from 7c25e2d))$ mpiexec -np 2 ./ex19
>> > > lid velocity = 0.0625, prandtl # = 1., grashof # = 1.
>> > > [0]PETSC ERROR: ------------------------------------------------------------------------
>> > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>> > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> > > [0]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> > > [0]PETSC ERROR: likely location of problem given in stack below
>> > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
>> > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>> > > [0]PETSC ERROR: INSTEAD the line number of the start of the function
>> > > [0]PETSC ERROR: is given.
>> > > [0]PETSC ERROR: [0] MPIPetsc_Type_unwrap line 38 /scratch/lindad/moose/petsc/src/vec/is/sf/interface/sftype.c
>> > > [0]PETSC ERROR: [1]PETSC ERROR: ------------------------------------------------------------------------
>> > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>> > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> > > [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> > > [1]PETSC ERROR: likely location of problem given in stack below
>> > > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------
>> > > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>> > > [1]PETSC ERROR: INSTEAD the line number of the start of the function
>> > > [1]PETSC ERROR: is given.
>> > > [1]PETSC ERROR: [1] MPIPetsc_Type_unwrap line 38 /scratch/lindad/moose/petsc/src/vec/is/sf/interface/sftype.c
>> > > [1]PETSC ERROR: [0] MPIPetsc_Type_compare line 71 /scratch/lindad/moose/petsc/src/vec/is/sf/interface/sftype.c
>> > > [0]PETSC ERROR: [0] PetscSFPackGetInUse line 514 /scratch/lindad/moose/petsc/src/vec/is/sf/impls/basic/sfpack.c
>> > > [0]PETSC ERROR: [0] PetscSFBcastAndOpEnd_Basic line 305 /scratch/lindad/moose/petsc/src/vec/is/sf/impls/basic/sfbasic.c
>> > > [0]PETSC ERROR: [0] PetscSFBcastAndOpEnd line 1335 /scratch/lindad/moose/petsc/src/vec/is/sf/interface/sf.c
>> > > [0]PETSC ERROR: [0] VecScatterEnd_SF line 83 /scratch/lindad/moose/petsc/src/vec/vscat/impls/sf/vscatsf.c
>> > > [0]PETSC ERROR: [1] MPIPetsc_Type_compare line 71 /scratch/lindad/moose/petsc/src/vec/is/sf/interface/sftype.c
>> > > [1]PETSC ERROR: [1] PetscSFPackGetInUse line 514 /scratch/lindad/moose/petsc/src/vec/is/sf/impls/basic/sfpack.c
>> > > [0] VecScatterEnd line 145 /scratch/lindad/moose/petsc/src/vec/vscat/interface/vscatfce.c
>> > > [0]PETSC ERROR: [0] DMGlobalToLocalEnd_DA line 25 /scratch/lindad/moose/petsc/src/dm/impls/da/dagtol.c
>> > > [0]PETSC ERROR: [1]PETSC ERROR: [1] PetscSFBcastAndOpEnd_Basic line 305 /scratch/lindad/moose/petsc/src/vec/is/sf/impls/basic/sfbasic.c
>> > > [1]PETSC ERROR: [1] PetscSFBcastAndOpEnd line 1335 /scratch/lindad/moose/petsc/src/vec/is/sf/interface/sf.c
>> > > [0] DMGlobalToLocalEnd line 2368 /scratch/lindad/moose/petsc/src/dm/interface/dm.c
>> > > [0]PETSC ERROR: [0] SNESComputeFunction_DMDA line 67 /scratch/lindad/moose/petsc/src/snes/utils/dmdasnes.c
>> > > [0]PETSC ERROR: [1]PETSC ERROR: [1] VecScatterEnd_SF line 83 /scratch/lindad/moose/petsc/src/vec/vscat/impls/sf/vscatsf.c
>> > > [0] MatFDColoringApply_AIJ line 180 /scratch/lindad/moose/petsc/src/mat/impls/aij/mpi/fdmpiaij.c
>> > > [0]PETSC ERROR: [0] MatFDColoringApply line 610 /scratch/lindad/moose/petsc/src/mat/matfd/fdmatrix.c
>> > > [0]PETSC ERROR: [1]PETSC ERROR: [1] VecScatterEnd line 145 /scratch/lindad/moose/petsc/src/vec/vscat/interface/vscatfce.c
>> > > [1]PETSC ERROR: [1] DMGlobalToLocalEnd_DA line 25 /scratch/lindad/moose/petsc/src/dm/impls/da/dagtol.c
>> > > [0] SNESComputeJacobian_DMDA line 153 /scratch/lindad/moose/petsc/src/snes/utils/dmdasnes.c
>> > > [0]PETSC ERROR: [0] SNES user Jacobian function line 2678 /scratch/lindad/moose/petsc/src/snes/interface/snes.c
>> > > [0]PETSC ERROR: [1]PETSC ERROR: [1] DMGlobalToLocalEnd line 2368 /scratch/lindad/moose/petsc/src/dm/interface/dm.c
>> > > [1]PETSC ERROR: [1] SNESComputeFunction_DMDA line 67 /scratch/lindad/moose/petsc/src/snes/utils/dmdasnes.c
>> > > [0] SNESComputeJacobian line 2637 /scratch/lindad/moose/petsc/src/snes/interface/snes.c
>> > > [0]PETSC ERROR: [0] SNESSolve_NEWTONLS line 144 /scratch/lindad/moose/petsc/src/snes/impls/ls/ls.c
>> > > [0]PETSC ERROR: [1]PETSC ERROR: [1] MatFDColoringApply_AIJ line 180 /scratch/lindad/moose/petsc/src/mat/impls/aij/mpi/fdmpiaij.c
>> > > [1]PETSC ERROR: [1] MatFDColoringApply line 610 /scratch/lindad/moose/petsc/src/mat/matfd/fdmatrix.c
>> > > [1]PETSC ERROR: [1] SNESComputeJacobian_DMDA line 153 /scratch/lindad/moose/petsc/src/snes/utils/dmdasnes.c
>> > > [1]PETSC ERROR: [0] SNESSolve line 4366 /scratch/lindad/moose/petsc/src/snes/interface/snes.c
>> > > [0]PETSC ERROR: [0] main line 108 ex19.c
>> > > [1] SNES user Jacobian function line 2678 /scratch/lindad/moose/petsc/src/snes/interface/snes.c
>> > > [1]PETSC ERROR: [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> > > [1] SNESComputeJacobian line 2637 /scratch/lindad/moose/petsc/src/snes/interface/snes.c
>> > > [1]PETSC ERROR: [1] SNESSolve_NEWTONLS line 144 /scratch/lindad/moose/petsc/src/snes/impls/ls/ls.c
>> > > [1]PETSC ERROR: [0]PETSC ERROR: Signal received
>> > > [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> > > [0]PETSC ERROR: [1] SNESSolve line 4366 /scratch/lindad/moose/petsc/src/snes/interface/snes.c
>> > > [1]PETSC ERROR: [1] main line 108 ex19.c
>> > > Petsc Release Version 3.12.4, unknown
>> > > [0]PETSC ERROR: ./ex19 on a arch-moose named lemhi2 by lindad Tue May 12 12:54:11 2020
>> > > [0]PETSC ERROR: [1]PETSC ERROR: Configure options --download-hypre=1
>> > > --with-debugging=no --with-shared-libraries=1 --download-fblaslapack=1
>> > > --download-metis=1 --download-ptscotch=1 --download-parmetis=1
>> > > --download-superlu_dist=1 --download-mumps=1 --download-scalapack=1
>> > > --download-slepc=git://https://gitlab.com/slepc/slepc.git
>> > > --download-slepc-commit= 59ff81b --with-mpi=1 --with-cxx-dialect=C++11
>> > > --with-fortran-bindings=0 --with-sowing=0 --with-cc=mpiicc
>> > > --with-cxx=mpiicpc --with-fc=mpiifort --with-debugging=yes
>> > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file
>> > > --------------------- Error Message --------------------------------------------------------------
>> > > [1]PETSC ERROR: Signal received
>> > > [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> > > [1]PETSC ERROR: Petsc Release Version 3.12.4, unknown
>> > > [1]PETSC ERROR: Abort(59) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> > > ./ex19 on a arch-moose named lemhi2 by lindad Tue May 12 12:54:11 2020
>> > > [1]PETSC ERROR: Configure options --download-hypre=1 --with-debugging=no
>> > > --with-shared-libraries=1 --download-fblaslapack=1 --download-metis=1
>> > > --download-ptscotch=1 --download-parmetis=1 --download-superlu_dist=1
>> > > --download-mumps=1 --download-scalapack=1
>> > > --download-slepc=git://https://gitlab.com/slepc/slepc.git --download-slepc-commit= 59ff81b
>> > > --with-mpi=1 --with-cxx-dialect=C++11 --with-fortran-bindings=0
>> > > --with-sowing=0 --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort
>> > > --with-debugging=yes
>> > > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file
>> > > [0]PETSC ERROR: ------------------------------------------------------------------------
>> > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
>> > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> > > [0]PETSC ERROR: Abort(59) on node 1 (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1
>> > > [1]PETSC ERROR: ------------------------------------------------------------------------
>> > > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
>> > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> > > [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> > > [1]PETSC ERROR: likely location of problem given in stack below
>> > > [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> > > [0]PETSC ERROR: likely location of problem given in stack below
>> > > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
