This looks like an error in MUMPS:

      IF ( IROW_GRID .NE. root%MYROW .OR.
     &     JCOL_GRID .NE. root%MYCOL ) THEN
        WRITE(*,*) MYID,':INTERNAL Error: recvd root arrowhead '

On Mon, Apr 8, 2019 at 1:37 PM Smith, Barry F. via petsc-users <petsc-users@mcs.anl.gov> wrote:

> Difficult to tell what is going on.
>
> The message "User provided function() line 0 in unknown file" indicates
> that the crash took place OUTSIDE of PETSc code, and the error message
> "INTERNAL Error: recvd root arrowhead" is definitely not coming from PETSc.
>
> Yes, debug with the debug version and also try valgrind.
>
> Barry
>
>
> > On Apr 8, 2019, at 12:12 PM, Manav Bhatia via petsc-users <petsc-users@mcs.anl.gov> wrote:
> >
> > Hi,
> >
> > I am running a nonlinear simulation code using mesh refinement on
> > libMesh. The code runs without issues on a Mac (it can run for days),
> > but crashes on Linux (CentOS 6). I am using version 3.11 on Linux
> > with openmpi 3.1.3 and gcc 8.2.
> >
> > I tried to use -on_error_attach_debugger, but it only gave me the
> > message below. Does this message imply something to more experienced
> > eyes?
> >
> > I am going to try to build a debug version of petsc to figure out
> > what is going wrong. I will get and share more detailed logs in a bit.
> >
> > Regards,
> > Manav
> >
> > [8]PETSC ERROR: ------------------------------------------------------------------------
> > [8]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> > [8]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > [8]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > [8]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> > [8]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> > [8]PETSC ERROR: to get more information on the crash.
> > [8]PETSC ERROR: User provided function() line 0 in unknown file
> > PETSC: Attaching gdb to /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5 of pid 2108 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
> > PETSC: Attaching gdb to /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5 of pid 2112 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
> > 0 :INTERNAL Error: recvd root arrowhead
> > 0 :not belonging to me. IARR,JARR= 67525 67525
> > 0 :IROW_GRID,JCOL_GRID= 0 4
> > 0 :MYROW, MYCOL= 0 0
> > 0 :IPOSROOT,JPOSROOT= 92264688 92264688
> > --------------------------------------------------------------------------
> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> > with errorcode -99.
> >
> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> > You may or may not see output from other processes, depending on
> > exactly when Open MPI kills them.
> > --------------------------------------------------------------------------