Also, maybe run with -info dump and grep for MUMPS errors in dump.%p, because some failures are silent otherwise.
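
A minimal sketch of that search in Python, assuming the run was started with -info dump so that each rank wrote its own file named dump.<rank> in the working directory. The helper name find_mumps_lines and the plain case-insensitive match on "MUMPS" are illustrative assumptions, not something PETSc prescribes; the exact wording of the messages depends on the PETSc version.

```python
import glob
import re


def find_mumps_lines(pattern="dump.*", needle="MUMPS"):
    """Return (path, line_number, text) for every line that mentions `needle`.

    `pattern` is a glob for the per-rank files written by -info dump;
    both defaults are assumptions chosen for illustration.
    """
    regex = re.compile(re.escape(needle), re.IGNORECASE)
    hits = []
    for path in sorted(glob.glob(pattern)):
        # errors="replace" keeps the scan going if a file has odd bytes.
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if regex.search(line):
                    hits.append((path, lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for path, lineno, text in find_mumps_lines():
        print(f"{path}:{lineno}: {text}")
```

Running it in the directory that holds the dump files prints one line per match, prefixed with the file (and hence the rank) it came from.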

Thanks,
Pierre

> On 22 Mar 2021, at 7:09 PM, Matthew Knepley <[email protected]> wrote:
>
> On Mon, Mar 22, 2021 at 2:07 PM Chris Hewson <[email protected]> wrote:
> Hi Matt,
>
> No, we are running it without debugging in prod and then running debug I
> can't reproduce the error, from stderr we get:
>
> [1]PETSC ERROR: ------------------------------------------------------------------------
> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [1]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [1]PETSC ERROR: to get more information on the crash.
> [1]PETSC ERROR: Run with -malloc_debug to check if memory corruption is causing the crash.
> application called MPI_Abort(MPI_COMM_WORLD, 50176059) - process 1
>
> If you can afford it, running an instance with -on_error_attach_debugger so
> that if it fails we can get a stack trace, would be
> very valuable, since right now we do not know exactly what is failing.
>
> Thanks,
>
> Matt
>
> Chris Hewson
> Senior Reservoir Simulation Engineer
> ResFrac
> +1.587.575.9792
>
>
> On Mon, Mar 22, 2021 at 12:04 PM Matthew Knepley <[email protected]> wrote:
> On Mon, Mar 22, 2021 at 1:56 PM Chris Hewson <[email protected]> wrote:
> Hi All,
>
> I have been having a problem with MUMPS randomly crashing in our program and
> causing the entire program to crash. I am compiling in -O2 optimization mode
> and using --download-mumps etc. to compile PETSc. If I rerun the program,
> 95%+ of the time I can't reproduce the error. It seems to be a similar issue
> to this thread:
>
> https://lists.mcs.anl.gov/pipermail/petsc-users/2018-October/036372.html
>
> Similar to the resolution there I am going to try and increase icntl_14 and
> see if that resolves the issue. Any other thoughts on this?
>
> When it fails, do you get a stack trace?
>
> Thanks,
>
> Matt
>
> Thanks,
>
> Chris Hewson
> Senior Reservoir Simulation Engineer
> ResFrac
> +1.587.575.9792
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
