Edit PETSC_ARCH/include/petscconf.h and add

#if !defined(PETSC_MISSING_SIGTRAP)
#define PETSC_MISSING_SIGTRAP
#endif

then do

make gnumake

It is possible that the system you are using uses SIGTRAP in managing the IO; by making the change above you are telling PETSc to ignore SIGTRAPs. Let us know how this works out.

  Barry

> On Nov 27, 2015, at 1:05 PM, Fande Kong <[email protected]> wrote:
>
> Hi all,
>
> I implemented a parallel IO based on the Vec and IS which uses HDF5. I am
> testing this loader on a supercomputer. I occasionally (not always) encounter
> the following errors (using 8192 cores):
>
> [7689]PETSC ERROR: ------------------------------------------------------------------------
> [7689]PETSC ERROR: Caught signal number 5 TRAP
> [7689]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [7689]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [7689]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [7689]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [7689]PETSC ERROR: to get more information on the crash.
> [7689]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [7689]PETSC ERROR: Signal received
> [7689]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [7689]PETSC ERROR: Petsc Release Version 3.6.2, unknown
> [7689]PETSC ERROR: ./fsi on a arch-linux2-cxx-opt named ys6103 by fandek Fri Nov 27 11:26:30 2015
> [7689]PETSC ERROR: Configure options --with-clanguage=cxx --with-shared-libraries=1 --download-fblaslapack=1 --with-mpi=1 --download-parmetis=1 --download-metis=1 --with-netcdf=1 --download-exodusii=1 --with-hdf5-dir=/glade/apps/opt/hdf5-mpi/1.8.12/intel/12.1.5 --with-debugging=no --with-c2html=0 --with-64-bit-indices=1
> [7689]PETSC ERROR: #1 User provided function() line 0 in unknown file
> Abort(59) on node 7689 (rank 7689 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 7689
> ERROR: 0031-300 Forcing all remote tasks to exit due to exit code 1 in task 7689
>
> Make and configure logs are attached.
>
> Thanks,
>
> Fande Kong,
>
> <configure_log><make_log>
