I'm honestly stumped.

I have some PETSc code that essentially just populates a matrix in parallel, 
then writes it to a file.  All of my code that does floating point computation 
checks for NaNs and infinities, and none seem to show up.  However, when 
I run it on more than 4 cores, I get floating point exceptions that kill the 
program.  I tried turning off the exceptions from PETSc, but the program still 
dies from them, just without the PETSc error message.
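
For context, here is a minimal sketch of the kind of NaN/infinity check 
described above.  It is not the actual code: the helper names are made up for 
illustration, and std::complex<double> is an assumption based on the 
--with-scalar-type=complex configure option shown in the log below.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original code): true when both the real
// and imaginary parts are neither NaN nor +/-infinity.
bool is_finite_scalar(const std::complex<double>& z) {
    return std::isfinite(z.real()) && std::isfinite(z.imag());
}

// Returns the index of the first non-finite entry, or values.size() when
// every entry is finite.
std::size_t first_non_finite(const std::vector<std::complex<double>>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (!is_finite_scalar(values[i])) {
            return i;
        }
    }
    return values.size();
}
```

A check like this only sees values after they have been computed.  On Linux, 
signal 8 (SIGFPE) can also be raised by an integer division by zero, which a 
check of this kind would not catch.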

I honestly don't know where to go from here.  I suppose I should attach a 
debugger, but I'm not sure how to do that for multi-processor code.

Any ideas?  (Long error message below.)

-Andrew

[14]PETSC ERROR: 
------------------------------------------------------------------------
[14]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably 
divide by zero
[14]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[14]PETSC ERROR: or see 
http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[14]PETSC 
ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find 
memory corruption errors
[14]PETSC ERROR: likely location of problem given in stack below
[14]PETSC ERROR: ---------------------  Stack Frames 
------------------------------------
[14]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[14]PETSC ERROR:       INSTEAD the line number of the start of the function
[14]PETSC ERROR:       is given.
[14]PETSC ERROR: --------------------- Error Message 
------------------------------------
[14]PETSC ERROR: Signal received!
[14]PETSC ERROR: 
------------------------------------------------------------------------
[14]PE[15]PETSC ERROR: 
------------------------------------------------------------------------
[15]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably 
divide by zero
[15]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[15]PETSC ERROR: or see 
http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[15]PETSC 
ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find 
memory corruption errors
[15]PETSC ERROR: likely location of problem given in stack below
[15]PETSC ERROR: ---------------------  Stack Frames 
------------------------------------
[15]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[15]PETSC ERROR:       INSTEAD the line number of the start of the function
[15]PETSC ERROR:       is given.
[15]PETSC ERROR: --------------------- Error Message 
------------------------------------
[15]PETSC ERROR: Signal received!
[15]PETSC ERROR: 
------------------------------------------------------------------------
[15]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 
2012 
[14]PETSC ERROR: See docs/changes/index.html for recent updates.
[14]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[14]PETSC ERROR: See docs/index.html for manual pages.
[14]PETSC ERROR: 
------------------------------------------------------------------------
[14]PETSC ERROR: /home/becker/ansp6066/local/bin/finddme on a linux-gnu named 
photon9.colorado.edu by ansp6066 Fri Apr 27 18:01:55 2012
[14]PETSC ERROR: Libraries linked from 
/home/becker/ansp6066/local/petsc-3.2-p6/lib
[14]PETSC ERROR: Configure run at Mon Feb 27 11:17:14 2012
[14]PETSC ERROR: Configure options 
--prefix=/home/becker/ansp6066/local/petsc-3.2-p6 --with-c++-support 
--with-fortran --with-mpi-dir=/usr/local/mpich2 --with-shared-libraries=0 
--with-scalar-type=complex 
--with-blas-lapack-libs=/central/intel/mkl/lib/em64t/libmkl_core.a 
--with-clanguage=cxx
[14]PETSC ERROR: 
------------------------------------------------------------------------
[14]TSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 
2012 
[15]PETSC ERROR: See docs/changes/index.html for recent updates.
[15]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[15]PETSC ERROR: See docs/index.html for manual pages.
[15]PETSC ERROR: 
------------------------------------------------------------------------
[15]PETSC ERROR: /home/becker/ansp6066/local/bin/finddme on a linux-gnu named 
photon9.colorado.edu by ansp6066 Fri Apr 27 18:01:55 2012
[15]PETSC ERROR: Libraries linked from 
/home/becker/ansp6066/local/petsc-3.2-p6/lib
[15]PETSC ERROR: Configure run at Mon Feb 27 11:17:14 2012
[15]PETSC ERROR: Configure options 
--prefix=/home/becker/ansp6066/local/petsc-3.2-p6 --with-c++-support 
--with-fortran --with-mpi-dir=/usr/local/mpich2 --with-shared-libraries=0 
--with-scalar-type=complex 
--with-blas-lapack-libs=/central/intel/mkl/lib/em64t/libmkl_core.a 
--with-clanguage=cxx
[15]PETSC ERROR: 
------------------------------------------------------------------------
[15]PETSC ERROR: User provided function() line 0 in unknown directory unknown 
file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 14PETSC ERROR: User 
provided function() line 0 in unknown directory unknown file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 15[0]0:Return code = 
0, signaled with Interrupt
