Hi,

My code was found to be giving error answer in one of the cluster, even on single processor. No error msg was given. It used to be working fine.

I run the debug version and it gives the error msg:

[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 8 FPE: Floating Point Exception,probably divide by zero
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function
[0]PETSC ERROR:       is given.
[0]PETSC ERROR: [0] VecDot_Seq line 62 src/vec/vec/impls/seq/bvec1.c
[0]PETSC ERROR: [0] VecDot_MPI line 14 src/vec/vec/impls/mpi/pbvec.c
[0]PETSC ERROR: [0] VecDot line 118 src/vec/vec/interface/rvector.c
[0]PETSC ERROR: [0] KSPSolve_BCGS line 39 src/ksp/ksp/impls/bcgs/bcgs.c
[0]PETSC ERROR: [0] KSPSolve line 356 src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR: ------------------------------------------------------------------------

It happens after KSPSolve. There was no problem on other cluster. So how should I debug to find the error?

I tried to compare the input matrix and vector between different cluster but there are too many values.

--
Thank you

Yours sincerely,

TAY wee-beng

Reply via email to