Does the PETSc example src/vec/vec/examples/tutorials/ex1.c run correctly on 8+ processes?
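A minimal sketch of how this check, and the debugger option mentioned in this reply, might be run. PETSC_DIR and PETSC_ARCH are copied from the configure options in the quoted log. NP=12, the two function names, and the solver path passed as an argument are assumptions for illustration. The mpiexec location assumes the MPICH built by --download-mpich=1 was installed under $PETSC_DIR/$PETSC_ARCH.

```shell
# Values taken from the configure options in the quoted log.
PETSC_DIR=/home/dsz/pack/petsc-3.1-p8
PETSC_ARCH=linux-gnu-c-debug
export PETSC_DIR PETSC_ARCH

# Assumption: any process count above the 8 that still work.
NP=12

# Assumption: the MPICH built by --download-mpich=1 lives here; using its
# mpiexec keeps the launcher and the MPI library from the same install.
MPIEXEC="$PETSC_DIR/$PETSC_ARCH/bin/mpiexec"

# Build and run the PETSc vector tutorial on NP processes.
run_vec_example() {
    (cd "$PETSC_DIR/src/vec/vec/examples/tutorials" &&
        make ex1 &&
        "$MPIEXEC" -n "$NP" ./ex1)
}

# Run the application (path given as $1) and attach a debugger on error.
run_solver_in_debugger() {
    "$MPIEXEC" -n "$NP" "$1" -on_error_attach_debugger
}
```

Calling run_vec_example first, then run_solver_in_debugger with the path to the solver binary, separates the two questions: if ex1 also fails above 8 processes, the MPI installation is the more likely suspect than the application code.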
Are you sure the MPI shared libraries are the same on both systems? You can try the option -on_error_attach_debugger

   Barry

On Aug 5, 2011, at 4:41 PM, Dominik Szczerba wrote:

> I have a 2x6core. My solver works fine only on up to 8 processes,
> above that it always crashes with the below cited error. I did not yet
> valgrind etc. because I am in a desperate need to fix it quickly. I am
> just wondering what can potentially be the culprit.
>
> PS. I am not using MPI_Allreduce anywhere in my code.
>
> Many thanks for any hints,
> Dominik
>
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_9]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_1]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_7]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> INTERNAL ERROR: Invalid error class (66) encountered while returning from
> MPI_Allreduce. Please file a bug report. No error stack is available.
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_11]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 3 Quit: Some other process (or
> the batch system) has told this process to end
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to
> find memory corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below
> [0]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [0]PETSC ERROR: INSTEAD the line number of the start of the function
> [0]PETSC ERROR: is given.
> [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 462
> src/mat/impls/aij/mpi/mpiaij.c
> [0]PETSC ERROR: [0] MatAssemblyBegin line 4553 src/mat/interface/matrix.c
> [0]PETSC ERROR: [0] User provided functi[2]PETSC ERROR:
> ------------------------------------------------------------------------
> [2]PETSC ERROR: Caught signal number 3 Quit: Some other process (or
> the batch system) has told this process to end
> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [2]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[2]PETSC
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to
> find memory corruption errors
> [2]PETSC ERROR: likely location of problem given in stack below
> [2]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [2]PETSC ERROR: INSTEAD the line number of the start of the function
> [2]PETSC ERROR: is given.
> [2]PETSC ERROR: [2] VecAssemblyBegin line 157 src/vec/vec/interface/vector.c
> [2]PETSC ERROR: [2] User provided function line 160
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [INTERNAL ERROR: Invalid error class (66) encountered while returning from
> MPI_Allreduce. Please file a bug report. No error stack is available.
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_3]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> [4]PETSC ERROR:
> ------------------------------------------------------------------------
> [4]PETSC ERROR: Caught signal number 3 Quit: Some other process (or
> the batch system) has told this process to end
> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [4]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[4]PETSC
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to
> find memory corruption errors
> [4]PETSC ERROR: likely location of problem given in stack below
> [4]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [4]PETSC ERROR: INSTEAD the line number of the start of the function
> [4]PETSC ERROR: is given.
> [4]PETSC ERROR: [4] MatAssemblyBegin_MPIAIJ line 462
> src/mat/impls/aij/mpi/mpiaij.c
> [4]PETSC ERROR: [4] MatAssemblyBegin line 4553 src/mat/interface/matrix.c
> [4]PETSC ERROR: [4] User provided functiINTERNAL ERROR: Invalid error
> class (66) encountered while returning from
> MPI_Allreduce. Please file a bug report. No error stack is available.
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_5]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> [6]PETSC ERROR:
> ------------------------------------------------------------------------
> [6]PETSC ERROR: Caught signal number 3 Quit: Some other process (or
> the batch system) has told this process to end
> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [6]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[6]PETSC
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to
> find memory corruption errors
> [6]PETSC ERROR: likely location of problem given in stack below
> [6]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [6]PETSC ERROR: INSTEAD the line number of the start of the function
> [6]PETSC ERROR: is given.
> [6]PETSC ERROR: [6] MatAssemblyBegin_MPIAIJ line 462
> src/mat/impls/aij/mpi/mpiaij.c
> [6]PETSC ERROR: [6] MatAssemblyBegin line 4553 src/mat/interface/matrix.c
> [6]PETSC ERROR: [6] User provided functiINTERNAL ERROR: Invalid error
> class (66) encountered while returning from
> MPI_Allreduce. Please file a bug report. No error stack is available.
> Fatal error in MPI_Allreduce: Error message texts are not
> available[cli_8]: aborting job:
> Fatal error in MPI_Allreduce: Error message texts are not available
> [10]PETSC ERROR:
> ------------------------------------------------------------------------
> [10]PETSC ERROR: Caught signal number 3 Quit: Some other process (or
> the batch system) has told this process to end
> [10]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [10]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[10]PETSC
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to
> find memory corruption errors
> [10]PETSC ERROR: likely location of problem given in stack below
> [10]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> [10]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [10]PETSC ERROR: INSTEAD the line number of the start of the function
> [10]PETSC ERROR: is given.
> [10]PETSC ERROR: [10] VecAssemblyBegin line 157 src/vec/vec/interface/vector.c
> [10]PETSC ERROR: [10] User provided function line 160
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/on line 294
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [0]PETSC ERROR: [0] User provided function line 627
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Signal received!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
> 13:37:48 CDT 2011
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Unknown Name on a linux-gnu named nexo by dsz Sat Aug
> 6 00:35:58 2011
> [0]PETSC ERROR: Libraries linked from
> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
> [0]PETSC ERROR: Configure run at Sat Aug 6 00:02:58 2011
> [0]PETSC ERROR: Config2]PETSC ERROR: [2] User provided function line
> 294 "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [2]PETSC ERROR: [2] User provided function line 627
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [2]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [2]PETSC ERROR: Signal received!
> [2]PETSC ERROR:
> ------------------------------------------------------------------------
> [2]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
> 13:37:48 CDT 2011
> [2]PETSC ERROR: See docs/changes/index.html for recent updates.
> [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [2]PETSC ERROR: See docs/index.html for manual pages.
> [2]PETSC ERROR:
> ------------------------------------------------------------------------
> [2]PETSC ERROR: Unknown Name on a linux-gnu named nexo by dsz Sat Aug
> 6 00:35:58 2011
> [2]PETSC ERROR: Libraries linked from
> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
> [2]PETSC ERROR: Configure run at Sat Aug on line 294
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [4]PETSC ERROR: [4] User provided function line 627
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [4]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [4]PETSC ERROR: Signal received!
> [4]PETSC ERROR:
> ------------------------------------------------------------------------
> [4]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
> 13:37:48 CDT 2011
> [4]PETSC ERROR: See docs/changes/index.html for recent updates.
> [4]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [4]PETSC ERROR: See docs/index.html for manual pages.
> [4]PETSC ERROR:
> ------------------------------------------------------------------------
> [4]PETSC ERROR: Unknown Name on a linux-gnu named nexo by dsz Sat Aug
> 6 00:35:58 2011
> [4]PETSC ERROR: Libraries linked from
> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
> [4]PETSC ERROR: Configure run at Sat Aug 6 00:02:58 2011
> [4]PETSC ERROR: Configon line 294
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [6]PETSC ERROR: [6] User provided function line 627
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [6]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [6]PETSC ERROR: Signal received!
> [6]PETSC ERROR:
> ------------------------------------------------------------------------
> [6]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
> 13:37:48 CDT 2011
> [6]PETSC ERROR: See docs/changes/index.html for recent updates.
> [6]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [6]PETSC ERROR: See docs/index.html for manual pages.
> [6]PETSC ERROR:
> ------------------------------------------------------------------------
> [6]PETSC ERROR: Unknown Name on a linux-gnu named nexo by dsz Sat Aug
> 6 00:35:58 2011
> [6]PETSC ERROR: Libraries linked from
> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
> [6]PETSC ERROR: Configure run at Sat Aug 6 00:02:58 2011
> [6]PETSC ERROR: ConfigSM3T4mpi.cxx
> [10]PETSC ERROR: [10] User provided function line 294
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [10]PETSC ERROR: [10] User provided function line 627
> "unknowndirectory/"/home/dsz/src/framework/trunk/solve/SM3T4mpi.cxx
> [10]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [10]PETSC ERROR: Signal received!
> [10]PETSC ERROR:
> ------------------------------------------------------------------------
> [10]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17
> 13:37:48 CDT 2011
> [10]PETSC ERROR: See docs/changes/index.html for recent updates.
> [10]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [10]PETSC ERROR: See docs/index.html for manual pages.
> [10]PETSC ERROR:
> ------------------------------------------------------------------------
> [10]PETSC ERROR: Unknown Name on a linux-gnu named nexo by dsz Sat Aug
> 6 00:35:58 2011
> [10]PETSC ERROR: Libraries linked from
> /home/dsz/pack/petsc-3.1-p8/linux-gnu-c-debug/lib
> [10]PETSC ERRure options PETSC_DIR=/home/dsz/pack/petsc-3.1-p8
> PETSC_ARCH=linux-gnu-c-debug --download-f-blas-lapack=1
> --download-mpich=1 --download-hypre=1 --with-parmetis=1
> --download-parmetis=1 --with-x=0 --with-debugging=1
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0[cli_0]:
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> 6 00:02:58 2011
> [2]PETSC ERROR: Configure options
> PETSC_DIR=/home/dsz/pack/petsc-3.1-p8 PETSC_ARCH=linux-gnu-c-debug
> --download-f-blas-lapack=1 --download-mpich=1 --download-hypre=1
> --with-parmetis=1 --download-parmetis=1 --with-x=0 --with-debugging=1
> [2]PETSC ERROR:
> ------------------------------------------------------------------------
> [2]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2[cli_2]:
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2
> ure options PETSC_DIR=/home/dsz/pack/petsc-3.1-p8
> PETSC_ARCH=linux-gnu-c-debug --download-f-blas-lapack=1
> --download-mpich=1 --download-hypre=1 --with-parmetis=1
> --download-parmetis=1 --with-x=0 --with-debugging=1
> [4]PETSC ERROR:
> ------------------------------------------------------------------------
> [4]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 4[cli_4]:
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 4
> ure options PETSC_DIR=/home/dsz/pack/petsc-3.1-p8
> PETSC_ARCH=linux-gnu-c-debug --download-f-blas-lapack=1
> --download-mpich=1 --download-hypre=1 --with-parmetis=1
> --download-parmetis=1 --with-x=0 --with-debugging=1
> [6]PETSC ERROR:
> ------------------------------------------------------------------------
> [6]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 6[cli_6]:
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 6
> OR: Configure run at Sat Aug 6 00:02:58 2011
> [10]PETSC ERROR: Configure options
> PETSC_DIR=/home/dsz/pack/petsc-3.1-p8 PETSC_ARCH=linux-gnu-c-debug
> --download-f-blas-lapack=1 --download-mpich=1 --download-hypre=1
> --with-parmetis=1 --download-parmetis=1 --with-x=0 --with-debugging=1
> [10]PETSC ERROR:
> ------------------------------------------------------------------------
> [10]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 10[cli_10]:
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 10
