Dear all, I am solving with cholesky finite element problems where I have to solve multiple times. To improve the performance I use KSPSetOperators and set the option SAME_NONZERO_PATTERN because the matrix structure is always the same in successive solves. This works great as long I use more than one processor. When I use MUMPS, only one processor and a matrix made with MATSEQSBAIJ the solving segfaults in the second solve. When I use spooles or use LU with AIJ it also works, but I need MUMPS as MatGetInertia only works with MUMPS with multiple processors.
The segfault: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] MatFactorNumeric_MUMPS line 642 src/mat/impls/aij/mpi/mumps/mumps.c [0]PETSC ERROR: [0] MatCholeskyFactorNumeric line 3037 src/mat/interface/matrix.c [0]PETSC ERROR: [0] PCSetUp_Cholesky line 92 src/ksp/pc/impls/factor/cholesky/cholesky.c [0]PETSC ERROR: [0] PCSetUp line 797 src/ksp/pc/interface/precon.c [0]PETSC ERROR: [0] KSPSetUp line 184 src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: [0] KSPSolve line 331 src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 6, Wed Jan 11 09:28:45 CST 2012 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./icona on a linux-gnu named sunray.dps.uibk.ac.at by manfred Thu Mar 1 10:06:47 2012 [0]PETSC ERROR: Libraries linked from /home/manfred/petsc-3.2-p6/linux-gnu-c/lib [0]PETSC ERROR: Configure run at Thu Mar 1 09:11:08 2012 [0]PETSC ERROR: Configure options --with-cc=/home/manfred/gcc-4.6.2/bin/gcc --download-mpich=1 --with-blas-lapack=1 --download-f-blas-lapack=1 PETSC_ARCH=linux-gnu-c --with-clanguage=cxx --with-cxx=/home/manfred/gcc-4.6.2/bin/g++ --with-fc=/home/manfred/gcc-4.6.2/bin/gfortran --with-scalapack=1 --download-scalapack=1 --with-blacs=1 --download-blacs=1 --with-parmetis=1 --download-parmetis=1 --with-spooles=1 --download-spooles=1 --with-mumps=1 --download-mumps=1 --with-blopex=1 --download-blopex=1 --with-debugging=1 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 with valgrind I get the error ==6853== Invalid read of size 8 ==6853== at 0x11CDDAA: dmumps_148_ (dmumps_part1.F:1299) ==6853== by 0x111093C: dmumps_142_ (dmumps_part5.F:5094) ==6853== by 0x11C9754: dmumps_ (dmumps_part1.F:642) ==6853== by 0x10B4DDB: dmumps_f77_ (dmumps_part3.F:6651) ==6853== by 0x108E0FF: dmumps_c (mumps_c.c:422) ==6853== by 0xAE6EB3: MatFactorNumeric_MUMPS(_p_Mat*, _p_Mat*, MatFactorInfo const*) (mumps.c:667) ==6853== by 0x9832D7: MatCholeskyFactorNumeric(_p_Mat*, _p_Mat*, MatFactorInfo const*) (matrix.c:3050) ==6853== by 0xDFFE2D: PCSetUp_Cholesky(_p_PC*) (cholesky.c:157) ==6853== by 0xD7B390: PCSetUp(_p_PC*) (precon.c:819) ==6853== by 0xE47BD2: KSPSetUp(_p_KSP*) (itfunc.c:260) ==6853== by 0xE48C47: KSPSolve(_p_KSP*, _p_Vec*, _p_Vec*) (itfunc.c:379) ==6853== by 0x4914E0: Task::runCalc() (Task.cpp:1204) Is this a bug or doing I am something wrong? Should I go to MUMPS with this error? I have currently a workaround to simply not use SAME_NONZERO_PATTERN if using only one processor but as a result it is much slower in successive solves. Thank you in advance Manfred Gratt
