Yes, I have found an error in my matrix... Thank you all for the useful hints! Still, I wonder whether there are more efficient ways to set up bug traps that give a backtrace leading to the real problem rather than to the innocent parts... <sigh/>.
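One possible trap of this kind, sketched under assumptions: validate the matrix structure right where it is assembled, so that a malformed matrix is reported by the code that built it rather than deep inside the preconditioner. The thread does not say what the matrix error actually was, so the checks below are generic. `checkCsr` is a hypothetical standalone helper working on plain CSR arrays; it is not a PETSc or hypre function.

```cpp
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of PETSc or hypre): checks that three CSR
// arrays describe a structurally valid n_rows x n_cols matrix.
// Returns an empty string if no problem is found, otherwise a short
// description of the first problem encountered.
std::string checkCsr(std::size_t n_rows, std::size_t n_cols,
                     const std::vector<int>& row_ptr,
                     const std::vector<int>& col_idx,
                     const std::vector<double>& values)
{
    if (row_ptr.size() != n_rows + 1)
        return "row_ptr must have n_rows + 1 entries";
    if (row_ptr.front() != 0)
        return "row_ptr[0] must be 0";
    if (col_idx.size() != values.size())
        return "col_idx and values differ in length";

    // Row offsets must never decrease, otherwise a row has negative length.
    for (std::size_t i = 0; i < n_rows; ++i) {
        if (row_ptr[i + 1] < row_ptr[i])
            return "row_ptr decreases at row " + std::to_string(i);
    }
    if (static_cast<std::size_t>(row_ptr.back()) != col_idx.size())
        return "row_ptr.back() does not match the number of stored entries";

    // Every stored entry needs a column inside the matrix and a finite value.
    for (std::size_t i = 0; i < n_rows; ++i) {
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            if (col_idx[k] < 0 ||
                static_cast<std::size_t>(col_idx[k]) >= n_cols)
                return "column index out of range in row " + std::to_string(i);
            if (!std::isfinite(values[k]))
                return "non-finite value in row " + std::to_string(i);
        }
    }
    return "";
}
```

Calling such a check right after assembly and aborting on a non-empty result would point at the assembling code instead of at the solver. It complements valgrind rather than replacing it.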
With regards,
Dominik

Barry Smith wrote:
> If you run without hypre preconditioner but use instead, say
> bjacobi under valgrind do you get any valgrind errors?
>
> The problem you are having could be due to (1) some memory
> corruption in your code that is messing up hypre or (2) some bug in
> hypre that we don't see with our simple test codes.
>
> Barry
>
> On Nov 14, 2009, at 4:41 PM, Dominik Szczerba wrote:
>
>> No I am using Hypre built automatically along with petsc...
>> I will try ex10, thanks...
>>
>> Matthew Knepley wrote:
>>> This is already bad. You had an Invalid Read and Invalid Write in
>>> your Hypre. Did you build it yourself? If so, let us build it.
>>> If not, please try your matrix on KSP ex10 and see if you get a
>>> crash on 2 procs.
>>> Thanks,
>>> Matt
>>> On Sat, Nov 14, 2009 at 3:51 PM, Dominik Szczerba
>>> <dominik at itis.ethz.ch> wrote:
>>> run only in single, he says things like below - but does not
>>> crash.
>>> Also, the program run with -np 1 does not crash. No clear idea
>>> though about valgrind's output, please advise if this tells you
>>> anything...
>>> Call from NS3T10::createSolverContexts() referenced therein is:
>>> ierr = KSPCreate(petsc_comm,&kspSchurVelocity);CHKERRQ(ierr);
>>> ==2605== Conditional jump or move depends on uninitialised value(s)
>>> ==2605==    at 0x8AE720F: hypre_BoomerAMGSetPlotFileName (par_amg.c:2115)
>>> ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>> ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31)
>>> ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>> ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>> ==2605==    by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980)
>>> ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>> ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>> ==2605==
>>> ==2605== Conditional jump or move depends on uninitialised value(s)
>>> ==2605==    at 0x8AE7244: hypre_BoomerAMGSetPlotFileName (par_amg.c:2120)
>>> ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>> ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31)
>>> ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>> ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>> ==2605==    by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980)
>>> ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>> ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>> ==2605==
>>> ==2605== Conditional jump or move depends on uninitialised value(s)
>>> ==2605==    at 0x4025C16: strcpy (mc_replace_strmem.c:303)
>>> ==2605==    by 0x8AE727A: hypre_BoomerAMGSetPlotFileName (par_amg.c:2123)
>>> ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>> ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31)
>>> ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>> ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>> ==2605==    by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980)
>>> ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>> ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>> ==2605==
>>> ==2605== Conditional jump or move depends on uninitialised value(s)
>>> ==2605==    at 0x4025C35: strcpy (mc_replace_strmem.c:303)
>>> ==2605==    by 0x8AE727A: hypre_BoomerAMGSetPlotFileName (par_amg.c:2123)
>>> ==2605==    by 0x8AE7ED9: hypre_BoomerAMGCreate (par_amg.c:276)
>>> ==2605==    by 0x8AE4A71: HYPRE_BoomerAMGCreate (HYPRE_parcsr_amg.c:31)
>>> ==2605==    by 0x8562019: PCHYPRESetType_HYPRE (hypre.c:850)
>>> ==2605==    by 0x8563068: PCHYPRESetType (hypre.c:964)
>>> ==2605==    by 0x80E67BB: NS3T10::createSolverContexts() (NS3T10mpi.cxx:1980)
>>> ==2605==    by 0x80EA63B: NS3T10::solve() (NS3T10mpi.cxx:2306)
>>> ==2605==    by 0x8104860: main (ns3t10mpi_main.cxx:1516)
>>> ==2605==
>>> Solver contexts created in 2.520000 s
>>> Starting KSPSolve (0/1)
>>> 0 KSP Residual norm 8.368803253774e-06
>>> ==2605== Invalid read of size 8
>>> ==2605==    at 0x8B23B5A: hypre_BoomerAMGCreateS (par_strength.c:223)
>>> ==2605==    by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630)
>>> ==2605==    by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58)
>>> ==2605==    by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134)
>>> ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>> ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>> ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>> ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741)
>>> ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==    by 0x863AC4C: KSPInitialResidual (itres.c:64)
>>> ==2605==    by 0x85EB09A: KSPSolve_GMRES (gmres.c:241)
>>> ==2605==  Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd
>>> ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>> ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>> ==2605==    by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91)
>>> ==2605==    by 0x8B32EC8: hypre_ParCSRMatrixInitialize (par_csr_matrix.c:200)
>>> ==2605==    by 0x8AE0C44: hypre_IJMatrixInitializeParCSR (IJMatrix_parcsr.c:272)
>>> ==2605==    by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302)
>>> ==2605==    by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174)
>>> ==2605==    by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131)
>>> ==2605==    by 0x855A445: PCSetUp_HYPRE (hypre.c:130)
>>> ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>> ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>> ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>> ==2605==
>>> ==2605== Invalid write of size 4
>>> ==2605==    at 0x8B23E0C: hypre_BoomerAMGCreateS (par_strength.c:301)
>>> ==2605==    by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630)
>>> ==2605==    by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58)
>>> ==2605==    by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134)
>>> ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>> ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>> ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>> ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741)
>>> ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==    by 0x863AC4C: KSPInitialResidual (itres.c:64)
>>> ==2605==    by 0x85EB09A: KSPSolve_GMRES (gmres.c:241)
>>> ==2605==  Address 0xb12a050 is 0 bytes after a block of size 46,744 alloc'd
>>> ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>> ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>> ==2605==    by 0x8B23980: hypre_BoomerAMGCreateS (par_strength.c:163)
>>> ==2605==    by 0x8AE966F: hypre_BoomerAMGSetup (par_amg_setup.c:630)
>>> ==2605==    by 0x8AE4A4D: HYPRE_BoomerAMGSetup (HYPRE_parcsr_amg.c:58)
>>> ==2605==    by 0x855A5D9: PCSetUp_HYPRE (hypre.c:134)
>>> ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>> ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>> ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>> ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741)
>>> ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==
>>> ...
>>> ==2605== Invalid read of size 8
>>> ==2605==    at 0x8B1ACE8: hypre_BoomerAMGRelax (par_relax.c:182)
>>> ==2605==    by 0x8B1DFBF: hypre_BoomerAMGRelaxIF (par_relax_interface.c:110)
>>> ==2605==    by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386)
>>> ==2605==    by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252)
>>> ==2605==    by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76)
>>> ==2605==    by 0x855AAA4: PCApply_HYPRE (hypre.c:172)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==    by 0x8606095: KSPSolve_PREONLY (preonly.c:29)
>>> ==2605==    by 0x85A85D3: KSPSolve (itfunc.c:385)
>>> ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741)
>>> ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==  Address 0xafae5d0 is 0 bytes after a block of size 93,488 alloc'd
>>> ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>> ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>> ==2605==    by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91)
>>> ==2605==    by 0x8B32EC8: hypre_ParCSRMatrixInitialize (par_csr_matrix.c:200)
>>> ==2605==    by 0x8AE0C44: hypre_IJMatrixInitializeParCSR (IJMatrix_parcsr.c:272)
>>> ==2605==    by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302)
>>> ==2605==    by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174)
>>> ==2605==    by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131)
>>> ==2605==    by 0x855A445: PCSetUp_HYPRE (hypre.c:130)
>>> ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>> ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>> ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>> ==2605==
>>> ...
>>> 0 KSP Residual norm 8.368803253774e-06
>>> ==2605== Invalid read of size 8
>>> ==2605==    at 0x8B1ADC0: hypre_BoomerAMGRelax (par_relax.c:196)
>>> ==2605==    by 0x8B1DFBF: hypre_BoomerAMGRelaxIF (par_relax_interface.c:110)
>>> ==2605==    by 0x8AFC310: hypre_BoomerAMGCycle (par_cycle.c:386)
>>> ==2605==    by 0x8AEE09E: hypre_BoomerAMGSolve (par_amg_solve.c:252)
>>> ==2605==    by 0x8AE4A25: HYPRE_BoomerAMGSolve (HYPRE_parcsr_amg.c:76)
>>> ==2605==    by 0x855AAA4: PCApply_HYPRE (hypre.c:172)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==    by 0x8606095: KSPSolve_PREONLY (preonly.c:29)
>>> ==2605==    by 0x85A85D3: KSPSolve (itfunc.c:385)
>>> ==2605==    by 0x80F5B16: applyPrecSchur(void*, _p_Vec*, _p_Vec*) (NS3T10mpi.cxx:3741)
>>> ==2605==    by 0x851C47E: PCApply_Shell (shellpc.c:129)
>>> ==2605==    by 0x862074E: PCApply (precon.c:357)
>>> ==2605==  Address 0xcded820 is 0 bytes after a block of size 93,488 alloc'd
>>> ==2605==    at 0x4023F5B: calloc (vg_replace_malloc.c:418)
>>> ==2605==    by 0x8B4E9C7: hypre_CAlloc (hypre_memory.c:121)
>>> ==2605==    by 0x8B4CA67: hypre_CSRMatrixInitialize (csr_matrix.c:91)
>>> ==2605==    by 0x8B32EC8: hypre_ParCSRMatrixInitialize (par_csr_matrix.c:200)
>>> ==2605==    by 0x8AE0C44: hypre_IJMatrixInitializeParCSR (IJMatrix_parcsr.c:272)
>>> ==2605==    by 0x8ADBE09: HYPRE_IJMatrixInitialize (HYPRE_IJMatrix.c:302)
>>> ==2605==    by 0x891AD3A: MatHYPRE_IJMatrixFastCopy_SeqAIJ (mhyp.c:174)
>>> ==2605==    by 0x891A2E1: MatHYPRE_IJMatrixCopy (mhyp.c:131)
>>> ==2605==    by 0x855A445: PCSetUp_HYPRE (hypre.c:130)
>>> ==2605==    by 0x86256A9: PCSetUp (precon.c:794)
>>> ==2605==    by 0x85A6E62: KSPSetUp (itfunc.c:237)
>>> ==2605==    by 0x85A7EAB: KSPSolve (itfunc.c:353)
>>> ==2605==
>>> Matthew Knepley wrote:
>>> Try valgrind.
>>> Matt
>>> On Sat, Nov 14, 2009 at 3:32 PM, Dominik Szczerba
>>> <dominik at itis.ethz.ch> wrote:
>>> Now for something more serious: I get a crash like this one:
>>> Starting KSPSolve (1/2)
>>> 0 KSP Residual norm 2.964538623545e-06
>>> *** glibc detected *** /home/domel/build/solve-debug/ns3t10mpi: malloc(): memory corruption: 0x09258008 ***
>>> ======= Backtrace: =========
>>> /lib/tls/i686/cmov/libc.so.6[0x5f9ff1]
>>> /lib/tls/i686/cmov/libc.so.6[0x5fcbb3]
>>> /lib/tls/i686/cmov/libc.so.6(__libc_calloc+0xa9)[0x5fe009]
>>> /home/domel/build/solve-debug/ns3t10mpi(hypre_CAlloc+0x2c)[0x8b4ea28]
>>> /home/domel/build/solve-debug/ns3t10mpi(hypre_BoomerAMGCoarsenRuge+0xb5)[0x8af2c7b]
>>> (and so on)
>>> gdb invoked as:
>>> mpiexec -np 2 ..... -on_error_attach_debugger -display :0.0
>>> does not display any backtrace after the crash.
>>> Any hints how to debug are highly appreciated.
>>> Dominik
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to
>>> which their experiments lead.
>>> -- Norbert Wiener
