matsetsizes() is used practically everywhere - so I can't believe there is a bug there. [the fortran interface code autogenerated by bfort].
Do you see the error with the 'first' call to this routine - or some subsequent loop? Perhaps you can try using valgrind equivalent on windows to see if there is memory corruption somewhere else. http://stackoverflow.com/questions/413477/is-there-a-good-valgrind-substitute-for-windows suggests using http://code.google.com/p/drmemory/ Satish On Thu, 21 Nov 2013, Einar Sørheim wrote: > I am on the windows platform so unfortunatley no valgrind, however have > been able to track down the problem in the debugger and it seems like a > matrixpointer gets corrupted when using the fortran-c interface in petsc. > At some point in the calculation the matrix pointer in MatSetSize gets > corrupted when called from fortran, propably in this cast: > > void PETSC_STDCALL matsetsizes_(Mat A,PetscInt *m,PetscInt *n,PetscInt > *M,PetscInt *N, int *__ierr ){ > *__ierr = MatSetSizes( > (Mat)PetscToPointer((A) ),*m,*n,*M,*N); > } > when entering the MatSetSize in the debugger the A-matrix pointer is > corrupt and I get an access violation on code line: > PetscValidHeaderSpecific(A,MAT_CLASSID,1); > Any ideas what may be the cause of this? > Regards > Einar > > > 2013/10/24 Barry Smith <[email protected]> > > > > > So apparently no memory leak. > > > > If you are still getting crashes you will need to run the case that > > crashes 1) with valgrind if you haven’t or 2) using > > -on_error_attach_debugger or -start_in_debugger to track down the cause of > > the crash. > > > > Barry > > > > > > > > On Oct 24, 2013, at 2:09 AM, Einar Sørheim <[email protected]> > > wrote: > > > > > Well i I did a rerun without -malloc_log, this is what I got (I have > > also attached the log-sumary file): > > > > > > [0] PetscFinalize(): PetscFinalize() called > > > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > > -2080374784 max tags = 2147483647 > > > Summary of Memory Usage in PETSc > > > [0]Current space PetscMalloc()ed 45936, max space PetscMalloced() > > 1.13728e+009 > > > [0]OS cannot compute process memory > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > > -2080374784 > > > [0] PetscGetHostName(): Rejecting domainname, likely is NIS CMP3. > > > [0] PetscGetHostName(): Rejecting domainname, likely is NIS CMP3. > > > [0] PetscFOpen(): Opening file Log.0 > > > [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm > > -2080374784 > > > [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator > > embedded in a user MPI_Comm -2080374784 > > > [0] Petsc_DelComm_Outer(): User MPI_Comm 1140850688 is being freed after > > removing reference from inner PETSc comm to this outer comm > > > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > > > [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm > > -2080374784 > > > [0] Petsc_DelThreadComm(): Deleting thread communicator data in an > > MPI_Comm -2080374784 > > > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > > > > > > > > > > > > 2013/10/23 Barry Smith <[email protected]> > > > > > > Don’t run with the -malloc_log Mark did not suggest using that option. > > > > > > > > > On Oct 23, 2013, at 8:48 AM, Einar Sørheim <[email protected]> > > wrote: > > > > > > > I tried running with the following options : > > > > -info > > > > -malloc > > > > -malloc_log > > > > -malloc_dump > > > > -malloc_debug > > > > -memory_info > > > > -log_summary log_summary.txt > > > > -log > > > > but didn't get a lot wiser however in my case would it be more > > efficient to keep the ksp solver object, if so should I have one object eg. > > for the thermal problem and another one for the mechanical? > > > > The output from the end of the run: > > > > Total wall clock time actually solving: 856.231474 > > > > [0] PetscFinalize(): PetscFinalize() called > > > > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 > > -2080374784 max tags = 2147483647 > > > > Summary of Memory Usage in PETSc > > > > [0]Current space PetscMalloc()ed 45936, max space PetscMalloced() > > 1.13728e+009 > > > > [0]OS cannot compute process memory > > > > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 > > -2080374784 > > > > [0] PetscGetHostName(): Rejecting domainname, likely is NIS CMP3. > > > > [0] PetscGetHostName(): Rejecting domainname, likely is NIS CMP3. > > > > [0] PetscFOpen(): Opening file Log.0 > > > > [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm > > -2080374784 > > > > [0] Petsc_DelComm_Inner(): Removing reference to PETSc communicator > > embedded in a user MPI_Comm -2080374784 > > > > [0] Petsc_DelComm_Outer(): User MPI_Comm 1140850688 is being freed > > after removing reference from inner PETSc comm to this outer comm > > > > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > > > > [0] Petsc_DelViewer(): Removing viewer data attribute in an MPI_Comm > > -2080374784 > > > > [0] Petsc_DelThreadComm(): Deleting thread communicator data in an > > MPI_Comm -2080374784 > > > > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm > > -2080374784 > > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > > > [0]PETSC ERROR: Object is in wrong state! > > > > [0]PETSC ERROR: PetscMallocDumpLog() called without call to > > PetscMallocSetDumpLog() this is often due to > > > > setting the option -malloc_log AFTER > > PetscInitialize() with PetscOptionsInsert() or PetscOptionsInsertFile()! > > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 > > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: D:\Programming\gits\Alsim\SharedModules - > > Copy\Source\Release\Alsim_nypetsc.exe on a arch-mswin-c-debug named CMP3 by > > admeinar Wed Oct 23 11:54:03 2013 > > > > [0]PETSC ERROR: Libraries linked from > > /cygdrive/d/Programming/petsc-3.4.3/arch-mswin-c-debug/lib > > > > [0]PETSC ERROR: Configure run at Thu Oct 17 08:23:18 2013 > > > > [0]PETSC ERROR: Configure options --with-debugging=1 --with-openmp=yes > > --with-pthread=no --with-cc="win32fe icl" --with-fc="win32fe ifort" > > --with-cxx="win32fe icl" --with-blas-lapack-dir="C:/Program Files > > (x86)/Intel/Composer XE 2013/mkl/lib/intel64" --download-superlu > > --with-sowing=0 --with-c2html=0 --with-mpi-dir="C:/Program Files > > (x86)/Intel/MPI/4.1.0.028/em64t" --useThreads=0 --useThreads=0 > > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: PetscMallocDumpLog() line 652 in > > src/sys/memory/D:\PROGRA~1\PETSC-~1.3\src\sys\memory\mtr.c > > > > [0]PETSC ERROR: PetscFinalize() line 1185 in > > src/sys/objects/D:\PROGRA~1\PETSC-~1.3\src\sys\objects\pinit.c > > > > > > > > > > > > 2013/10/19 Mark F. Adams <[email protected]> > > > > You can try running a problem that finishes with > > > > > > > > -malloc > > > > -malloc_debug > > > > -malloc_dump > > > > > > > > to see if there are any memory issues. It looks like you are > > completely destroying the solver object each time and starting over. > > Valgrind would be very useful here because it looks like memory is getting > > corrupted if the same code runs many times. > > > > > > > > > > > > On Oct 19, 2013, at 3:39 PM, Einar Sørheim <[email protected]> > > wrote: > > > > > > > >> yes, I call the given code many times, what i have seen so far is > > that changing e.g. the order of destruction statements changes the time of > > failure,I will try to debug to get more details > > > >> > > > >> > > > >> 2013/10/18 Mark F. Adams <[email protected]> > > > >> I would guess there is some memory corruption or leak someplace. Try > > running with -on_error_attach_debugger, and try to get some more > > information. Do you just repeat a call to the code below many times? Is > > this deterministic, does it fail on the same solve each time? > > > >> > > > >> > > > >> On Oct 18, 2013, at 9:30 AM, Einar Sørheim <[email protected]> > > wrote: > > > >> > > > >>> Uncommenting KSPSetFromOptions solved my intial problem, however I > > got a new problem that occurs after many timesteps, in our code we use > > petsc gamg as a solver for several problems(e.g. thermal and mechanical) at > > each time step, I have a suspicion it might have something to do with > > cleanup after solving, the rror appears after many calls to the petsc gamg > > solver interface, the error message is the following: > > > >>> > > > >>> [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > > Violation, probably memory access out of range > > > >>> [0]PETSC ERROR: Try option -start_in_debugger or > > -on_error_attach_debugger > > > >>> [0]PETSC ERROR: or see > > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSCERROR: > > or try > > http://valgrind.org on GNU/linux and Apple Mac OS X to find memory > > corruption errors > > > >>> [0]PETSC ERROR: likely location of problem given in stack below > > > >>> [0]PETSC ERROR: --------------------- Stack Frames > > ------------------------------------ > > > >>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > > available, > > > >>> [0]PETSC ERROR: INSTEAD the line number of the start of the > > function > > > >>> [0]PETSC ERROR: is given. > > > >>> [0]PETSC ERROR: [0] MatSetNearNullSpace line 7580 > > src/mat/interface/D:\PROGRA~1\PETSC-~1.3\src\mat\INTERF~1\matrix.c > > > >>> [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > > >>> [0]PETSC ERROR: Signal received! > > > >>> [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > >>> [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 > > > >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > >>> [0]PETSC ERROR: See docs/index.html for manual pages. > > > >>> [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > >>> [0]PETSC ERROR: D:\Programming\gits\Alsim\SharedModules - > > Copy\Source\Release\Alsim_nypetsc.exe on a arch-mswin-c-debug named CMP3 by > > admeinar Fri Oct 18 12:26:04 2013 > > > >>> [0]PETSC ERROR: Libraries linked from > > /cygdrive/d/Programming/petsc-3.4.3/arch-mswin-c-debug/lib > > > >>> [0]PETSC ERROR: Configure run at Thu Oct 17 08:23:18 2013 > > > >>> [0]PETSC ERROR: Configure options --with-debugging=1 > > --with-openmp=yes --with-pthread=no --with-cc="win32fe icl" > > --with-fc="win32fe ifort" --with-cxx="win32fe icl" > > --with-blas-lapack-dir="C:/Program Files (x86)/Intel/Composer XE > > 2013/mkl/lib/intel64" --download-superlu --with-sowing=0 --with-c2html=0 > > --with-mpi-dir="C:/Program Files (x86)/Intel/MPI/4.1.0.028/em64t" > > --useThreads=0 --useThreads=0 > > > >>> > > > >>> The source code for the petsc interface: > > > >>> > > > >>> subroutine > > petscsolver(i_nodes,r_elements_,i_rowstart_,i_columnindex_,r_x_,r_b_,i_Nnz,i_Precond, > > i_Solver > > > >>> $ ,coords_,i_nodedof) > > > >>> #include <finclude/petscsys.h> > > > >>> #include <finclude/petscvec.h> > > > >>> #include <finclude/petscmat.h> > > > >>> #include <finclude/petscpc.h> > > > >>> #include <finclude/petscksp.h> > > > >>> #include <finclude/petscviewer.h> > > > >>> > > > >>> integer i_nodes, i_Nnz, i_Precond, i_solver, i_nodedof > > > >>> integer i_rowstart_(*),i_columnindex_(*) > > > >>> double precision r_elements_(*),r_x_(*), r_b_(*),coords_(*) > > > >>> integer, allocatable :: i_indices_(:) > > > >>> ! Variables: > > > >>> ! > > > >>> ! A - matrix that defines linear system > > > >>> ! ksp - KSP context > > > >>> ! ksp - KSP context > > > >>> ! x, b, u - approx solution, RHS, exact solution vectors > > > >>> ! > > > >>> ! implicit none > > > >>> Vec x,u,b,vec_coords > > > >>> PC pc > > > >>> Mat A,F,Matnull > > > >>> KSP ksp > > > >>> PetscInt i,j,II,JJ,m,n,i1,imaxiter > > > >>> PetscInt Istart,Iend > > > >>> PetscInt nsteps,one, neg_one > > > >>> PetscErrorCode ierr > > > >>> PetscBool flg > > > >>> PetscScalar v,x_array_(1), norm, tol, t0,t1 > > > >>> PetscOffset i_x > > > >>> PetscViewer viewer > > > >>> > > > >>> > > > >>> call PetscTime(t0,ierr) > > > >>> write(*,*) "Enter petsc" > > > >>> > > > >>> n = 1 > > > >>> nsteps = 1 > > > >>> one = 1 > > > >>> neg_one = 1 > > > >>> i1 = 1 > > > >>> imaxiter = 1200 > > > >>> > > > >>> > > > >>> call MatCreate(PETSC_COMM_WORLD,A,ierr) > > > >>> call > > MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,i_nodes,i_nodes,ierr) > > > >>> call MatSetType(A, MATSEQAIJ,ierr) > > > >>> call MatSetUp(A,ierr) > > > >>> call MatSeqAIJSetPreallocation(A,100,PETSC_NULL_INTEGER,ierr) > > > >>> ! The matrix is partitioned by contiguous chunks of rows across the > > > >>> ! processors. Determine which rows of the matrix are locally owned. > > > >>> > > > >>> call MatGetOwnershipRange(A,Istart,Iend,ierr) > > > >>> > > > >>> > > > >>> write(*,*) "Start of matrix fill", i_nodes, Istart, Iend > > > >>> do i=1,i_nodes > > > >>> ! write(*,*) "Start of loop",i !!, i_indices_(i+1) > > > >>> ! i_indices_(i+1) = i > > > >>> ! write(*,*) "Start of loop2" > > > >>> do j=i_rowstart_(i),i_rowstart_(i+1)-1 > > > >>> ! write(*,*) "Start of loop 3" > > > >>> v = r_elements_(j) > > > >>> ! write(*,*) i1,i,i1,j-1,v > > > >>> call > > MatSetValues(A,i1,i-1,i1,i_columnindex_(j)-1,v,INSERT_VALUES,ierr) > > > >>> if (ierr.gt.0) stop > > > >>> end do > > > >>> end do > > > >>> ! write(*,*) "End of matrix fill" > > > >>> > > > >>> ! Assemble matrix, using the 2-step process: > > > >>> ! MatAssemblyBegin(), MatAssemblyEnd() > > > >>> ! Computations can be done while messages are in transition > > > >>> ! by placing code between these two statements. > > > >>> > > > >>> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) > > > >>> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) > > > >>> > > > >>> ! Create parallel vectors. > > > >>> ! - When using VecCreate(), the parallel partitioning of the vector > > > >>> ! is determined by PETSc at runtime. > > > >>> ! - Note: We form 1 vector from scratch and then duplicate as > > needed. > > > >>> ! write(*,*) "Start of vector fill" > > > >>> call VecCreate(PETSC_COMM_WORLD,u,ierr) > > > >>> call VecSetType(u, VECSEQ,ierr) > > > >>> call VecSetSizes(u,PETSC_DECIDE,i_nodes,ierr) > > > >>> ! call VecSetFromOptions(u,ierr) > > > >>> call VecDuplicate(u,b,ierr) > > > >>> call VecDuplicate(b,x,ierr) > > > >>> do i=1,i_nodes > > > >>> call VecSetValues(x,i1,i-1,r_x_(i),INSERT_VALUES,ierr) > > > >>> call VecSetValues(b,i1,i-1,r_b_(i),INSERT_VALUES,ierr) > > > >>> enddo > > > >>> ! Assemble vector > > > >>> > > > >>> call VecAssemblyBegin(x,ierr) > > > >>> call VecAssemblyEnd(x,ierr) > > > >>> call VecAssemblyBegin(b,ierr) > > > >>> call VecAssemblyEnd(b,ierr) > > > >>> call PetscTime(t1,ierr) > > > >>> ! write(*,*) "End of vec fill time spent :", t1-t0 > > > >>> ! Create linear solver context > > > >>> > > > >>> call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > >>> > > > >>> > > > >>> ! Set operators. Here the matrix that defines the linear system > > > >>> ! also serves as the preconditioning matrix. > > > >>> > > > >>> call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) > > > >>> > > > >>> > > > >>> call KSPGetPC(ksp,pc,ierr) > > > >>> tol = 1.e-20 > > > >>> call > > KSPSetTolerances(ksp,tol,PETSC_DEFAULT_DOUBLE_PRECISION,PETSC_DEFAULT_DOUBLE_PRECISION,imaxiter,ierr) > > > >>> select case (i_Precond) > > > >>> case ( 1 ) > > > >>> call PCSetType(pc,PCJACOBI,ierr) > > > >>> case ( 2 ) > > > >>> call PCSetType(pc,PCILU,ierr) > > > >>> call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU,ierr); > > > >>> call PCFactorSetUpMatSolverPackage(pc,ierr); > > > >>> call PCFactorGetMatrix(pc,F); > > > >>> call MatSuperluSetILUDropTol(F,1.e-4); > > > >>> case ( 3 ) > > > >>> call PCSetType(pc,PCGAMG,ierr) > > > >>> ! call PCGAMGInitializePackage() > > > >>> ! call PCCreate_GAMG(pc,ierr) > > > >>> case DEFAULT > > > >>> call PCSetType(pc,PCJACOBI,ierr) > > > >>> end select > > > >>> > > > >>> ! call PCSetType(pc,PCJACOBI,ierr) > > > >>> select case (i_solver) > > > >>> case ( 0 ) > > > >>> call KSPSetType(ksp,KSPCG,ierr) > > > >>> case DEFAULT > > > >>> call KSPSetType(ksp,KSPCG,ierr) > > > >>> end select > > > >>> call KSPCGSetType(ksp,KSP_CG_SYMMETRIC,ierr) > > > >>> if (i_nodedof==3) then > > > >>> write(*,*) "Set coord, number of nodes is:", > > i_nodes/i_nodedof > > > >>> call > > VecCreateSeqWithArray(MPI_COMM_SELF,3,i_nodes,coords_,vec_coords,ierr) > > > >>> call MatNullSpaceCreateRigidBody(vec_coords,Matnull,ierr) > > > >>> call MatSetNearNullSpace(A,Matnull,ierr) > > > >>> call MatNullSpaceDestroy(Matnull,ierr) > > > >>> > > > >>> write(*,*) "Finish setting null space ierr =", ierr > > > >>> ! call PCSetCoordinates( pc, 3,i_nodes/i_nodedof, coords_, > > ierr ) > > > >>> > > > >>> end if > > > >>> ! call KSPMonitorSet(ksp, KSPMonitorDefault, PETSC_NULL, > > PETSC_NULL,ierr) > > > >>> ! Set runtime options, e.g., > > > >>> ! -ksp_type <type> -pc_type <type> -ksp_monitor -ksp_rtol <rtol> > > > >>> ! These options will override those specified above as long as > > > >>> ! KSPSetFromOptions() is called _after_ any other customization > > > >>> ! routines. > > > >>> > > > >>> call KSPSetFromOptions(ksp,ierr) > > > >>> call KSPSetInitialGuessNonzero(ksp,PETSC_TRUE,ierr) > > > >>> call KSPSetUp(ksp,ierr) > > > >>> call PetscTime(t0,ierr) > > > >>> write(*,*) "Time setup", t0-t1 > > > >>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > - - > > > >>> ! Solve the linear system > > > >>> ! - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > - - > > > >>> > > > >>> call KSPSolve(ksp,b,x,ierr) > > > >>> call PetscTime(t1,ierr) > > > >>> write(*,*) "end solve, time used:",t1-t0 > > > >>> call KSPGetSolution(ksp,x,ierr) > > > >>> call VecGetArray(x,x_array_,i_x,ierr) > > > >>> do i=1,i_nodes > > > >>> r_x_(i) = x_array_(i_x + i) > > > >>> enddo > > > >>> call VecRestoreArray(x,x_array_,i_x,ierr) > > > >>> > > > >>> > > > >>> ! Free work space. All PETSc objects should be destroyed when they > > > >>> ! are no longer needed. > > > >>> ! call PCDestroy(pc,ierr) > > > >>> > > > >>> call VecDestroy(u,ierr) > > > >>> call VecDestroy(x,ierr) > > > >>> call VecDestroy(b,ierr) > > > >>> call MatDestroy(A,ierr) > > > >>> if (i_nodedof==3) then > > > >>> call VecDestroy(vec_coords,ierr) > > > >>> call MatDestroy(Matnull,ierr) > > > >>> endif > > > >>> call KSPDestroy(ksp,ierr) > > > >>> call PetscTime(t0,ierr) > > > >>> write(*,*) "time celan up :", t0-t1 > > > >>> > > > >>> end > > > >>> > > > >>> -- > > > >>> Einar Sørheim > > > >>> <Mail Attachment> > > > >> > > > >> > > > >> > > > >> > > > >> -- > > > >> Einar Sørheim > > > > > > > > > > > > > > > > > > > > -- > > > > Einar Sørheim > > > > > > > > > > > > > > > -- > > > Einar Sørheim > > > <log_summary.txt> > > > > > > >
