Mat Object: 1 MPI processes
  type: mpiaij
row 0: (0, 0.)  (1, 0.486111)
row 1: (0, 0.486111)  (1, 0.)
row 2: (2, 0.)  (3, 0.486111)
row 3: (4, 0.486111)  (5, -0.486111)
row 4:
row 5:

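The matrix printed above has no entries in rows 4 and 5. As a rough aid for inspecting such output, here is a minimal sketch in plain Python (not PETSc) that builds a CSR-style row pointer from the printed rows, lists the empty rows, and checks whether the nonzero pattern is structurally symmetric. The `rows` list is transcribed by hand from the output above, and the function names are hypothetical helpers, not PETSc routines. Whether either property is related to the valgrind reports in this thread is not established here.

```python
# Rows transcribed by hand from the MatView output above:
# one list of (column, value) pairs per row.
rows = [
    [(0, 0.0), (1, 0.486111)],
    [(0, 0.486111), (1, 0.0)],
    [(2, 0.0), (3, 0.486111)],
    [(4, 0.486111), (5, -0.486111)],
    [],
    [],
]


def csr_row_pointer(rows):
    """Return the CSR row pointer ia, where ia[i+1] - ia[i] is the
    number of stored entries in row i."""
    ia = [0]
    for entries in rows:
        ia.append(ia[-1] + len(entries))
    return ia


def empty_rows(rows):
    """Return the indices of rows that store no entries at all."""
    ia = csr_row_pointer(rows)
    return [i for i in range(len(rows)) if ia[i + 1] == ia[i]]


def structurally_symmetric(rows):
    """Return True if a stored entry at (i, j) always has a stored
    entry at (j, i), regardless of the values."""
    pattern = {(i, j) for i, entries in enumerate(rows) for j, _ in entries}
    return all((j, i) in pattern for (i, j) in pattern)


if __name__ == "__main__":
    print("row pointer:", csr_row_pointer(rows))
    print("empty rows:", empty_rows(rows))
    print("structurally symmetric:", structurally_symmetric(rows))
```

For this matrix the sketch reports rows 4 and 5 as empty and the pattern as not structurally symmetric (row 2 stores column 3, but row 3 does not store column 2).
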
The matrix created is funny (empty rows at the end), so perhaps it's
exposing bugs in Mat code? [Is that a valid matrix for this code?]

==21091== Use of uninitialised value of size 8
==21091==    at 0x57CA16B: MatGetRowIJ_SeqAIJ_Inode_Symmetric (inode.c:101)
==21091==    by 0x57CBA1C: MatGetRowIJ_SeqAIJ_Inode (inode.c:241)
==21091==    by 0x537C0B5: MatGetRowIJ (matrix.c:7274)
==21091==    by 0x53072FD: MatGetOrdering_ND (spnd.c:18)
==21091==    by 0x530BC39: MatGetOrdering (sorder.c:260)
==21091==    by 0x530A72D: MatGetOrdering (sorder.c:202)
==21091==    by 0x5DDD764: PCSetUp_LU (lu.c:124)
==21091==    by 0x5EBFE60: PCSetUp (precon.c:968)
==21091==    by 0x5FDA1B3: KSPSetUp (itfunc.c:390)
==21091==    by 0x601C17D: kspsetup_ (itfuncf.c:252)
==21091==    by 0x4028B9: MAIN__ (ex1f.F90:104)
==21091==    by 0x403535: main (ex1f.F90:185)

This goes away if I add:

call PCFactorSetMatOrderingType(pc,MATORDERINGNATURAL,ierr)

And then there is also:

==21275== Invalid read of size 8
==21275==    at 0x584DE93: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:4734)
==21275==    by 0x58970A8: MatMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable (mpimatmatmult.c:198)
==21275==    by 0x5894A54: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:34)
==21275==    by 0x539664E: MatMatMult (matrix.c:9510)
==21275==    by 0x53B3201: matmatmult_ (matrixf.c:1157)
==21275==    by 0x402FC9: MAIN__ (ex1f.F90:149)
==21275==    by 0x4035B9: main (ex1f.F90:186)
==21275==  Address 0xa3d20f0 is 0 bytes after a block of size 48 alloc'd
==21275==    at 0x4C2DF93: memalign (vg_replace_malloc.c:858)
==21275==    by 0x4FDE05E: PetscMallocAlign (mal.c:28)
==21275==    by 0x5240240: VecScatterCreate (vscat.c:1220)
==21275==    by 0x5857708: MatSetUpMultiply_MPIAIJ (mmaij.c:116)
==21275==    by 0x581C31E: MatAssemblyEnd_MPIAIJ (mpiaij.c:747)
==21275==    by 0x53680F2: MatAssemblyEnd (matrix.c:5187)
==21275==    by 0x53B24D2: matassemblyend_ (matrixf.c:926)
==21275==    by 0x40262C: MAIN__ (ex1f.F90:60)
==21275==    by 0x4035B9: main (ex1f.F90:186)

Satish

-----------

$ diff build_nullbasis_petsc_mumps.F90 ex1f.F90
3,7c3
< #include <petsc/finclude/petscsys.h>
< #include "petsc/finclude/petscvec.h"
< #include "petsc/finclude/petscmat.h"
< #include "petsc/finclude/petscpc.h"
< #include "petsc/finclude/petscksp.h"
---
> #include "petsc/finclude/petsc.h"
40,41c36,37
< call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat_c_bin.txt", 0, viewer, ierr)
< call MatLoad(mat_c, viewer)
---
> call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat_c_bin.txt", FILE_MODE_READ, viewer, ierr)
> call MatLoad(mat_c, viewer,ierr)
75a72
> call PCFactorSetMatOrderingType(pc,MATORDERINGNATURAL,ierr)
150c147
< call MatConvert(x, MATMPIAIJ, MAT_REUSE_MATRIX, x, ierr)
---
> call MatConvert(x, MATMPIAIJ, MAT_INPLACE_MATRIX, x, ierr)

On Thu, 26 May 2016, Matthew Knepley wrote:

> Usually this means you have an uninitialized variable that is causing you
> to overwrite memory. Fortran is so lax in checking this, it's one reason
> to switch to C.
>
> Thanks,
>
> Matt
>
> On Thu, May 26, 2016 at 1:46 AM, Constantin Nguyen Van <
> [email protected]> wrote:
>
> > Thanks for all your answers.
> > I'm sorry for the syntax mistake in MatLoad, it was done afterwards.
> >
> > I recompiled PETSc with --with-debugging=yes and ran my code again.
> > Now, I also have this strange behaviour.
> > When I run the code without valgrind and with one proc, I have this
> > error message:
> >
> > BEGIN PROC 0
> > ITERATION 1
> > ECHO 1
> > ECHO 2
> > INFOG(28): 2
> > BASIS OK 0
> > END PROC 0
> > BEGIN PROC 0
> > ITERATION 2
> > ECHO 1
> > ECHO 2
> > INFOG(28): 2
> > BASIS OK 0
> > END PROC 0
> > BEGIN PROC 0
> > ITERATION 3
> > ECHO 1
> > [0]PETSC ERROR: ------------------------------------------------------------------------
> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> > [0]PETSC ERROR: likely location of problem given in stack below
> > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------
> > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> > [0]PETSC ERROR: INSTEAD the line number of the start of the function
> > [0]PETSC ERROR: is given.
> > [0]PETSC ERROR: [0] MatGetRowIJ_SeqAIJ_Inode_Symmetric line 69 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/impls/aij/seq/inode.c
> > [0]PETSC ERROR: [0] MatGetRowIJ_SeqAIJ_Inode line 235 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/impls/aij/seq/inode.c
> > [0]PETSC ERROR: [0] MatGetRowIJ line 7099 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/interface/matrix.c
> > [0]PETSC ERROR: [0] MatGetOrdering_ND line 17 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/order/spnd.c
> > [0]PETSC ERROR: [0] MatGetOrdering line 185 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/order/sorder.c
> > [0]PETSC ERROR: [0] MatGetOrdering line 185 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/order/sorder.c
> > [0]PETSC ERROR: [0] PCSetUp_LU line 99 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/ksp/pc/impls/factor/lu/lu.c
> > [0]PETSC ERROR: [0] PCSetUp line 945 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/ksp/pc/interface/precon.c
> > [0]PETSC ERROR: [0] KSPSetUp line 247 /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/ksp/ksp/interface/itfunc.c
> >
> > But when I run it with valgrind, it does work well.
> >
> > On 2016-05-25 20:04, Barry Smith wrote:
> >
> >> First run with valgrind
> >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> >>
> >> On May 25, 2016, at 2:35 AM, Constantin Nguyen Van
> >>> <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I'm a new user of PETSc and I try to use it with MUMPS
> >>> functionalities to compute a nullbasis.
> >>> I wrote a code where I compute 4 times the same nullbasis. It does
> >>> work well when I run it with several procs but with only one
> >>> processor I get an error on the 2nd iteration when KSPSetUp is
> >>> called. Furthermore when it is run with a debugger
> >>> (--with-debugging=yes), it works fine with one or several processors.
> >>> Have you got any idea about why it doesn't work with one processor
> >>> and no debugger?
> >>>
> >>> Thanks.
> >>> Constantin.
> >>>
> >>> PS: You can find the code and the files required to run it enclosed.
