On Thu, Oct 17, 2013 at 3:00 PM, Jed Brown <[email protected]> wrote:
> Bishesh Khanal <[email protected]> writes:
>
> > The program crashes only for a bigger domain size. Even in the cluster, it
> > does not crash for the domain size up to a certain size. So I need to run
> > in the debugger for the case when it crashes to get the stack trace from
> > the SEGV, right ? I do not know how to attach a debugger when submitting a
> > job to the cluster if that is possible at all!
>
> Most machines allow you to get "interactive" sessions. You can usually
> run debuggers within those. Some facilities also have commercial
> debuggers.
>

Thanks, I'll have a look at that.

> > Or are you asking me to run the program in the debugger in my laptop
> > for the biggest size ? (I have not tried running the code for the
> > biggest size in my laptop fearing it might take forever)
>
> Your laptop probably doesn't have enough memory for that.
>

Yes, I tried it just a while ago and I think this is what happened. (Just to confirm, I have put the error message for this case at the very end of this reply.*)

> Can you try running on the cluster with one MPI rank per node? We
> should rule out simple out-of-memory problems, confirm that the code
> executes correctly with MPICH, and finally figure out why it fails with
> Open MPI (assuming that the previous hunch was correct).
>

I'm sorry, but I'm a complete beginner with MPI and clusters, so what does "one MPI rank per node" mean, and what should I do to set that up? My guess is that I request one core per node and use multiple nodes in my job script file. Or do I need to change something in the PETSc code?

*Here is the error I get when running the full domain size on my laptop:

[3]PETSC ERROR: --------------------- Error Message ------------------------------------
[3]PETSC ERROR: Out of memory. This could be due to allocating
[3]PETSC ERROR: too large an object or bleeding by not properly
[3]PETSC ERROR: destroying unneeded objects.
[1]PETSC ERROR: Memory allocated 0 Memory used by process 1700159488
[1]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[1]PETSC ERROR: Memory requested 6234924800!
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013
[1]PETSC ERROR: See docs/changes/index.html for recent updates.
[1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[1]PETSC ERROR: See docs/index.html for manual pages.
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: [2]PETSC ERROR: Memory allocated 0 Memory used by process 1695793152
[2]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[2]PETSC ERROR: Memory requested 6223582208!
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013
[2]PETSC ERROR: See docs/changes/index.html for recent updates.
[2]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[2]PETSC ERROR: See docs/index.html for manual pages.
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: src/AdLemMain on a arch-linux2-cxx-debug named edwards by bkhanal Thu Oct 17 15:19:22 2013
[1]PETSC ERROR: Libraries linked from /home/bkhanal/Documents/softwares/petsc-3.4.3/arch-linux2-cxx-debug/lib
[1]PETSC ERROR: Configure run at Wed Oct 16 15:13:05 2013
[1]PETSC ERROR: Configure options --download-mpich -download-f-blas-lapack=1 --download-metis --download-parmetis --download-superlu_dist --download-scalapack --download-mumps --download-hypre --with-clanguage=cxx
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: PetscMallocAlign() line 46 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/sys/memory/mal.c
src/AdLemMain on a arch-linux2-cxx-debug named edwards by bkhanal Thu Oct 17 15:19:22 2013
[2]PETSC ERROR: Libraries linked from /home/bkhanal/Documents/softwares/petsc-3.4.3/arch-linux2-cxx-debug/lib
[2]PETSC ERROR: Configure run at Wed Oct 16 15:13:05 2013
[2]PETSC ERROR: Configure options --download-mpich -download-f-blas-lapack=1 --download-metis --download-parmetis --download-superlu_dist --download-scalapack --download-mumps --download-hypre --with-clanguage=cxx
[2]PETSC ERROR: ------------------------------------------------------------------------
[2]PETSC ERROR: PetscMallocAlign() line 46 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/sys/memory/mal.c
[1]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 3551 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[1]PETSC ERROR: MatSeqAIJSetPreallocation() line 3496 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[2]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 3551 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[2]PETSC ERROR: MatSeqAIJSetPreallocation() line 3496 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[1]PETSC ERROR: MatMPIAIJSetPreallocation_MPIAIJ() line 3307 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[1]PETSC ERROR: MatMPIAIJSetPreallocation() line 4015 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[2]PETSC ERROR: MatMPIAIJSetPreallocation_MPIAIJ() line 3307 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[2]PETSC ERROR: MatMPIAIJSetPreallocation() line 4015 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[0]PETSC ERROR: --------------------- Error Message ------------------------------------
[0]PETSC ERROR: Out of memory. This could be due to allocating
[0]PETSC ERROR: too large an object or bleeding by not properly
[0]PETSC ERROR: destroying unneeded objects.
[2]PETSC ERROR: [1]PETSC ERROR: DMCreateMatrix_DA_3d_MPIAIJ() line 1101 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[1]PETSC ERROR: DMCreateMatrix_DA_3d_MPIAIJ() line 1101 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[2]PETSC ERROR: DMCreateMatrix_DA() line 771 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
DMCreateMatrix_DA() line 771 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[3]PETSC ERROR: Memory allocated 0 Memory used by process 1675407360
[3]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[3]PETSC ERROR: Memory requested 6166659200!
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013
[3]PETSC ERROR: See docs/changes/index.html for recent updates.
[3]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[3]PETSC ERROR: See docs/index.html for manual pages.
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: src/AdLemMain on a arch-linux2-cxx-debug named edwards by bkhanal Thu Oct 17 15:19:22 2013
[3]PETSC ERROR: Libraries linked from /home/bkhanal/Documents/softwares/petsc-3.4.3/arch-linux2-cxx-debug/lib
[3]PETSC ERROR: Configure run at Wed Oct 16 15:13:05 2013
[3]PETSC ERROR: Configure options --download-mpich -download-f-blas-lapack=1 --download-metis --download-parmetis --download-superlu_dist --download-scalapack --download-mumps --download-hypre --with-clanguage=cxx
[3]PETSC ERROR: ------------------------------------------------------------------------
[3]PETSC ERROR: [1]PETSC ERROR: DMCreateMatrix() line 910 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/interface/dm.c
[2]PETSC ERROR: DMCreateMatrix() line 910 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/interface/dm.c
PetscMallocAlign() line 46 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/sys/memory/mal.c
[3]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 3551 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[3]PETSC ERROR: MatSeqAIJSetPreallocation() line 3496 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[1]PETSC ERROR: KSPSetUp() line 207 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[2]PETSC ERROR: KSPSetUp() line 207 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[3]PETSC ERROR: MatMPIAIJSetPreallocation_MPIAIJ() line 3307 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[3]PETSC ERROR: MatMPIAIJSetPreallocation() line 4015 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[3]PETSC ERROR: DMCreateMatrix_DA_3d_MPIAIJ() line 1101 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[3]PETSC ERROR: DMCreateMatrix_DA() line 771 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[3]PETSC ERROR: DMCreateMatrix() line 910 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/interface/dm.c
[3]PETSC ERROR: KSPSetUp() line 207 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[1]PETSC ERROR: solveModel() line 128 in "unknowndirectory/"/user/bkhanal/home/works/AdLemModel/src/PetscAdLemTaras3D.cxx
[2]PETSC ERROR: solveModel() line 128 in "unknowndirectory/"/user/bkhanal/home/works/AdLemModel/src/PetscAdLemTaras3D.cxx
[3]PETSC ERROR: solveModel() line 128 in "unknowndirectory/"/user/bkhanal/home/works/AdLemModel/src/PetscAdLemTaras3D.cxx
[0]PETSC ERROR: Memory allocated 0 Memory used by process 1711476736
[0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[0]PETSC ERROR: Memory requested 6292477952!
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: src/AdLemMain on a arch-linux2-cxx-debug named edwards by bkhanal Thu Oct 17 15:19:22 2013
[0]PETSC ERROR: Libraries linked from /home/bkhanal/Documents/softwares/petsc-3.4.3/arch-linux2-cxx-debug/lib
[0]PETSC ERROR: Configure run at Wed Oct 16 15:13:05 2013
[0]PETSC ERROR: Configure options --download-mpich -download-f-blas-lapack=1 --download-metis --download-parmetis --download-superlu_dist --download-scalapack --download-mumps --download-hypre --with-clanguage=cxx
[0]PETSC ERROR: ------------------------------------------------------------------------
[0]PETSC ERROR: PetscMallocAlign() line 46 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/sys/memory/mal.c
[0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 3551 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: MatSeqAIJSetPreallocation() line 3496 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: MatMPIAIJSetPreallocation_MPIAIJ() line 3307 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[0]PETSC ERROR: MatMPIAIJSetPreallocation() line 4015 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/mat/impls/aij/mpi/mpiaij.c
[0]PETSC ERROR: DMCreateMatrix_DA_3d_MPIAIJ() line 1101 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[0]PETSC ERROR: DMCreateMatrix_DA() line 771 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/impls/da/fdda.c
[0]PETSC ERROR: DMCreateMatrix() line 910 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/dm/interface/dm.c
[0]PETSC ERROR: KSPSetUp() line 207 in /home/bkhanal/Documents/softwares/petsc-3.4.3/src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: solveModel() line 128 in "unknowndirectory/"/user/bkhanal/home/works/AdLemModel/src/PetscAdLemTaras3D.cxx
--9345:0:aspacem Valgrind: FATAL: VG_N_SEGMENTS is too low.
--9345:0:aspacem Increase it and rebuild. Exiting now.
--9344:0:aspacem Valgrind: FATAL: VG_N_SEGMENTS is too low.
--9344:0:aspacem Increase it and rebuild. Exiting now.
--9343:0:aspacem Valgrind: FATAL: VG_N_SEGMENTS is too low.
--9343:0:aspacem Increase it and rebuild. Exiting now.
--9346:0:aspacem Valgrind: FATAL: VG_N_SEGMENTS is too low.
--9346:0:aspacem Increase it and rebuild. Exiting now.
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 1
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
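
To put numbers on the out-of-memory report, here is a rough Python sketch (not part of the original run) that pulls the "Memory requested" value for each rank out of the log and adds them up. The four excerpt lines are copied from the output above; the helper name `requested_bytes` is made up for illustration. Note that each value is the size of the single allocation that failed inside MatSeqAIJSetPreallocation on that rank, not the total footprint of the process.

```python
import re

# The four "Memory requested" lines, copied verbatim from the log above.
LOG = """\
[1]PETSC ERROR: Memory requested 6234924800!
[2]PETSC ERROR: Memory requested 6223582208!
[3]PETSC ERROR: Memory requested 6166659200!
[0]PETSC ERROR: Memory requested 6292477952!
"""

# Matches "[rank]PETSC ERROR: Memory requested <bytes>!"
PATTERN = re.compile(r"\[(\d+)\]PETSC ERROR: Memory requested (\d+)!")


def requested_bytes(log_text):
    """Map each MPI rank to the size in bytes of its failed allocation.

    Hypothetical helper for illustration; it only reads the log text.
    """
    return {int(rank): int(size) for rank, size in PATTERN.findall(log_text)}


per_rank = requested_bytes(LOG)
total = sum(per_rank.values())

for rank in sorted(per_rank):
    print(f"rank {rank}: {per_rank[rank] / 2**30:.2f} GiB requested")
print(f"total: {total / 2**30:.2f} GiB across {len(per_rank)} ranks")
```

Each of the four ranks asked for roughly 6.2e9 bytes in one allocation, about 2.5e10 bytes in total, which is consistent with the suggestion that a laptop does not have enough memory for the full domain size.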
