On Jan 24, 2011, at 3:08 PM, Raeth, Peter wrote:

> Am running out of memory while using MatAssemblyBegin on a dense matrix that
> spans several processors. My calculations show that the matrices I am using
> do not require more than 25% of available memory.
>
> Different about this matrix compared to the others is that the program runs
> out of memory after the matrix has been populated by a single process, rather
> than by multiple processes. Used MatSetValues. Since the values are held in
> cache until MatAssemblyEnd is called (as I understand things), is it possible
> that using one process to populate the entire matrix is causing this problem?
   Yes, absolutely, this is a terrible non-scalable way of filling a parallel matrix. You can fake it by calling MatAssemblyBegin/End() repeatedly with the flag MAT_FLUSH_ASSEMBLY to keep the stash from getting too big. But you really need a much better way of setting values into the matrix. How are these "brought in row by row" matrix entries generated? (A sketch of both the flush workaround and a scalable alternative follows the quoted log below.)

   Barry

> The data is brought in only row by row for the population process. All buffer
> memory is cleared before the call to MatAssemblyBegin.
>
> The error dump contains:
>
> mpirun -prefix [%g] -np 256 Peter.x
> [0] [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0] [0]PETSC ERROR: Out of memory. This could be due to allocating
> [0] [0]PETSC ERROR: too large an object or bleeding by not properly
> [0] [0]PETSC ERROR: destroying unneeded objects.
> [0] [0]PETSC ERROR: Memory allocated 1372407920 Memory used by process
> -122585088
> [0] [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> [0] [0]PETSC ERROR: Memory requested 18446744071829395456!
> [0] [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0] [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16
> 17:02:32 CST 2010
> [0] [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0] [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0] [0]PETSC ERROR: See docs/index.html for manual pages.
> [0] [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0] [0]PETSC ERROR: Peter.x on a linux-int named hawk-6 by praeth Mon Jan 24
> 15:44:28 2011
> [0] [0]PETSC ERROR: Libraries linked from
> /default/praeth/MATH/petsc-3.1-p6/linux-intel-g/lib
> [0] [0]PETSC ERROR: Configure run at Tue Dec 21 08:45:25 2010
> [0] [0]PETSC ERROR: Configure options --download-superlu=1
> --download-parmetis=1 --download-superlu_dist=1 --with-debugging=1
> --with-error-checking=1 -PETSC_ARCH=linux-intel-g --with-fc="ifort -lmpi"
> --with-cc="icc -lmpi" --with-gnu-compilers=false
> [0] [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0] [0]PETSC ERROR: PetscMallocAlign() line 49 in src/sys/memory/mal.c
> [0] [0]PETSC ERROR: PetscTrMallocDefault() line 192 in src/sys/memory/mtr.c
> [0] [0]PETSC ERROR: MatStashScatterBegin_Private() line 510 in
> src/mat/utils/matstash.c
> [0] [0]PETSC ERROR: MatAssemblyBegin_MPIDense() line 286 in
> src/mat/impls/dense/mpi/mpidense.c
> [0] [0]PETSC ERROR: MatAssemblyBegin() line 4564 in
> src/mat/interface/matrix.c
> [0] [0]PETSC ERROR: User provided function() line 195 in
> "unknowndirectory/"Peter.c
> [-1] MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
> [-1] MPI: aborting job
> exit
>
> Had tried to use the suggestion to employ -malloc_dump or -malloc_log but do
> not see any result from the batch run.
>
> Thank you all for any insights you can offer.
>
>
> Best,
>
> Peter.
>
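For concreteness, here is a minimal sketch of both patterns. The GetNextRow() reader, the FLUSH_EVERY interval, and the all-columns-per-row layout are hypothetical stand-ins for details not given in the thread; this is an illustration of the technique, not Peter's actual code.

#include <petscmat.h>

/* Hypothetical stand-in for the row-by-row reader: fills vals[0..N-1]
   with the entries of global row i. */
extern PetscErrorCode GetNextRow(PetscInt i, PetscInt N, PetscScalar *vals);

#define FLUSH_EVERY 64  /* arbitrary; tune to bound the stash size */

/* The workaround Barry describes: rank 0 still sets every entry, but the
   stash is flushed every FLUSH_EVERY rows so it never grows unboundedly.
   MatAssemblyBegin/End() are collective, so every rank must execute the
   flush calls, not just rank 0; hence all ranks run the loop. */
PetscErrorCode FillFromRankZero(Mat A, PetscInt M, PetscInt N)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;
  PetscInt       i, j, *cols;
  PetscScalar    *vals;

  PetscFunctionBegin;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
  ierr = PetscMalloc(N*sizeof(PetscInt), &cols);CHKERRQ(ierr);
  ierr = PetscMalloc(N*sizeof(PetscScalar), &vals);CHKERRQ(ierr);
  for (j = 0; j < N; j++) cols[j] = j;

  for (i = 0; i < M; i++) {
    if (!rank) {                 /* only rank 0 has the data */
      ierr = GetNextRow(i, N, vals);CHKERRQ(ierr);
      ierr = MatSetValues(A, 1, &i, N, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
    }
    if (i % FLUSH_EVERY == FLUSH_EVERY - 1) {  /* collective flush */
      ierr = MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscFree(cols);CHKERRQ(ierr);
  ierr = PetscFree(vals);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* The scalable alternative: each rank generates and sets only the rows it
   owns, so nothing ever lands in the stash. */
PetscErrorCode FillOwnedRows(Mat A, PetscInt N)
{
  PetscErrorCode ierr;
  PetscInt       i, j, rstart, rend, *cols;
  PetscScalar    *vals;

  PetscFunctionBegin;
  ierr = PetscMalloc(N*sizeof(PetscInt), &cols);CHKERRQ(ierr);
  ierr = PetscMalloc(N*sizeof(PetscScalar), &vals);CHKERRQ(ierr);
  for (j = 0; j < N; j++) cols[j] = j;

  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = GetNextRow(i, N, vals);CHKERRQ(ierr);
    ierr = MatSetValues(A, 1, &i, N, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = PetscFree(cols);CHKERRQ(ierr);
  ierr = PetscFree(vals);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The second function assumes each process can produce its own rows, which is exactly what Barry is asking about; if only rank 0 can read the data source, the flush loop at least bounds the stash at FLUSH_EVERY rows' worth of entries instead of the whole matrix.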
