The matrix resides on disk. It was generated by a single-process program and 
is used to compare that program's results with those from a PETSc-based 
multi-process program. The current approach works well for small and 
medium-sized matrices but fails for the large one.

What I can do is have each process determine which rows it owns locally. Each 
process can then read its own rows and populate its part of the matrix. Just a 
bit more code; not a big problem.
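A minimal sketch of that owner-computes fill, assuming a dense N x N matrix and a hypothetical read_row() helper that pulls one row of the file from disk (the helper and its signature are illustrative, not part of the actual program):

```c
/* Sketch: each rank fills only the rows it owns, so MatSetValues()
 * never has to stash off-process entries.  read_row() is a
 * hypothetical helper that reads row i of the matrix from the file. */
#include <petscmat.h>

extern void read_row(const char *file, PetscInt i, PetscInt N,
                     PetscScalar *vals);   /* hypothetical file reader */

PetscErrorCode FillLocalRows(Mat A, PetscInt N, const char *file)
{
  PetscInt     rstart, rend, i, j;
  PetscScalar *row;
  PetscInt    *cols;

  MatGetOwnershipRange(A, &rstart, &rend); /* rows [rstart, rend) are local */
  PetscMalloc(N * sizeof(PetscScalar), &row);
  PetscMalloc(N * sizeof(PetscInt), &cols);
  for (j = 0; j < N; j++) cols[j] = j;     /* dense: every column */

  for (i = rstart; i < rend; i++) {
    read_row(file, i, N, row);             /* only this rank's rows */
    MatSetValues(A, 1, &i, N, cols, row, INSERT_VALUES);
  }

  PetscFree(row);
  PetscFree(cols);
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
  return 0;
}
```

Because every inserted value is locally owned, the stash that blew up in MatStashScatterBegin_Private() stays empty.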

Thank you very much, Barry, for your input. Let me assure you that I have no 
intention of faking or hacking.  :)   This project is too important to our 
transition from shared-memory machines. (See 
http://www.afrl.hpc.mil/hardware/hawk.php.)


Best,

Peter.

Peter G. Raeth, Ph.D.
Senior Staff Scientist
Signal and Image Processing
High Performance Technologies, Inc
937-904-5147
praeth at hpti.com

________________________________________
From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] 
on behalf of Barry Smith [[email protected]]
Sent: Monday, January 24, 2011 4:23 PM
To: PETSc users list
Subject: Re: [petsc-users] Out of memory during MatAssemblyBegin

On Jan 24, 2011, at 3:08 PM, Raeth, Peter wrote:

> Am running out of memory while using MatAssemblyBegin on a dense matrix that 
> spans several processors. My calculations show that the matrices I am using 
> do not require more than 25% of available memory.
>
> Different about this matrix compared to the others is that the program runs 
> out of memory after the matrix has been populated by a single process, rather 
> than by multiple processes. Used MatSetValues. Since the values are held in 
> cache until MatAssemblyEnd is called (as I understand things), is it possible 
> that using one process to populate the entire matrix is causing this problem?


   Yes, absolutely, this is a terrible non-scalable way of filling a parallel 
matrix. You can fake it by calling MatAssemblyBegin/End() repeatedly with the 
flag MAT_FLUSH_ASSEMBLY to keep the stash from getting too big. But you really 
need a much better way of setting values into the matrix. How are these 
"brought in row by row" matrix entries generated?
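If the single-reader loop must be kept for now, the flush pattern described above looks roughly like this (the batch size and the read_row() helper are illustrative assumptions, not actual program code):

```c
/* Sketch of MAT_FLUSH_ASSEMBLY: flush the stash every `batch` rows so
 * the cached off-process values never grow to the size of the matrix.
 * Rank 0 reads and inserts every row; `row`, `cols`, and read_row()
 * are set up as in the caller. */
if (rank == 0) {
  for (i = 0; i < N; i++) {
    read_row(file, i, N, row);                 /* hypothetical reader */
    MatSetValues(A, 1, &i, N, cols, row, INSERT_VALUES);
    if ((i + 1) % batch == 0) {                /* periodic stash flush */
      MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);
      MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);
    }
  }
}
/* All ranks participate in assembly, including the flushes above,
 * so the non-zero ranks must call the same sequence of Begin/End
 * pairs; the final assembly completes the matrix. */
MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
```

Note that MatAssemblyBegin/End() are collective, so every rank must make the same number of flush calls as rank 0; this is one more reason the owner-computes fill is the better long-term fix.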

   Barry


> The data is brought in only row by row for the population process. All buffer 
> memory is cleared before the call to MatAssemblyBegin.
>
> The error dump contains:
>
> mpirun -prefix [%g]   -np 256 Peter.x
> [0]  [0]PETSC ERROR: --------------------- Error Message 
> ------------------------------------
> [0]  [0]PETSC ERROR: Out of memory. This could be due to allocating
> [0]  [0]PETSC ERROR: too large an object or bleeding by not properly
> [0]  [0]PETSC ERROR: destroying unneeded objects.
> [0]  [0]PETSC ERROR: Memory allocated 1372407920 Memory used by process 
> -122585088
> [0]  [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> [0]  [0]PETSC ERROR: Memory requested 18446744071829395456!
> [0]  [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]  [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 
> 17:02:32 CST 2010
> [0]  [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]  [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]  [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]  [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]  [0]PETSC ERROR: Peter.x on a linux-int named hawk-6 by praeth Mon Jan 24 
> 15:44:28 2011
> [0]  [0]PETSC ERROR: Libraries linked from 
> /default/praeth/MATH/petsc-3.1-p6/linux-intel-g/lib
> [0]  [0]PETSC ERROR: Configure run at Tue Dec 21 08:45:25 2010
> [0]  [0]PETSC ERROR: Configure options --download-superlu=1 
> --download-parmetis=1 --download-superlu_dist=1 --with-debugging=1 
> --with-error-checking=1 -PETSC_ARCH=linux-intel-g --with-fc="ifort -lmpi" 
> --with-cc="icc -lmpi" --with-gnu-compilers=false
> [0]  [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]  [0]PETSC ERROR: PetscMallocAlign() line 49 in src/sys/memory/mal.c
> [0]  [0]PETSC ERROR: PetscTrMallocDefault() line 192 in src/sys/memory/mtr.c
> [0]  [0]PETSC ERROR: MatStashScatterBegin_Private() line 510 in 
> src/mat/utils/matstash.c
> [0]  [0]PETSC ERROR: MatAssemblyBegin_MPIDense() line 286 in 
> src/mat/impls/dense/mpi/mpidense.c
> [0]  [0]PETSC ERROR: MatAssemblyBegin() line 4564 in 
> src/mat/interface/matrix.c
> [0]  [0]PETSC ERROR: User provided function() line 195 in 
> "unknowndirectory/"Peter.c
> [-1]  MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
> [-1]  MPI: aborting job
> exit
>
> Had tried to use the suggestion to employ -malloc_dump or -malloc_log but do 
> not see any result from the batch run.
>
> Thank you all for any insights you can offer.
>
>
> Best,
>
> Peter.
>
