On May 31, 2013, at 4:04 AM, Lorenzo Alessio Botti
<[email protected]> wrote:
> I don't know if it is the "correct" way to do it but the memory footprint has
> not changed.
> I'm also interested in knowing if the memory savings take place on the
> libMesh side or on the PETSc side, or on both sides.
That's what I would have done, for what it's worth.
The memory savings with this most recent patch is purely on the PETSc side.
Specifically:
- Back in December, when we introduced "VariableGroups", there was an appreciable
memory savings in the DofObject from reducing the number of variable indices we
store. The forward plan is to also optimize the DofMap for the VariableGroups
case, where sparsity pattern generation could be more efficient. However, this
is not done yet; we still generate the sparsity pattern and constraints at the
per-DOF level.
- This most recent change boils down to just this, in petsc_matrix.C:
#ifdef LIBMESH_ENABLE_BLOCKED_STORAGE
  if (blocksize > 1)
    {
      // specified blocksize, bs>1.
      // double check sizes.
      libmesh_assert_equal_to (m_local  % blocksize, 0);
      libmesh_assert_equal_to (n_local  % blocksize, 0);
      libmesh_assert_equal_to (m_global % blocksize, 0);
      libmesh_assert_equal_to (n_global % blocksize, 0);

      // MATBAIJ automatically chooses seqbaij or mpibaij
      ierr = MatSetType(_mat, MATBAIJ);
      LIBMESH_CHKERRABORT(ierr);

      ierr = MatSetBlockSize(_mat, blocksize);
      LIBMESH_CHKERRABORT(ierr);

      // transform the per-entry n_nz and n_oz arrays into their block counterparts.
      std::vector<numeric_index_type> b_n_nz, b_n_oz;

      transform_preallocation_arrays (blocksize,
                                      n_nz,   n_oz,
                                      b_n_nz, b_n_oz);

      ierr = MatSeqBAIJSetPreallocation (_mat, blocksize, 0,
                                         (PetscInt*)(b_n_nz.empty() ? NULL : &b_n_nz[0]));
      LIBMESH_CHKERRABORT(ierr);

      ierr = MatMPIBAIJSetPreallocation (_mat, blocksize,
                                         0, (PetscInt*)(b_n_nz.empty() ? NULL : &b_n_nz[0]),
                                         0, (PetscInt*)(b_n_oz.empty() ? NULL : &b_n_oz[0]));
      LIBMESH_CHKERRABORT(ierr);
    }
  else
#endif
The function transform_preallocation_arrays() is defined in that file; it simply
takes the usual per-DOF n_nz and n_oz arrays that libMesh computes and transforms
them into their per-block counterparts.
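Roughly, the idea is something like the following sketch (simplified types, not
the literal petsc_matrix.C code), which assumes every scalar row inside a block
has the same on- and off-processor nonzero counts:

  #include <cassert>
  #include <vector>

  // Simplified sketch: collapse per-DOF preallocation counts into per-block
  // counts, assuming all scalar rows within a block share the same sparsity.
  // (libMesh uses numeric_index_type; plain int is used here to stay standalone.)
  void transform_preallocation_arrays (const int blocksize,
                                       const std::vector<int> &n_nz,
                                       const std::vector<int> &n_oz,
                                       std::vector<int>       &b_n_nz,
                                       std::vector<int>       &b_n_oz)
  {
    assert (n_nz.size() == n_oz.size());
    assert (n_nz.size() % blocksize == 0);

    b_n_nz.clear();  b_n_nz.reserve (n_nz.size()/blocksize);
    b_n_oz.clear();  b_n_oz.reserve (n_oz.size()/blocksize);

    // One block row per 'blocksize' scalar rows; a per-DOF count of
    // blocksize*k scalar nonzeros becomes k block nonzeros.
    for (std::size_t nn=0; nn<n_nz.size(); nn+=blocksize)
      {
        b_n_nz.push_back (n_nz[nn]/blocksize);
        b_n_oz.push_back (n_oz[nn]/blocksize);
      }
  }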
So you see, the most recent memory savings is almost entirely on the PETSc side.
Try running with the -log_summary command line option too…
For a sparse matrix, PETSc must store its graph as well as the coefficients.
Without blocked DOFs, the size of the graph is the same as the number of
nonzeros in the matrix. With blocked DOFs, each graph entry represents an NB*NB
dense block. So there is a memory savings there for sure, but at some point
you'll still be dominated by storing the coefficients. Maybe that's already
happening in your case?
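To put rough numbers on that (my own back-of-the-envelope estimate, assuming
4-byte column indices and 8-byte scalar values, and ignoring the row-offset
arrays):

  #include <cstdio>

  int main ()
  {
    const double nnz = 1.0e8;  // scalar nonzeros in the matrix (example value)
    const double bs  = 5.0;    // block size NB

    const double values     = 8.0*nnz;          // coefficients, identical for AIJ and BAIJ
    const double aij_graph  = 4.0*nnz;          // AIJ: one column index per scalar nonzero
    const double baij_graph = 4.0*nnz/(bs*bs);  // BAIJ: one column index per NB*NB block

    std::printf ("coefficients : %8.2f MB\n", values/1.e6);
    std::printf ("AIJ graph    : %8.2f MB\n", aij_graph/1.e6);
    std::printf ("BAIJ graph   : %8.2f MB\n", baij_graph/1.e6);
    return 0;
  }

With NB=5 the graph shrinks by a factor of 25, but the 8*nnz bytes of
coefficients are untouched, which is why they eventually dominate.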
The other thing this allows for is more efficient element matrix insertion.
Rather than traversing the graph at the scalar entry level, it can be done at
the block index level. This is accessed via libMesh using
jacobian->add_block_matrix (dRdUe, blocked_dof_indices);
I haven't advertised this much yet…
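For completeness, here is a rough (untested) fragment showing how that call
might sit inside an existing assembly loop; jacobian, dRdUe, dof_indices and
blocksize are assumed to exist already, and the block-index construction below
assumes the scalar DOFs of each block are numbered contiguously:

  // Assumed context: 'jacobian' is a SparseMatrix<Number>*, 'dRdUe' the usual
  // (scalar-sized) element Jacobian, 'dof_indices' the element's scalar DOF
  // indices, and 'blocksize' the number of variables per node.
  std::vector<numeric_index_type> blocked_dof_indices;
  blocked_dof_indices.reserve (dof_indices.size()/blocksize);

  // Assuming every 'blocksize' consecutive scalar DOFs form one block,
  // the block index is the first scalar index divided by the block size.
  for (std::size_t i=0; i<dof_indices.size(); i+=blocksize)
    blocked_dof_indices.push_back (dof_indices[i]/blocksize);

  // Insertion now traverses the matrix graph one block at a time rather
  // than one scalar entry at a time.
  jacobian->add_block_matrix (dRdUe, blocked_dof_indices);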