On May 31, 2013, at 4:04 AM, Lorenzo Alessio Botti <lorenzoalessiobo...@gmail.com> wrote:
> I don't know if it is the "correct" way to do it but the memory footprint has
> not changed.
> I'm also interested in knowing if the memory savings take place on the
> libMesh side or on the PETSc side, or on both sides.

That's what I would have done, for what it's worth.  The memory savings with this most recent patch are purely on the PETSc side.  Specifically:

- Back in December when we introduced "VariableGroups" there was an appreciable memory savings in the DofObject, achieved by reducing the number of variable indices we store.  The forward plan is to also optimize the DofMap for the case of VariableGroups, where sparsity pattern generation could be more efficient.  However, this is not done yet -- we still generate the sparsity pattern and constraints at the per-DOF level.

- This most recent change boils down to just this, in petsc_matrix.C:

#ifdef LIBMESH_ENABLE_BLOCKED_STORAGE
      if (blocksize > 1)
        {
          // specified blocksize, bs>1.
          // double check sizes.
          libmesh_assert_equal_to (m_local  % blocksize, 0);
          libmesh_assert_equal_to (n_local  % blocksize, 0);
          libmesh_assert_equal_to (m_global % blocksize, 0);
          libmesh_assert_equal_to (n_global % blocksize, 0);

          ierr = MatSetType(_mat, MATBAIJ); // Automatically chooses seqbaij or mpibaij
          LIBMESH_CHKERRABORT(ierr);

          ierr = MatSetBlockSize(_mat, blocksize);
          LIBMESH_CHKERRABORT(ierr);

          // transform the per-entry n_nz and n_oz arrays into their block counterparts.
          std::vector<numeric_index_type> b_n_nz, b_n_oz;
          transform_preallocation_arrays (blocksize,
                                          n_nz,   n_oz,
                                          b_n_nz, b_n_oz);

          ierr = MatSeqBAIJSetPreallocation (_mat, blocksize, 0,
                                             (PetscInt*)(b_n_nz.empty() ? NULL : &b_n_nz[0]));
          LIBMESH_CHKERRABORT(ierr);

          ierr = MatMPIBAIJSetPreallocation (_mat, blocksize,
                                             0, (PetscInt*)(b_n_nz.empty() ? NULL : &b_n_nz[0]),
                                             0, (PetscInt*)(b_n_oz.empty() ? NULL : &b_n_oz[0]));
          LIBMESH_CHKERRABORT(ierr);
        }
      else
#endif

The function transform_preallocation_arrays() is defined in that file and simply takes the usual per-DOF n_nz, n_oz arrays computed by libMesh and collapses them into their per-block counterparts (a sketch of the idea is appended below).

So you see the most recent memory savings are almost entirely on the PETSc side.  Try running with the -log_summary command line option too…

For a sparse matrix, PETSc must store the graph of it as well.  Without blocked DOFs, the size of the graph is the same as the number of nonzeros in the matrix.  With blocked DOFs, each graph entry represents an NB*NB dense block, so the column-index storage shrinks by roughly a factor of NB^2 (e.g. with NB=5, one index is stored per 25 coefficients instead of one per coefficient).  So there is a memory savings there for sure, but at some point you'll still be dominated by storing the coefficients themselves.  Maybe that's already happening in your case?

The other thing this allows for is more efficient element matrix insertion.  Rather than traversing the graph at the scalar value level it can be done at the block index level.  This is accessed via libMesh using

  jacobian->add_block_matrix (dRdUe, blocked_dof_indices);

I haven't advertised this much yet… (a usage sketch is also appended below).
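In case it helps to see what that per-DOF to per-block transformation amounts to, here is a minimal sketch of the idea.  It is illustrative rather than a copy of the routine in petsc_matrix.C, and it assumes every DOF within a block shares the same sparsity pattern (which is what blocked storage implies):

  // Sketch: collapse per-DOF preallocation counts into per-block counts.
  // Assumes each group of 'blocksize' consecutive rows has identical sparsity.
  void transform_preallocation_arrays (const unsigned int blocksize,
                                       const std::vector<numeric_index_type> & n_nz,
                                       const std::vector<numeric_index_type> & n_oz,
                                       std::vector<numeric_index_type> & b_n_nz,
                                       std::vector<numeric_index_type> & b_n_oz)
  {
    libmesh_assert_equal_to (n_nz.size(), n_oz.size());
    libmesh_assert_equal_to (n_nz.size() % blocksize, 0);

    b_n_nz.clear();  b_n_nz.reserve (n_nz.size()/blocksize);
    b_n_oz.clear();  b_n_oz.reserve (n_oz.size()/blocksize);

    // Take one representative DOF per block; its per-DOF column count divided
    // by the blocksize gives the number of block columns in that block row.
    for (std::size_t nn=0; nn<n_nz.size(); nn += blocksize)
      {
        b_n_nz.push_back (n_nz[nn]/blocksize);
        b_n_oz.push_back (n_oz[nn]/blocksize);
      }
  }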
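And for completeness, here is roughly how I'd expect add_block_matrix() to be driven from an element loop.  This is only a sketch: dRdUe and blocked_dof_indices are just the names from the line above, and the index construction assumes all variables belong to a single VariableGroup so that each node's DOFs form one contiguous blocksize-wide block (block index = DOF index / blocksize); the element Jacobian has to be ordered consistently with those blocks.

  // Illustrative sketch only -- not taken verbatim from libMesh.
  std::vector<dof_id_type> dof_indices;
  dof_map.dof_indices (elem, dof_indices);

  // One block per node: pick a representative DOF from each block and
  // convert it to a block index.
  std::vector<numeric_index_type> blocked_dof_indices;
  for (std::size_t i=0; i<dof_indices.size(); i += blocksize)
    blocked_dof_indices.push_back (dof_indices[i] / blocksize);

  // dRdUe is the dense element Jacobian, stored block-contiguously.
  jacobian->add_block_matrix (dRdUe, blocked_dof_indices);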