I'm trying to get a better preallocation scheme. I noticed an earlier e-mail
<https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2010-September/006886.html>
mentioning that "for finite elements with a normal cell division or finite
differences with a normal vertex division, you need only count the nonzeros for
rows you own since the partitions overlap on the boundary." Unfortunately, this
doesn't quite work for me. Here's the latest attempt at a routine, with some
details skipped:
std::vector<PetscInt> d_nnz(nLocalRows), o_nnz(nLocalRows);
std::vector<std::set<PetscInt> > d_nzInds(nLocalRows), o_nzInds(nLocalRows);
for ( ... loop over elements ...) {
wrapNodeIds(*mesh, nIdVec); // Accounts for periodic boundary conditions
mesh->nodes2globalDOF(nIdVec, globalDOFsPerElem);
std::vector<PetscInt>::iterator gdBegin = globalDOFsPerElem.begin();
std::vector<PetscInt>::iterator gdEnd = globalDOFsPerElem.end();
for (std::vector<PetscInt>::iterator itrI = gdBegin; itrI != gdEnd; ++itrI) {
PetscInt I = *itrI;
if ((I >= minDOFId) && (I <= maxDOFId)) {
PetscInt Ioffset = I - minDOFId;
for (std::vector<PetscInt>::iterator itrJ = gdBegin; itrJ != gdEnd;
++itrJ) {
PetscInt J = *itrJ;
if ((J >= minDOFId) && (J <= maxDOFId)) {
// J will not be inserted if it is already in d_nzInds[Ioffset].
This avoids duplicates.
d_nzInds[Ioffset].insert(J);
}
else {
o_nzInds[Ioffset].insert(J);
}
}
}
}
for (PetscInt i = 0; i < nLocalRows; ++i) {
d_nnz[i] = d_nzInds[i].size();
o_nnz[i] = o_nzInds[i].size();
}
For nodes in the interior, this seems to work well, but at the boundaries of
the domain, the number of non-zeros is underestimated. It seems like parallel
communication is unavoidable here, but I'm not sure how to do it, and I'm
unsure if I really want to mess with relatively low-level MPI routines to get
what I want.
Any ideas?