An update on this. We have been using the mark/gamg-zerod branch and it
fixes the issue we had with zero diagonals in the coarsened operators
(which made the SOR smoother fail, but bizarrely not the Jacobi smoother).
We in fact have some cases where cheby+jacobi does not converge
(indefinite pc), but cheby+sor (with the mark/gamg-zerod branch) works
well, so we'd be very much interested in getting this (or something
similar) merged into master. Maybe the lv[i-Istart]==0.0 check isn't
entirely robust? We'd be happy to contribute.
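
To be concrete, the kind of guard we have in mind looks roughly like the
sketch below. This is not the actual code from the mark/gamg-zerod branch,
just an illustration of the idea; the function name FixZeroDiagonals and
the replacement value 1.0 are ours.

#include <petscmat.h>

/* Sketch only: replace exact-zero diagonal entries of a (coarse) operator
   before it is handed to the SOR smoother.  The exact-zero test mirrors the
   lv[i-Istart]==0.0 check mentioned above; a tolerance relative to the row
   norm might be more robust. */
static PetscErrorCode FixZeroDiagonals(Mat A)
{
  Vec            diag;
  PetscScalar   *lv;
  PetscInt       i, Istart, Iend;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreateVecs(A, &diag, NULL);CHKERRQ(ierr);
  ierr = MatGetDiagonal(A, diag);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
  ierr = VecGetArray(diag, &lv);CHKERRQ(ierr);
  for (i = Istart; i < Iend; i++) {
    if (lv[i - Istart] == 0.0) lv[i - Istart] = 1.0;  /* arbitrary nonzero */
  }
  ierr = VecRestoreArray(diag, &lv);CHKERRQ(ierr);
  ierr = MatDiagonalSet(A, diag, INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecDestroy(&diag);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}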
As an aside, we also changed the ordering of DOFs as suggested, so that
we provide the right block structure to gamg. However, as soon as we
actually set the block size (MatSetBlockSizes) the convergence
deteriorates substantially (going from ~50 to ~650 iterations). Without
setting the block size, but with the new ordering, the number of iterations
is roughly the same as before (when our DOFs were not interlaced). Any idea
what might be going wrong?
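
For reference, what we are doing looks roughly like the sketch below. This
is a simplified illustration rather than our actual code: the block size of
3 and the rigid-body near null space (built from nodal coordinates) are just
the standard choices one would make here; the surrounding setup in our code
is more involved.

#include <petscmat.h>

/* Sketch only: interlaced velocity DOFs (u0,v0,w0,u1,v1,w1,...), block size
   set explicitly, and a near null space attached for smoothed aggregation.
   Commenting the MatSetBlockSizes call in or out is what moves us between
   ~50 and ~650 iterations. */
PetscErrorCode SetupVelocityOperator(Mat A, Vec coords)
{
  MatNullSpace   nearnull;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatSetBlockSizes(A, 3, 3);CHKERRQ(ierr);

  /* Rigid-body modes built from the nodal coordinates, so that smoothed
     aggregation can use them when constructing the prolongator. */
  ierr = MatNullSpaceCreateRigidBody(coords, &nearnull);CHKERRQ(ierr);
  ierr = MatSetNearNullSpace(A, nearnull);CHKERRQ(ierr);
  ierr = MatNullSpaceDestroy(&nearnull);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}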
Cheers
Stephan
On 01/04/14 19:17, Jed Brown wrote:
Stephan Kramer <[email protected]> writes:
Yes indeed. I've come to realize this now by looking into how smoothed
aggregation with a near null space actually works. We currently have
our DOFs numbered the wrong way around (vertices on the inside, velocity
component on the outside, which made sense for other equations we solve
with the model), so it will take a bit of work, but it might well be worth
the effort.
The memory streaming and cache reuse is much better if you interlace the
degrees of freedom. This is as true now as it was at the time of the
PETSc-FUN3D papers. When evaluating the "physics", it can be useful to
pack the interlaced degrees of freedom into a vector-friendly ordering.
The AMG solve is expensive enough that you can pack/solve/unpack an
interlaced vector at negligible cost without changing the rest of your
code.
Mark, should we provide some more flexible way to label "fields"? It
will be more complicated than the present code and I think packing into
interlaced format is faster anyway.