On Wed, Aug 20, 2014 at 4:17 PM, Jed Brown <[email protected]> wrote:
> Matthew Knepley <[email protected]> writes: > >> In theory, with this one could write very efficient parallel 3d > Poisson > >> solvers in PETSc for boxes, which is an important special case that > PETSc > >> does not currently support. > > > > > > Wouldn't you just use MG? > > FFT is O(n log n) and O(n/p (log n) (log p)) parallel complexity while > MG is O(n) and O(n/p log p + log^2 n). The presence of the (log n)^2 > term is one reason some people like to say that MG won't scale, but FFT > or FMM will (Edelman's 1997 paper forecasts that FFT will eventually be > replaced by FMM). > > FMG for a second order discretization is okay with a Cheby/Jacobi 2,1 > cycle, so 4 visits to the fine grid and less than 5 work units in total > (you can do slightly better with GS). If the per-process size is large, > fine-grid memory bandwidth to apply the stencil operations is the > primary cost, followed by a couple more visits of vector work, mostly > because fusing operations is inconvenient. How does this compare with > the log n term in FFT complexity? > I was thinking that the 3x would dissolve pretty quickly for the problem sizes being talked about in the blurb using the analysis above, but I guess we need to check. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener
