Matthew Knepley <[email protected]> writes:
>>   In theory, with this one could write very efficient parallel 3d Poisson
>> solvers in PETSc for boxes, which is an important special case that PETSc
>> does not currently support.
>
>
> Wouldn't you just use MG?

FFT is O(n log n) and O(n/p (log n) (log p)) parallel complexity while
MG is O(n) and O(n/p log p + log^2 n).  The presence of the (log n)^2
term is one reason some people like to say that MG won't scale, but FFT
or FMM will (Edelman's 1997 paper forecasts that FFT will eventually be
replaced by FMM).

FMG for a second order discretization is okay with a Cheby/Jacobi 2,1
cycle, so 4 visits to the fine grid and less than 5 work units in total
(you can do slightly better with GS).  If the per-process size is large,
fine-grid memory bandwidth to apply the stencil operations is the
primary cost, followed by a couple more visits of vector work, mostly
because fusing operations is inconvenient.  How does this compare with
the log n term in FFT complexity?

Attachment: pgpIJE2o9fnkL.pgp
Description: PGP signature

Reply via email to