Chao Yang <[email protected]> writes:

> The pipelined CG (or Gropp's CG) recently implemented in PETSc is very
> attractive since it has the ability of hiding the collective
> communication in vector dot product by overlapping it with the
> application of preconditioner and/or SpMV.
>
> However, there is an issue that may seriously degrade the
> performance. In the pipelined CG, the asynchronous MPI_Iallreduce is
> called before the application of preconditioner and/or SpMV, and then
> ended by MPI_Wait. In the application of preconditioner and/or SpMV,
> communication may also be required (such as halo updating), which I
> find is often slowed down by the unfinished MPI_Iallreduce in the
> background.
>
> As far as I know, the current MPI doesn't provide prioritized
> communication.
No, and there is not much interest in adding it because it adds
complication and tends to create starvation situations in which raising
the priority actually makes it slower.

> Therefore, it's highly possible that the performance of the pipelined
> CG may be even worse than a classic one due to the slowdown of
> preconditioner and SpMV. Is there a way to avoid this?

This is an MPI quality-of-implementation issue and there isn't much we
can do about it. There may be MPI tuning parameters that can help, but
the nature of these methods is that in exchange for creating
latency-tolerance in the reduction, it now overlaps the neighbor
communication in MatMult/PCApply.
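For readers following the thread, here is a minimal C/MPI sketch of the
pattern being discussed -- not PETSc's actual implementation. The routine
local_spmv_and_pc and the buffer/neighbor names are placeholders; the point
is only the ordering: the nonblocking reduction is started first, the halo
exchange and local SpMV/preconditioner work run while it is outstanding, and
MPI_Wait completes it afterwards.

    #include <mpi.h>
    #include <stddef.h>

    /* Placeholder for the local part of MatMult/PCApply. */
    static void local_spmv_and_pc(const double *x, double *y, size_t n)
    {
      for (size_t i = 0; i < n; i++) y[i] = x[i]; /* stand-in for real work */
    }

    static void pipelined_cg_step(MPI_Comm comm, double local_dot,
                                  const double *x, double *y, size_t n,
                                  int left, int right, double *ghost_send,
                                  double *ghost_recv, int nghost)
    {
      double      global_dot;
      MPI_Request red_req, halo_req[2];

      /* 1. Start the nonblocking reduction for the dot product. */
      MPI_Iallreduce(&local_dot, &global_dot, 1, MPI_DOUBLE, MPI_SUM, comm,
                     &red_req);

      /* 2. Halo exchange needed by the SpMV; this neighbor communication now
            shares the network and MPI progress engine with the outstanding
            MPI_Iallreduce, which is the interference described above. */
      MPI_Irecv(ghost_recv, nghost, MPI_DOUBLE, left, 0, comm, &halo_req[0]);
      MPI_Isend(ghost_send, nghost, MPI_DOUBLE, right, 0, comm, &halo_req[1]);
      MPI_Waitall(2, halo_req, MPI_STATUSES_IGNORE);

      /* 3. Local work overlapped with the reduction. */
      local_spmv_and_pc(x, y, n);

      /* 4. Finish the reduction; the result feeds the next CG update. */
      MPI_Wait(&red_req, MPI_STATUS_IGNORE);
      (void)global_dot; /* consumed by the vector updates in a real solver */
    }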
