On Fri, Oct 7, 2016 at 10:30 PM, Jed Brown <[email protected]> wrote:

> Barry Smith <[email protected]> writes:
> >     There is still something wonky here, whether it is the MPI
> > implementation or how PETSc handles the assembly. Without any values that
> > need to be communicated, it is unacceptable that these calls take so
> > long. If we understood __exactly__ why the performance suddenly drops so
> > dramatically, we could perhaps fix it. I do not understand why.
>
> I guess it's worth timing.  If they don't have MPI_Reduce_scatter_block
> then it falls back to a big MPI_Allreduce.  After that, it's all
> point-to-point messaging that shouldn't suck and there actually
> shouldn't be anything to send or receive anyway.  The BTS implementation
> should be much smarter; in this case it literally reduces to a barrier.
>

Hi Jed,
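
Just to check my understanding of the fallback you describe: without
MPI_Reduce_scatter_block, do the receive counts come from a size-length
MPI_Allreduce over a 0/1 send-flag array? A minimal sketch of that pattern
(my guess at the idea, not PETSc's actual code):

    /* Count how many point-to-point messages each rank should expect,
       using the Allreduce fallback. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
      int rank, size, nrecvs;
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);

      /* sendto[r] = 1 if this rank will send a message to rank r.
         In the case Barry describes there are no off-process values,
         so it stays all zero -- yet the Allreduce still costs O(size). */
      int *sendto = calloc(size, sizeof(int));
      int *counts = malloc(size * sizeof(int));

      MPI_Allreduce(sendto, counts, size, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
      nrecvs = counts[rank]; /* messages this rank should expect */

      printf("[%d] expecting %d messages\n", rank, nrecvs);
      free(sendto); free(counts);
      MPI_Finalize();
      return 0;
    }

With MPI_Reduce_scatter_block available, the same count would come from a
single MPI_Reduce_scatter_block(sendto, &nrecvs, 1, MPI_INT, MPI_SUM,
MPI_COMM_WORLD), without materializing the full counts array on every rank.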

How do we use the BTS implementation for Vec? For Mat, can we just use
"-matstash_bts"?

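Something like this, I assume (the command-line form is my guess at the
intended usage; only the Mat option name appears in this thread):

    mpiexec -n 4 ./app -matstash_bts

or set before assembly through the global options database:

    PetscOptionsSetValue(NULL, "-matstash_bts", NULL);
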
Fande,
