On Feb 21, 2012, at 4:30 PM, Jed Brown wrote:

> On Tue, Feb 21, 2012 at 15:37, Barry Smith <bsmith at mcs.anl.gov> wrote:
> Why not just have MatAssemblyBegin_Nest() call the inner
> MatAssemblyBegin/End() together and stop the charade that there is any
> overlap of communication and computations etc anyway?
>
> (I pushed this.)
>
> So there is a very real latency issue in matrix assembly that comes from the
> reduction to determine how many receives are necessary. Due to MPI
> limitations, that code (PetscGatherNumberOfMessages() and
> PetscGatherMessageLengths()) is synchronizing, but MPI-3 will offer
> non-blocking collectives

Oh, you mean using pthreads to spawn a thread that waits at a blocking collective? :-) BTW, do you really think MPI-3 will exist? Or that it should exist?

   Barry

> that we could use for those operations. Now the two entrance points
> (MatAssemblyBegin() and MatAssemblyEnd()) are not sufficient to make progress
> on this task of assembly without also having an internal request system
> (where either a comm thread or callbacks from other library functions poked
> the progress along).
>
> There are also signs that sometime soon it will be common to have a comm
> thread that manages packing, in which case communication could actually start
> happening concurrently with computation.
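
For readers following the thread, the "reduction to determine how many receives are necessary" can be illustrated with a small serial stand-in. This is only a sketch under stated assumptions, not the code in PetscGatherNumberOfMessages(): the function name count_receives and the dense flag matrix are hypothetical, and the loop over simulated ranks replaces what would be a collective over all processes in a real MPI program.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Serial stand-in for the receive-count reduction (hypothetical helper,
 * not PETSc's implementation).
 *
 * sends[s * nranks + r] is nonzero when simulated rank s has a message
 * for rank r. On return, nrecvs[r] is the number of messages that rank r
 * must expect. Each entry of nrecvs is a sum over ALL senders, which is
 * why the real operation is a collective: no rank knows its receive
 * count until every rank has contributed its flags.
 */
void count_receives(const int *sends, size_t nranks, int *nrecvs)
{
  size_t s, r;

  assert(sends != NULL && nrecvs != NULL);

  /* Start every receive count at zero. */
  for (r = 0; r < nranks; r++) nrecvs[r] = 0;

  /* Column sum of the flag matrix: one increment per (sender, receiver) pair. */
  for (s = 0; s < nranks; s++) {
    for (r = 0; r < nranks; r++) {
      if (sends[s * nranks + r]) nrecvs[r]++;
    }
  }
}
```

With three simulated ranks where rank 0 sends to ranks 1 and 2 and rank 1 sends to rank 2, the receive counts come out as 0, 1 and 2. The dependence of each count on every sender's flags is the synchronization point discussed above; a non-blocking collective would let that sum proceed while each rank continues with local work.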
