It looks like the buffering operations consume about 15% as much time as the
allreduce operations.  Not huge, but not trivial, all the same.  Is there
any way to avoid the buffering step?
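
The one idea I've had so far (just a sketch, and it assumes that the first
three declared extents of array and phim are exactly im, jm, and km, so that
the slice array(:,:,:,nl,0,ng) is a single contiguous block, and that
mpi_allreduce is called through the usual mpif.h-style implicit interface):
pass the slices straight to mpi_allreduce and drop the copy loops entirely,

    ! Sketch only: valid if array(:,:,:,nl,0,ng) and phim(:,:,:,nl,0,ng) are
    ! contiguous, i.e. their first three declared extents are im, jm, km.
    ! Passing the first element hands MPI the address of that block, so no
    ! separate pack/unpack buffers are needed.
    call mpi_allreduce(array(1,1,1,nl,0,ng), phim(1,1,1,nl,0,ng), &
                       im*jm*kmloc(coords(2)+1), mpi_real, mpi_sum, &
                       ang_com, ierr)

If the slice is not actually contiguous, a derived datatype built with
MPI_Type_create_subarray could describe the layout instead, though the copy
would then just happen inside the library, so it may not save much.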



On Thu, Sep 24, 2009 at 6:03 PM, Eugene Loh <eugene....@sun.com> wrote:

>  Greg Fischer wrote:
>
> (I apologize in advance for the simplistic/newbie question.)
>
> I'm performing an ALLREDUCE operation on a multi-dimensional array.  This
> operation is the biggest bottleneck in the code, and I'm wondering if
> there's a way to do it more efficiently than what I'm doing now.  Here's a
> representative example of what's happening:
>
>    ir=1
>    do ikl=1,km
>      do ij=1,jm
>        do ii=1,im
>          albuf(ir)=array(ii,ij,ikl,nl,0,ng)
>          ir=ir+1
>        enddo
>      enddo
>    enddo
>    agbuf=0.0
>    call mpi_allreduce(albuf,agbuf,im*jm*kmloc(coords(2)+1),mpi_real,mpi_sum,ang_com,ierr)
>    ir=1
>    do ikl=1,km
>      do ij=1,jm
>        do ii=1,im
>          phim(ii,ij,ikl,nl,0,ng)=agbuf(ir)
>          ir=ir+1
>        enddo
>      enddo
>    enddo
>
> Is there any way to just do this in one fell swoop, rather than buffering,
> transmitting, and unbuffering?  This operation is looped over many times.
> Are there savings to be had here?
>
> There are three steps here:  buffering, transmitting, and unbuffering.  Any
> idea how the run time is distributed among those three steps?  E.g., if most
> of the time is spent in the MPI call, then combining all three steps into
> one is unlikely to buy you much... and might even hurt; in that case, the
> more promising route may be some tuning of the collective algorithms.  I
> don't have any experience doing that with OMPI.  I'm just saying it makes
> sense to isolate the problem a little bit more.
>
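
On the suggestion above about isolating where the time goes: the breakdown at
the top of this message is the kind of thing you get by bracketing each step
with mpi_wtime().  A minimal sketch (the t0..t3 and accumulator variables are
mine, not from the code):

    double precision :: t0, t1, t2, t3
    double precision, save :: tpack = 0.0d0, treduce = 0.0d0

    t0 = mpi_wtime()
    ! ... pack loop filling albuf goes here ...
    t1 = mpi_wtime()
    call mpi_allreduce(albuf,agbuf,im*jm*kmloc(coords(2)+1),mpi_real,mpi_sum,ang_com,ierr)
    t2 = mpi_wtime()
    ! ... unpack loop filling phim goes here ...
    t3 = mpi_wtime()
    tpack   = tpack   + (t1-t0) + (t3-t2)   ! buffering + unbuffering
    treduce = treduce + (t2-t1)             ! the allreduce itself

Accumulating tpack and treduce over all the iterations and printing them once
at the end gives a per-step split like the one mentioned at the top.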
