On Fri, 23 Apr 2010 at 11:29:53, George Bosilca wrote:
If you use any kind of high-performance network that requires memory
registration for communications, then this high cost for
MPI_Alloc_mem will be hidden by the communications. However, the
MPI_Alloc_mem function seems horribly complicated to me, as we do the
whole "find-the-right-allocator" step every time instead of caching
it. While this might be improved, I'm pretty sure most of the
overhead comes from the registration itself.
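To be concrete about caching the lookup, I mean something along these
lines (made-up names, just a sketch, not the actual Open MPI allocator
framework, and ignoring thread safety):

#include <stdlib.h>

typedef struct {
    void *(*alloc)(size_t size);
    void  (*release)(void *ptr);
} allocator_t;

/* Stand-in for the expensive "find-the-right-allocator" step. */
static allocator_t *find_allocator(void)
{
    static allocator_t plain = { malloc, free };
    return &plain;
}

static allocator_t *cached_allocator = NULL;

/* Pay the lookup cost once, then reuse the result on every call. */
void *alloc_mem_cached(size_t size)
{
    if (NULL == cached_allocator) {
        cached_allocator = find_allocator();
    }
    return cached_allocator->alloc(size);
}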
The MPI_Alloc_mem function allocates the memory and then registers it
with the high-speed interconnect (InfiniBand, for example). If you
don't have IB, this should not happen. You can try forcing the
mpool to nothing, or disabling pinning
(mpi_leave_pinned=0, mpi_leave_pinned_pipeline=0) to see whether this
affects performance.
I have an IB cluster with 32-core nodes. A big part of my
communication goes through sm, so systematically registering buffers
with IB kills performance for nothing.
Following your tip, I disabled pinning (using "mpirun -mca
mpi_leave_pinned 0 -mca mpi_leave_pinned_pipeline 0").
The MPI_Alloc_mem/MPI_Free_mem cycle now takes 120 us, while
malloc/free takes 1 us.
In all cases, a program calling MPI_Sendrecv_replace() is heavily
penalized by these calls to MPI_Alloc_mem/MPI_Free_mem.
That's why I propose going back to the malloc/free scheme in this
routine.
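For reference, the pattern inside the routine is just a temporary
buffer around the exchange; the only question is how that buffer is
obtained. A conceptual sketch (written with public MPI calls, not the
actual Open MPI source) of where the allocation sits:

#include <mpi.h>
#include <stdlib.h>

/* Conceptual sketch of what the "replace" variant has to do: stage
 * the outgoing data in a scratch buffer so the user buffer can
 * receive the incoming message.  This is NOT the Open MPI source,
 * just the shape of it; the proposal only concerns how the scratch
 * buffer is obtained (malloc vs MPI_Alloc_mem). */
int sendrecv_replace_sketch(void *buf, int count, MPI_Datatype dtype,
                            int dest, int sendtag,
                            int source, int recvtag, MPI_Comm comm)
{
    int size, pos = 0, rc;
    void *tmp;

    MPI_Pack_size(count, dtype, comm, &size);

    tmp = malloc(size);              /* proposed: plain malloc ...     */
    /* MPI_Alloc_mem(size, MPI_INFO_NULL, &tmp);  ... instead of this  */
    if (NULL == tmp) {
        return MPI_ERR_NO_MEM;
    }

    MPI_Pack(buf, count, dtype, tmp, size, &pos, comm);

    /* Send the packed copy, receive directly into the user buffer. */
    rc = MPI_Sendrecv(tmp, pos, MPI_PACKED, dest, sendtag,
                      buf, count, dtype, source, recvtag,
                      comm, MPI_STATUS_IGNORE);

    free(tmp);                       /* proposed: plain free           */
    /* MPI_Free_mem(tmp); */
    return rc;
}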
Pascal
george.
On Apr 22, 2010, at 08:50 , Pascal Deveze wrote:
Hi all,
The sendrecv_replace in Open MPI seems to allocate/free memory with
MPI_Alloc_mem()/MPI_Free_mem().
I measured the time to allocate and free a 1 MB buffer:
MPI_Alloc_mem/MPI_Free_mem takes 350 us, while malloc/free takes only 8 us.
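A measurement of this kind boils down to a simple loop like the
following (simplified sketch, not my exact test code; the iteration
count is arbitrary):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE   (1024 * 1024)   /* 1 MB */
#define ITERATIONS 1000

int main(int argc, char **argv)
{
    void *buf;
    double t0, t1;
    int i;

    MPI_Init(&argc, &argv);

    /* Time the MPI_Alloc_mem / MPI_Free_mem cycle. */
    t0 = MPI_Wtime();
    for (i = 0; i < ITERATIONS; i++) {
        MPI_Alloc_mem(BUF_SIZE, MPI_INFO_NULL, &buf);
        MPI_Free_mem(buf);
    }
    t1 = MPI_Wtime();
    printf("MPI_Alloc_mem/MPI_Free_mem: %.1f us per cycle\n",
           (t1 - t0) * 1e6 / ITERATIONS);

    /* Time the malloc / free cycle for comparison. */
    t0 = MPI_Wtime();
    for (i = 0; i < ITERATIONS; i++) {
        buf = malloc(BUF_SIZE);
        free(buf);
    }
    t1 = MPI_Wtime();
    printf("malloc/free: %.1f us per cycle\n",
           (t1 - t0) * 1e6 / ITERATIONS);

    MPI_Finalize();
    return 0;
}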
malloc/free in ompi/mpi/c/sendrecv_replace.c was replaced by
MPI_Alloc_mem/MPI_Free_mem in this commit:
user: twoodall
date: Thu Sep 22 16:43:17 2005 0000
summary: use MPI_Alloc_mem/MPI_Free_mem for internally allocated
buffers
Is there a real reason to use these functions, or can we move back to
malloc/free?
Is there a problem with my configuration that explains such slow
performance with MPI_Alloc_mem?
Pascal