On Fri, 23 Apr 2010 at 11:29:53, George Bosilca wrote:

If you use any kind of high-performance network that requires memory registration for communications, then the high cost of MPI_Alloc_mem will be hidden by the communications. However, the MPI_Alloc_mem function seems horribly complicated to me, as we do the whole "find-the-right-allocator" step every time instead of caching it. While this might be improved, I'm pretty sure the major part of the overhead comes from the registration itself. The MPI_Alloc_mem function allocates the memory and then registers it with the high-speed interconnect (InfiniBand, for example). If you don't have IB, this should not happen. You can try to force the mpool to nothing, or disable the pinning (mpi_leave_pinned=0, mpi_leave_pinned_pipeline=0), to see whether this affects performance.
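For reference, the call pair under discussion is the standard MPI_Alloc_mem/MPI_Free_mem interface. A minimal sketch, assuming nothing beyond the standard API; on an RDMA-capable network the implementation may pin/register the pages with the NIC during the allocation, which is where the extra cost comes from:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    void *buf = NULL;
    /* Allocate 1 MB through MPI.  On an RDMA network the library may also
     * register these pages with the interconnect at this point. */
    MPI_Alloc_mem((MPI_Aint)(1 << 20), MPI_INFO_NULL, &buf);

    /* ... use buf as a communication buffer ... */

    /* Release (and, if it was registered, deregister) the memory. */
    MPI_Free_mem(buf);

    MPI_Finalize();
    return 0;
}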

I have an IB cluster with 32-core nodes. A big part of my communication goes through sm, so systematically registering buffers with IB kills performance for nothing. Following your tip, I disabled the pinning (using "mpirun -mca mpi_leave_pinned 0 -mca mpi_leave_pinned_pipeline 0"). The MPI_Alloc_mem/MPI_Free_mem cycle now takes 120 us, while malloc/free takes 1 us.

In all cases, a program calling MPI_Sendrecv_replace() is heavily penalized by these calls to MPI_Alloc_mem/MPI_Free_mem. That is why I proposed going back to the malloc/free scheme in this routine.
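To make the proposal concrete, here is a simplified sketch (not the actual Open MPI code) of the pattern inside a sendrecv_replace-style routine, restricted to a contiguous datatype. The only change being discussed is whether the temporary buffer comes from MPI_Alloc_mem/MPI_Free_mem or from malloc/free:

#include <mpi.h>
#include <stdlib.h>
#include <string.h>

/* Simplified sendrecv_replace for a contiguous datatype (sketch only). */
static int sendrecv_replace_sketch(void *buf, int count, MPI_Datatype type,
                                   int dest, int sendtag,
                                   int source, int recvtag, MPI_Comm comm)
{
    MPI_Aint lb, extent;
    MPI_Type_get_extent(type, &lb, &extent);
    size_t size = (size_t)count * (size_t)extent;

    /* Temporary copy of the outgoing data.  This is the allocation that
     * currently goes through MPI_Alloc_mem/MPI_Free_mem on every call. */
    void *tmp = malloc(size);
    if (tmp == NULL) return MPI_ERR_NO_MEM;
    memcpy(tmp, buf, size);

    /* Send the copy and receive the replacement into the user's buffer. */
    int rc = MPI_Sendrecv(tmp, count, type, dest, sendtag,
                          buf, count, type, source, recvtag,
                          comm, MPI_STATUS_IGNORE);

    free(tmp);
    return rc;
}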

Pascal
  george.

On Apr 22, 2010, at 08:50, Pascal Deveze wrote:

Hi all,

The sendrecv_replace implementation in Open MPI seems to allocate/free memory with MPI_Alloc_mem()/MPI_Free_mem().

I measured the time to allocate/free a buffer of 1 MB.
MPI_Alloc_mem/MPI_Free_mem take 350 us, while malloc/free take only 8 us.
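For reproducibility, here is a minimal sketch of this kind of measurement (the buffer size matches the test above; the iteration count and output format are just illustrative, not the exact benchmark used):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define BUF_SIZE (1 << 20)   /* 1 MB, as in the measurement above */
#define ITERATIONS 1000

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double t0, t1;
    void *buf;

    /* Time MPI_Alloc_mem/MPI_Free_mem cycles. */
    t0 = MPI_Wtime();
    for (int i = 0; i < ITERATIONS; i++) {
        MPI_Alloc_mem((MPI_Aint)BUF_SIZE, MPI_INFO_NULL, &buf);
        MPI_Free_mem(buf);
    }
    t1 = MPI_Wtime();
    printf("MPI_Alloc_mem/MPI_Free_mem: %.1f us per cycle\n",
           (t1 - t0) / ITERATIONS * 1e6);

    /* Time malloc/free cycles for comparison. */
    t0 = MPI_Wtime();
    for (int i = 0; i < ITERATIONS; i++) {
        buf = malloc(BUF_SIZE);
        free(buf);
    }
    t1 = MPI_Wtime();
    printf("malloc/free: %.1f us per cycle\n",
           (t1 - t0) / ITERATIONS * 1e6);

    MPI_Finalize();
    return 0;
}

The numbers will of course depend on the interconnect and on the mpi_leave_pinned settings discussed above.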

malloc/free in ompi/mpi/c/sendrecv_replace.c was replaced by MPI_Alloc_mem/MPI_Free_mem in this commit:

user:        twoodall
date:        Thu Sep 22 16:43:17 2005 0000
summary: use MPI_Alloc_mem/MPI_Free_mem for internally allocated buffers

Is there a real reason to use these functions, or can we move back to malloc/free? Is there a problem with my configuration that explains such slow performance with MPI_Alloc_mem?

Pascal
