[OMPI users] Send Recv Bandwidth

2013-03-11 Thread Nilesh Mahajan
Hi, I was comparing a simple Send-Recv program to another program that does two memcpy's to/from shared memory. I used 2 processes and array sizes from 10^6 to 10^8 doubles on IA64. With the --mca btl sm,self options I get almost twice the bandwidth compared to the two memcpy's. I

Re: [OMPI users] Shared Memory Collectives

2011-12-19 Thread Nilesh Mahajan
Hi, I am trying to implement the following collectives in MPI shared memory: Alltoall, Broadcast, and Reduce, with zero-copy optimizations. So for Reduce, my compiler allocates all the send buffers in shared memory (mmap anonymous), and allocates only one receive buffer, again in shared memory. Then all the