Prentice Bisbal wrote:
Jeff Squyres wrote:
  
On Feb 2, 2009, at 4:48 PM, Prentice Bisbal wrote:
    
No. I was running just a simple "Hello, world" program to test v1.3 when
these errors occurred. And as soon as I reverted to 1.2.8, the errors
disappeared.
      
FWIW, OMPI allocates shared memory based on the number of peers on the
host.  This allocation is during MPI_INIT, not during the first
MPI_SEND/MPI_RECV call.  So even if you're running "hello world", you
could still be running out of shared memory space.
    
Thanks for the info. Can you define peers for me?
  
The number of MPI processes running on a shared-memory node.

Let's say the number of MPI processes in your job on a shared-memory node is n.  If n=1, there is no use for shared memory and no file will be created.  But if n>=2, then the file size depends on some MCA parameters.  The size is something like

size = mpool_sm_per_peer_size * n
if ( size < mpool_sm_min_size ) size = mpool_sm_min_size
if ( size > mpool_sm_max_size ) size = mpool_sm_max_size

The defaults are per_peer=32MB, min=128MB, and max=512MB (I think).
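
To make the arithmetic concrete, here is a rough Python sketch of the clamping described above. It is only an illustration, not OMPI's actual code: the function name is made up, and the default values are the ones quoted above, which are themselves stated from memory.

```python
MB = 1024 * 1024

def estimate_sm_file_size(n, per_peer=32 * MB, min_size=128 * MB, max_size=512 * MB):
    """Estimate the shared-memory file size in bytes for n processes on one node.

    Hypothetical helper: the defaults mirror the values quoted in this thread
    and are assumptions, not values read from an Open MPI installation.
    """
    if n < 2:
        # A single process has no on-node peers, so no file is created.
        return 0
    size = per_peer * n
    # Clamp the result into the range [min_size, max_size].
    return max(min_size, min(size, max_size))

for n in (1, 2, 8, 32):
    print(n, "processes:", estimate_sm_file_size(n) // MB, "MB")
```

With those assumed defaults, anything up to 4 processes per node lands on the 128MB minimum, and 16 or more processes hit the 512MB maximum.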

I guess the question is whether a file of hundreds of MB can be created in /tmp on the nodes where you're running.
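
One quick way to check that on a node is a small Python sketch like the one below. It assumes the file would land in the system temp directory (usually /tmp), which may not match where your OMPI build puts it, and it uses 512MB only as an example figure. The helper name is made up for illustration.

```python
import shutil
import tempfile

def has_room(path, needed_bytes):
    """Return True if the filesystem holding `path` has at least needed_bytes free."""
    return shutil.disk_usage(path).free >= needed_bytes

# Assumption: the shared-memory file is created under the system temp directory.
tmp_dir = tempfile.gettempdir()
needed = 512 * 1024 * 1024  # example size only

free_mb = shutil.disk_usage(tmp_dir).free // (1024 * 1024)
print(tmp_dir, "has", free_mb, "MB free; enough room:", has_room(tmp_dir, needed))
```

Note that this only reports free space. It does not check write permissions or per-user quotas, which could also prevent the file from being created.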

