Christopher Samuel wrote:
> Subject: Re: [OMPI devel] Very poor performance with btl sm on twin
>          nehalem servers with Mellanox ConnectX installed
> To: de...@open-mpi.org
> Message-ID:
>   <d45958078cd65c429557b4c5f492b6a60770e...@is-ex-bev3.unimelb.edu.au>
> Content-Type: text/plain; charset="iso-8859-1"
>
> On 13/05/10 20:56, Oskar Enoksson wrote:
>
>> The problem is that I get very bad performance unless I
>> explicitly exclude the "sm" btl and I can't figure out why.
>
> Recently someone reported issues which were traced back to
> the fact that the files that sm uses for mmap() were in a
> /tmp which was NFS mounted; changing the location where their
> files were kept to another directory with the orte_tmpdir_base
> MCA parameter fixed that issue for them.
>
> Could it be similar for yourself ?
>
> cheers,
> Chris

That was exactly right. As you guessed, these are diskless nodes that mount the root filesystem over NFS.
With orte_tmpdir_base set to /dev/shm and btl_sm_num_fifos=9, mpi_stress on eight cores now measures 1650 MB/s for 1 MB messages and 1600 MB/s for 10 kB messages.

Thanks!
/Oskar
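
For anyone else hitting this, here is a rough sketch of how to check whether a directory such as /tmp sits on NFS before pointing orte_tmpdir_base somewhere else. It is only an illustration: the helper names fs_type_for and is_nfs are made up for this example, the sample mount table is invented rather than copied from these nodes, and the parser assumes the usual Linux /proc/mounts layout (device, mount point, type, ...) with no escaped spaces in mount points.

```python
import os


def fs_type_for(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path.

    mounts_text is expected to look like /proc/mounts:
    "<device> <mount point> <fs type> <options> ...", one mount per line.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Prefer the longest match; on ties the later entry wins,
            # since later mounts shadow earlier ones.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def is_nfs(fs_type):
    """True for 'nfs', 'nfs4' and similar type names."""
    return fs_type is not None and fs_type.startswith("nfs")


# Invented mount table resembling a diskless node with an NFS root.
SAMPLE = """\
server:/export/root / nfs rw 0 0
tmpfs /dev/shm tmpfs rw 0 0
"""

for candidate in ("/tmp", "/dev/shm"):
    kind = fs_type_for(candidate, SAMPLE)
    note = "NFS, avoid for sm backing files" if is_nfs(kind) else "local"
    print(f"{candidate}: {kind} ({note})")
```

On a real node, pass the contents of /proc/mounts instead of SAMPLE. If /tmp turns out to be NFS, the parameters can be given on the command line in the usual way, for example: mpirun --mca orte_tmpdir_base /dev/shm --mca btl_sm_num_fifos 9 -np 8 ./mpi_stress (the path to the mpi_stress binary is a placeholder).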