Date: Fri, 07 Aug 2009 07:12:45 -0600
From: Craig Tierney <craig.tier...@noaa.gov>
Subject: Re: [OMPI users] Performance question about OpenMPI and
        MVAPICH2 on IB
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <4a7c284d.3040...@noaa.gov>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Terry Dontje wrote:
> Craig,
>
> Did your affinity script bind the processes per socket or linearly to
> cores?  If the former you'll want to look at using rankfiles and place
> the ranks based on sockets.  We have found this especially useful if
> you are not running fully subscribed on your machines.
>

The script binds them to sockets and also binds memory per node.
It is smart enough that if the machine_file does not use all
the cores (because the user reordered them), it will lay the
tasks out evenly between the two sockets.
Ok, so you'll probably want to look at using rankfile (described in the
mpirun manpage), because mpi_paffinity_alone just does a linear binding
(rank 0 to cpu 0, rank 1 to cpu 1, ...).
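For example, a minimal rankfile that alternates ranks between the two
sockets of a node might look like the following (the hostname node01
and the socket:core numbering are illustrative assumptions, not values
from your cluster):

    rank 0=node01 slot=0:0
    rank 1=node01 slot=1:0
    rank 2=node01 slot=0:1
    rank 3=node01 slot=1:1

You would then point mpirun at it with something like:

    mpirun -np 4 -rf myrankfile ./a.out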

> Also, if you think the main issue is collectives performance you may
> want to try using the hierarchical and SM collectives.  However, be
> forewarned we are right now trying to pound out some errors with these
> modules.  To enable them you add the following parameters "--mca
> coll_hierarch_priority 100 --mca coll_sm_priority 100".  We would be
> very interested in any results you get (failures, improvements,
> non-improvements).
>
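(For reference, a complete invocation with those parameters might look
like the following; the process count, machine file, and application
name are placeholders, not values from this thread:)

    mpirun --mca coll_hierarch_priority 100 --mca coll_sm_priority 100 \
        -np 16 -machinefile hosts ./my_app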

I don't know why it is slow.  OpenMPI is so flexible in how the
stack can be tuned.  But I also have 100s of users running dozens
of major codes, and what I need is a set of options that 'just work'
in most cases.

I will try the above options and get back to you.

Ok, thanks.

--td
