Hi Terry,

I feel the hierarchical collectives are slower than the tuned ones. I
ran some collective-specific benchmarks in the past, and this is based
on what I observed then.

Regards

Neeraj Chourasia (MTS)
Computational Research Laboratories Ltd.
(A wholly Owned Subsidiary of TATA SONS Ltd)
B-101, ICC Trade Towers, Senapati Bapat Road
Pune 411016 (Mah) INDIA
(O) +91-20-6620 9863  (Fax) +91-20-6620 9862
M: +91.9225520634




Terry Dontje <terry.don...@sun.com> wrote on 08/07/2009 04:35 PM
(sent by users-boun...@open-mpi.org; please respond to
Open MPI Users <us...@open-mpi.org>):

To: us...@open-mpi.org
Subject: Re: [OMPI users] Performance question about OpenMPI and MVAPICH2 on IB






Craig,

Did your affinity script bind the processes per socket or linearly to
cores?  If the former, you'll want to look at using rankfiles and place the
ranks based on sockets.  We have found this especially useful if you are
not running fully subscribed on your machines.
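
A minimal sketch of what socket-based placement could look like, assuming
the "rank N=host slot=socket:core" rankfile syntax described in the Open MPI
mpirun man page. The host names (node01, node02), the 2-socket / 4-core node
layout, and the helper make_rankfile are hypothetical and only for
illustration. The script spreads the ranks of a half-subscribed node
round-robin across sockets instead of filling the first socket first:

```python
def make_rankfile(hosts, sockets_per_node, cores_per_socket, ranks_per_node):
    """Return rankfile text that spreads each node's ranks across sockets.

    Assumes the 'rank N=host slot=socket:core' syntax; all arguments
    describe a hypothetical cluster layout.
    """
    lines = []
    rank = 0
    for host in hosts:
        for local in range(ranks_per_node):
            socket = local % sockets_per_node   # round-robin over sockets
            core = local // sockets_per_node    # next unused core on that socket
            if core >= cores_per_socket:
                raise ValueError("more ranks per node than available cores")
            lines.append(f"rank {rank}={host} slot={socket}:{core}")
            rank += 1
    return "\n".join(lines) + "\n"


# Hypothetical layout: 2 nodes, 2 sockets x 4 cores each, 4 ranks per node.
rankfile_text = make_rankfile(["node01", "node02"], 2, 4, 4)
print(rankfile_text, end="")
```

The generated text could be saved to a file and handed to mpirun through its
rankfile option; check the mpirun man page of your Open MPI version for the
exact option name and slot syntax.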

Also, if you think the main issue is collectives performance, you may want
to try using the hierarchical and SM collectives.  However, be forewarned
that we are still working out some errors in these modules.  To enable
them, add the following parameters:

    --mca coll_hierarch_priority 100 --mca coll_sm_priority 100

We would be very interested in any results you get (failures,
improvements, non-improvements).

thanks,

--td

> Message: 4
> Date: Thu, 06 Aug 2009 17:03:08 -0600
> From: Craig Tierney <craig.tier...@noaa.gov>
> Subject: Re: [OMPI users] Performance question about OpenMPI and
>          MVAPICH2 on IB
> To: Open MPI Users <us...@open-mpi.org>
> Message-ID: <4a7b612c.8070...@noaa.gov>
> Content-Type: text/plain; charset=ISO-8859-1
>
> A followup....
>
> Part of the problem was affinity.  I had written a script to do processor
> and memory affinity (which works fine with MVAPICH2).  It is an
> idea that I got from TACC.  However, the script didn't seem to
> work correctly with OpenMPI (or I still have bugs).
>
> Setting --mca mpi_paffinity_alone 1 made things better.  However,
> the performance is still not as good:
>
> Cores   Mvapich2    Openmpi
> ---------------------------
>    8      17.3        17.3
>   16      31.7        31.5
>   32      62.9        62.8
>   64     110.8       108.0
>  128     219.2       201.4
>  256     384.5       342.7
>  512     687.2       537.6
>
> The performance numbers are in GFlops (so larger is better).
>
> The first few numbers show that the executable is the right
> speed.  I verified that IB is being used by using OMB and
> checking latency and bandwidth.  Those numbers are what I
> expect (3 GB/s bandwidth, 1.5 us latency for QDR).
>
> However, the Openmpi version is not scaling as well.  Any
> ideas on why that might be the case?
>
> Thanks,
> Craig
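
One way to quantify the scaling gap in the table above is to compute the
parallel efficiency relative to the 8-core run. This is a small sketch using
only the GFlops figures quoted above; the helper name efficiency and the
choice of the 8-core run as the baseline are assumptions made for
illustration:

```python
cores = [8, 16, 32, 64, 128, 256, 512]
mvapich2 = [17.3, 31.7, 62.9, 110.8, 219.2, 384.5, 687.2]   # GFlops
openmpi = [17.3, 31.5, 62.8, 108.0, 201.4, 342.7, 537.6]    # GFlops


def efficiency(gflops, cores):
    """Per-core throughput at each size, relative to the smallest run."""
    base = gflops[0] / cores[0]
    return [(g / c) / base for g, c in zip(gflops, cores)]


eff_mvapich2 = efficiency(mvapich2, cores)
eff_openmpi = efficiency(openmpi, cores)

print("Cores  MVAPICH2  OpenMPI  OpenMPI/MVAPICH2")
for c, em, eo, gm, go in zip(cores, eff_mvapich2, eff_openmpi, mvapich2, openmpi):
    print(f"{c:5d}  {em:8.2f}  {eo:7.2f}  {go / gm:16.2f}")
```

The last column is the ratio of the two raw GFlops figures, which makes it
easy to see at which core count the two libraries start to diverge.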


