Hello Daniel and list
Could it be a problem with memory bandwidth / contention on multi-core nodes?
It has been reported on many mailing lists (MPICH, Beowulf, etc.).
Here it seems to happen on dual-processor dual-core nodes with our
memory-intensive programs.
Have you checked what happens to the shared memory runs as you
increase the number of active cores/processes?
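For instance (just a sketch, if I remember the IMB options right:
-multi runs the benchmark on several process pairs at once, which
should expose memory contention as more cores become active):

  mpirun -np 2 --mca btl self,sm ./IMB-MPI1.openmpi PingPong
  mpirun -np 4 --mca btl self,sm ./IMB-MPI1.openmpi -multi 0 PingPong
  mpirun -np 8 --mca btl self,sm ./IMB-MPI1.openmpi -multi 0 PingPong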
Would it help to set the processor affinity in the shared memory runs?
http://www.open-mpi.org/faq/?category=building#build-paffinity
http://www.open-mpi.org/faq/?category=tuning#using-paffinity
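In the 1.2 series affinity is switched on with the mpi_paffinity_alone
MCA parameter; something like this should pin each process to a core
(untested on your setup):

  mpirun -np 8 --mca btl self,sm,openib --mca mpi_paffinity_alone 1 \
      -hostfile hostfile ./IMB-MPI1.openmpi -npmin 8 PingPong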
Gus Correa
--
---------------------------------------------------------------------
Gustavo J. Ponce Correa, PhD - Email: g...@ldeo.columbia.edu
Lamont-Doherty Earth Observatory - Columbia University
P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
Daniël Mantione wrote:
Hello,
I'm troubleshooting a weird benchmark situation in which having the sm btl
enabled gives me worse results than disabling it.
For example, this run is on a single compute node with 2x Xeon 5420 CPUs,
8 GB RAM and a ConnectX gen2 IB card, with OFED 1.3 and Open MPI 1.2.6 as
the software setup:
[cvsupport@extern src]$ mpirun -np 8 --mca btl self,sm,openib -hostfile \
hostfile ./IMB-MPI1.openmpi -npmin 8 PingPong
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
    #bytes  #repetitions    t[usec]  Mbytes/sec
         0          1000       0.87        0.00
         1          1000       0.98        0.97
         2          1000       0.97        1.96
         4          1000       0.99        3.87
         8          1000       0.98        7.78
        16          1000       1.15       13.33
        32          1000       1.13       26.93
        64          1000       1.12       54.42
       128          1000       1.27       96.31
       256          1000       1.55      157.01
       512          1000       2.04      239.00
      1024          1000       2.75      355.62
      2048          1000       4.58      426.40
      4096          1000       7.12      548.93
      8192          1000      11.29      692.14
     16384          1000      18.83      829.75
     32768          1000      34.57      904.08
     65536           640      60.73     1029.22
    131072           320     112.06     1115.43
    262144           160     215.48     1160.21
    524288            80     423.34     1181.09
   1048576            40     858.18     1165.26
   2097152            20    1744.15     1146.69
   4194304            10    4055.60      986.29
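To disable the sm btl, sm is simply left out of the btl list, i.e.
something like:

[cvsupport@extern src]$ mpirun -np 8 --mca btl self,openib -hostfile \
hostfile ./IMB-MPI1.openmpi -npmin 8 PingPong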
Now, when disabling the sm btl, the score is:
#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
# ( 6 additional processes waiting in MPI_Barrier)
#---------------------------------------------------
    #bytes  #repetitions    t[usec]  Mbytes/sec
         0          1000       1.08        0.00
         1          1000       1.42        0.67
         2          1000       1.19        1.60
         4          1000       1.21        3.14
         8          1000       1.61        4.75
        16          1000       1.30       11.70
        32          1000       1.32       23.13
        64          1000       1.61       37.97
       128          1000       2.80       43.53
       256          1000       3.21       76.05
       512          1000       4.06      120.15
      1024          1000       5.03      194.21
      2048          1000       7.15      273.05
      4096          1000      10.05      388.55
      8192          1000      16.02      487.76
     16384          1000      29.63      527.41
     32768          1000      51.23      610.03
     65536           640      92.26      677.43
    131072           320     141.03      886.36
    262144           160     233.62     1070.14
    524288            80     434.56     1150.60
   1048576            40     818.84     1221.24
   2097152            20    1403.75     1424.76
   4194304            10    2523.40     1585.16
Now, I do have fast InfiniBand, but I can't believe that the openib btl is
supposed to be faster than the sm btl. Does anyone know whether
something can be tuned here?
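(The available sm parameters can at least be listed with

  ompi_info --param btl sm

but I don't know which of them, if any, are relevant here.)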
Best regards,
Daniël Mantione