Mathieu Gontier wrote:

  Dear OpenMPI users

I am dealing with an arithmetic problem. I have two variants of my code: one in single precision and one in double precision. When I compare the two executables built with MPICH, I observe the expected difference in performance: 115.7 s in single precision against 178.68 s in double precision (+54%).

The thing is, when I use OpenMPI, the difference is considerably larger: 238.5 s in single precision against 403.19 s in double precision (+69%).

Our experience has already shown that OpenMPI is less efficient than MPICH on Ethernet with a small number of processes. This explains the difference between the first set of results with MPICH and the second set with OpenMPI. (If someone has more information about that, or even a solution, I am of course interested.) However, using OpenMPI also increases the gap between the two precisions. Is this an amplification of the OpenMPI-over-Ethernet performance loss, is it another issue in OpenMPI, or is there an option I can use?
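For reference, the percentages quoted above can be re-derived from the four timings. The short Python sketch below does only that arithmetic; the variable and function names are illustrative and not part of the original code. It also prints the OpenMPI-to-MPICH ratio for each precision, which comes out at roughly 2.06x in single precision and 2.26x in double precision.

```python
# Timings in seconds, copied from the message above.
timings = {
    "MPICH": {"single": 115.7, "double": 178.68},
    "OpenMPI": {"single": 238.5, "double": 403.19},
}


def relative_increase(base, other):
    """Return how much larger `other` is than `base`, as a percentage."""
    return (other - base) / base * 100.0


# Cost of moving from single to double precision, per MPI implementation.
precision_penalty = {
    name: relative_increase(t["single"], t["double"])
    for name, t in timings.items()
}

# How many times longer the OpenMPI build runs than the MPICH build.
implementation_ratio = {
    prec: timings["OpenMPI"][prec] / timings["MPICH"][prec]
    for prec in ("single", "double")
}

for name, pct in precision_penalty.items():
    print(f"{name}: double precision is {pct:.1f}% slower than single")
for prec, ratio in implementation_ratio.items():
    print(f"{prec} precision: OpenMPI takes {ratio:.2f}x the MPICH time")
```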

It is also unusual that the performance difference between MPICH and OMPI is so large. You say that OMPI is slower than MPICH even at small process counts. Can you confirm that this is because MPI calls are slower? Some of the biggest performance differences I've seen between MPI implementations had nothing to do with the performance of MPI calls at all. They had to do with process binding or other factors that impacted the computational (non-MPI) performance of the code. The performance of MPI calls was basically irrelevant.

In this particular case, I'm not convinced since neither OMPI nor MPICH binds processes by default.

Still, can you do some basic performance profiling to confirm what aspect of your application is consuming so much time? Is it a particular MPI call? If your application is spending almost all of its time in MPI calls, do you have some way of judging whether the faster performance is acceptable? That is, is 238 secs acceptable and 403 secs slow? Or, are both timings unacceptable -- e.g., the code "should" be running in about 30 secs.
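As a rough illustration of the kind of basic profiling meant here, the Python sketch below accumulates wall-clock time separately for a compute phase and a communication phase. Everything in it is an assumption made for illustration: `compute_step` and `exchange_step` are hypothetical stand-ins for the application's numerical kernel and its MPI calls, and the `timed` helper is not part of any MPI library. In a real C or Fortran code the same bookkeeping would typically be done with `MPI_Wtime` around the calls of interest, or with an MPI profiling tool.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Seconds accumulated per label across all iterations.
elapsed = defaultdict(float)


@contextmanager
def timed(label):
    """Add the wall-clock time spent inside the with-block to elapsed[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed[label] += time.perf_counter() - start


def compute_step():
    # Hypothetical stand-in for the application's numerical kernel.
    return sum(i * i for i in range(20000))


def exchange_step():
    # Hypothetical stand-in for the MPI communication of one iteration.
    time.sleep(0.001)


for _ in range(5):
    with timed("compute"):
        compute_step()
    with timed("communication"):
        exchange_step()

total = sum(elapsed.values())
for label, seconds in sorted(elapsed.items()):
    share = 100.0 * seconds / total
    print(f"{label}: {seconds:.4f} s ({share:.1f}% of measured time)")
```

If the communication share turns out to be small, the gap between the two MPI implementations is more likely to come from the computational side (for example process placement) than from the MPI calls themselves.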
