I discovered that a minor change cost me dearly (I thought I had tested this single change, but apparently I didn't track the timing data closely enough).

MPI_Type_create_struct performs well only when all the data is contiguous in memory (at least for OpenMPI 1.1.2).

Is this normal or expected?

In my case the program has an f90 structure with 11 integers, 2 logicals, and five 50-element integer arrays, but in the first stage of the program only the first element of each of those arrays is used. Even so, with MPI_Type_create_struct it is more efficient to send the entire 263 words of contiguous memory (58 seconds) than to try to send only the 18 words of noncontiguous memory (64 seconds). In the second stage it is 33 words, and there it becomes 47 seconds vs. 163 seconds, an extra 116 seconds, which accounts for most of the jump in my overall wall-clock time from 130 to 278 seconds. The third stage increases from 13 seconds to 37 seconds. (The two datatypes are sketched below.)
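
For reference, here is a minimal sketch of the two datatypes involved; the type layout and field names below are hypothetical stand-ins for my real structure:

   program struct_sketch
     implicit none
     include 'mpif.h'

     ! Hypothetical layout standing in for the real structure:
     ! 11 integers, 2 logicals, five 50-element integer arrays (263 words).
     type work_t
        sequence
        integer :: scalars(11)
        logical :: flags(2)
        integer :: a1(50), a2(50), a3(50), a4(50), a5(50)
     end type work_t

     type(work_t) :: w
     integer :: ierr, full_type, partial_type
     integer :: blocklens(7), types(7)
     integer(kind=MPI_ADDRESS_KIND) :: base, displs(7)

     call MPI_Init(ierr)

     ! Displacement of each block relative to the start of the variable.
     call MPI_Get_address(w,         base,      ierr)
     call MPI_Get_address(w%scalars, displs(1), ierr)
     call MPI_Get_address(w%flags,   displs(2), ierr)
     call MPI_Get_address(w%a1,      displs(3), ierr)
     call MPI_Get_address(w%a2,      displs(4), ierr)
     call MPI_Get_address(w%a3,      displs(5), ierr)
     call MPI_Get_address(w%a4,      displs(6), ierr)
     call MPI_Get_address(w%a5,      displs(7), ierr)
     displs = displs - base

     types(1)   = MPI_INTEGER
     types(2)   = MPI_LOGICAL
     types(3:7) = MPI_INTEGER

     ! Stage 1: only the first element of each array -> 18 noncontiguous words.
     blocklens = (/ 11, 2, 1, 1, 1, 1, 1 /)
     call MPI_Type_create_struct(7, blocklens, displs, types, partial_type, ierr)
     call MPI_Type_commit(partial_type, ierr)

     ! Whole record: 263 words, effectively one contiguous block.
     blocklens = (/ 11, 2, 50, 50, 50, 50, 50 /)
     call MPI_Type_create_struct(7, blocklens, displs, types, full_type, ierr)
     call MPI_Type_commit(full_type, ierr)

     ! Either datatype is then sent with count 1, e.g.
     call MPI_Bcast(w, 1, partial_type, 0, MPI_COMM_WORLD, ierr)

     call MPI_Type_free(partial_type, ierr)
     call MPI_Type_free(full_type, ierr)
     call MPI_Finalize(ierr)
   end program struct_sketch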

Because I need to send this block of data back and forth a lot, I was hoping to find a way to speed up the transfer of this odd block of data and a couple of other variables. I may try MPI_PACK and MPI_UNPACK on the structure (roughly as sketched below), but calling those many times can't be more efficient.
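
If I do try it, the pack side would look roughly like this, continuing from the hypothetical work_t variable w in the sketch above (field names are still illustrative; the buffer is sized with MPI_Pack_size):

   ! Pack the 18 stage-1 words of the hypothetical structure w.
   character(len=1), allocatable :: pkbuf(:)
   integer :: ierr, position, sz_int, sz_log, pksize

   call MPI_Pack_size(16, MPI_INTEGER, MPI_COMM_WORLD, sz_int, ierr)  ! 11 scalars + 5 first elements
   call MPI_Pack_size(2,  MPI_LOGICAL, MPI_COMM_WORLD, sz_log, ierr)
   pksize = sz_int + sz_log
   allocate(pkbuf(pksize))

   position = 0
   call MPI_Pack(w%scalars, 11, MPI_INTEGER, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)
   call MPI_Pack(w%flags,    2, MPI_LOGICAL, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)
   call MPI_Pack(w%a1(1),    1, MPI_INTEGER, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)
   call MPI_Pack(w%a2(1),    1, MPI_INTEGER, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)
   call MPI_Pack(w%a3(1),    1, MPI_INTEGER, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)
   call MPI_Pack(w%a4(1),    1, MPI_INTEGER, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)
   call MPI_Pack(w%a5(1),    1, MPI_INTEGER, pkbuf, pksize, position, MPI_COMM_WORLD, ierr)

   ! All ranks use the same pksize; receivers MPI_Unpack the fields
   ! back out in the same order.
   call MPI_Bcast(pkbuf, pksize, MPI_PACKED, 0, MPI_COMM_WORLD, ierr)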

Previously I was equivalencing the structure to an integer array and sending the integer array as a quick and dirty solution to get started, and it worked. Not completely portable, no doubt.
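
Roughly, that quick-and-dirty version looked like the following (hypothetical names again; it leans on the structure being a SEQUENCE type of default-kind integers and logicals):

   type(work_t) :: w         ! same hypothetical SEQUENCE type as in the first sketch
   integer      :: wbuf(263)
   integer      :: ierr
   equivalence (w, wbuf)     ! storage-associate the structure with a flat integer array

   ! The whole structure then goes as one contiguous 263-word broadcast.
   call MPI_Bcast(wbuf, 263, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)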

Michael

P.S. I don't currently have valgrind installed on this cluster, and it is not part of the Debian Linux 3.1r3 distribution. Having no experience with valgrind, I'm not sure how useful it will be with an MPI program of 500+ subroutines and 50K+ lines running on 16 processes. It took us a while to get profiling working for the OpenMP version of this code.

On Mar 6, 2007, at 11:28 AM, George Bosilca wrote:

I doubt this comes from MPI_Pack/MPI_Unpack. The difference is 137
seconds for 5 calls. That's basically 27 seconds per call to MPI_Pack,
for packing 8 integers. I know the code, and I can say with certainty
that there is no way to spend 27 seconds there.

Can you run your application under valgrind with the callgrind tool?
That will give you some basic information about where the time is
spent, and give us additional information about where to look.
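
Something like the following should work (assuming a valgrind 3.x
build, where callgrind is included; the program name is of course a
placeholder):

   mpirun -np 16 valgrind --tool=callgrind ./your_app
   callgrind_annotate callgrind.out.<pid>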

   Thanks,
     george.

On Mar 6, 2007, at 11:26 AM, Michael wrote:

I have a section of code where I need to send 8 separate integers via
BCAST.

Initially I was just putting the 8 integers into an array and then
sending that array.

I just tried using MPI_PACK on those 8 integers and I'm seeing a
massive slowdown in the code, even though I have a lot of other
communication and this section is used only 5 times.  I went from 140
seconds to 277 seconds on 16 processors using TCP over a dual gigabit
ethernet setup (I'm the only user working on this system today).
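
Roughly, the two variants look like this (variable names are
illustrative, and in real code the scratch buffer would be sized
with MPI_Pack_size):

   integer :: ivals(8), ierr, position
   character(len=1) :: pkbuf(64)   ! scratch for the packed variant

   ! Fast variant: the 8 scalars copied into ivals(1:8) on the root,
   ! then one broadcast of the plain integer array.
   call MPI_Bcast(ivals, 8, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)

   ! Slow variant (what I tried): pack the same 8 integers and
   ! broadcast the packed buffer; all ranks use the same fixed buffer
   ! size and the receivers MPI_Unpack the 8 integers back out.
   position = 0
   call MPI_Pack(ivals, 8, MPI_INTEGER, pkbuf, 64, position, &
                 MPI_COMM_WORLD, ierr)
   call MPI_Bcast(pkbuf, 64, MPI_PACKED, 0, MPI_COMM_WORLD, ierr)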

This was run with OpenMPI 1.1.2 to maintain compatibility with a
major HPC site.

Is there a known problem with MPI_PACK/UNPACK in OpenMPI?

Michael

"Half of what I say is meaningless; but I say it so that the other
half may reach you"
                                   Kahlil Gibran

