Re: [OMPI devel] RFC: [slightly] Optimize Fortran MPI_SEND / MPI_RECV

N.M. Maclaren Sun, 8 Feb 2009 13:52:56 -0500

On Feb 7 2009, Jeff Squyres wrote:

On Feb 7, 2009, at 12:23 PM, Brian W. Barrett wrote:
That is significantly higher than I would have expected for a single function call. When I did all the component tests a couple years ago, a function call into a shared library was about 5ns on an Intel Xeon (pre-Core 2 design) and about 2.5 on an AMD Opteron.
Good; I'm not crazy for thinking that this is a little too obvious -- it smells like I did something wrong. Could someone eyeball these files and see if I missed anything obvious:


At the risk of telling grandmothers how to suck eggs, have you tried
with different compilers, different systems and/or adding a few
irrelevant (but not optimisable-out) declarations or statements?

That sort of phenomenon is exactly what happens when you trip over a
cache problem - e.g. running out of cache associativity.  It can also
occur because of pipeline drain (e.g. branch misprediction) problems.
Neither of those would be found by eyeballing the code - you would at
least have to eyeball the assembler.
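
As a rough illustration of the associativity point, here is a minimal
sketch in C of one way to probe for it. It is not taken from the files
under discussion: the names touch_strided and time_strided are made up
for this example, and the 4096-byte stride and 64-byte line size in the
usage note are assumptions about a typical cache, not measured values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Read `count` bytes spaced `stride` bytes apart, `reps` times over.
   The volatile qualifier forces every read to happen, and the sum is
   returned so the caller can check that the loop did what it claims. */
unsigned long touch_strided(const volatile unsigned char *buf,
                            size_t count, size_t stride,
                            unsigned long reps)
{
    unsigned long sum = 0;
    for (unsigned long r = 0; r < reps; r++)
        for (size_t i = 0; i < count; i++)
            sum += buf[i * stride];
    return sum;
}

/* Allocate a buffer, mark the bytes that will be touched, and return
   the CPU time in seconds taken by touch_strided().  Returns -1.0 if
   allocation fails or the checksum is not the expected value. */
double time_strided(size_t count, size_t stride, unsigned long reps)
{
    assert(count > 0 && stride > 0);

    unsigned char *buf = calloc(count, stride);
    if (buf == NULL)
        return -1.0;
    for (size_t i = 0; i < count; i++)
        buf[i * stride] = 1;

    clock_t t0 = clock();
    unsigned long sum = touch_strided(buf, count, stride, reps);
    clock_t t1 = clock();

    free(buf);
    if (sum != (unsigned long)count * reps)
        return -1.0;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling time_strided(32, 4096, reps) and time_strided(32, 4096 + 64, reps)
from a small main() and comparing the two results is the sort of test I
mean. If the power-of-two stride is markedly slower on a given system,
that suggests (but does not prove) that the touched lines are competing
for the same cache set. The results will vary with the compiler, the
system and the cache design, which is rather the point.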


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  n...@cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
