On 12/15/2010 07:39 PM, Jeff Squyres wrote:
> On Nov 30, 2010, at 4:09 PM, Ioannis Papadopoulos wrote:
>> The overall time may be the same, but it is alarming (at least to me) that
>> if you call MPI_Test() too many times, the average time per MPI_Test() call
>> increases. After all, that is what I am trying to measure: how much it costs
>> to call MPI_Test() on average.
>> In your MPI_Wtime() example, the average overhead of MPI_Wtime() is exactly
>> the same, independent of the max/min time - which is what I would expect.
>> This is not true for MPI_Test(): a small delay before calling the latter
>> lowers the average MPI_Test() time.
> There is a difference between MPI_Test() and MPI_Wtime() -- MPI_Wtime() just
> calls gettimeofday() and doesn't do anything else. MPI_Test() trips the
> progression engine and therefore may do a variable number of things, some of
> which may involve I/O and/or memory allocation. Those are variable-time
> tasks, depending on all kinds of factors on your system (as Eugene alluded
> to). This is doubly true if you're seeing it with two different MPI
> implementations.
> So yes, there might be those small "spikes" that you're seeing (to be
> honest, I hesitate to use the word "spike" when dealing with such small
> numbers in TCP traffic). And they could be due to a lot of different things,
> many of which are beyond OMPI's control.
> Have you seen if this has any impact on your actual application performance?
I agree that MPI_Test() has to make some progress, but as you can see I am
only sending one message and I busy-wait on it - since there is nothing
else to do and no other incoming traffic, I would expect no difference
among MPI_Test() calls, apart from the last one (the one that notifies
me that my message has arrived).
From the first set of benchmarks, you can clearly see that the average
time for an MPI_Test() call decreased by two orders of magnitude with
that naive timer. My application is a framework that relies heavily on
non-blocking primitives, so it calls MPI_Test() very frequently. I sent
the initial message because I noticed this specific quirk of MPI_Test()
while trying to figure out how much overhead my framework adds on top of
MPI - doing a really small piece of work just before MPI_Test() has given
1) lower and 2) more consistent times (minimal fluctuations).