On Tuesday 07 April 2009, Eugene Loh wrote:
> Iain Bason wrote:
> > But maybe Steve should try 1.3.2 instead?  Does that have your
> > improvements in it?
>
> 1.3.2 has the single-queue implementation and automatic sizing of the sm
> mmap file, both intended to fix problems at large np.  At np=2, you
> shouldn't expect to see much difference.
>
> >> And the slowdown doesn't seem to be observed by anyone other than
> >> Steve and his colleague?
> >
> > It would be useful to know who else has compared these two revisions.
>
> I just ran Netpipe and found that it gave a comparable sm latency as
> other pingpong tests.  So, in my mind, the question is why Steve sees
> latencies that are about 10 usec on a platform that can give 1 usec.
> There seems to be something tricky about reproducing that 10-usec
> slowdown.  I have trouble buying that it's just, "sm latency degraded
> from 1 usec to 10 usec when we went from 1.2 to 1.3".  If it were as
> simple as that, we would all have been aware of the performance
> regression.  There is some other special ingredient here (other than
> OMPI rev) that we're missing.

<wild guess>
Maybe it's not related to the BTL layer at all. It could be something
completely different, such as messed-up processor affinity.
</wild guess>
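
One way to test the affinity guess is to print, from each of the two
processes, which CPUs the kernel allows it to run on. If both turn out to be
pinned to the same single core they would have to time-share it, which could
plausibly inflate pingpong latency. The following is only a rough sketch,
assuming Linux and Python 3. The helper names affinity_summary and
pinned_to_same_core are made up for illustration and are not part of Open MPI.

```python
import os


def affinity_summary(pid=0):
    """Return (allowed_cpus, total_cpus, restricted) for a process.

    pid=0 means the calling process. os.sched_getaffinity is only
    available on Linux; elsewhere we assume (hypothetically) that all
    CPUs are allowed.
    """
    total = os.cpu_count() or 1
    if hasattr(os, "sched_getaffinity"):
        allowed = sorted(os.sched_getaffinity(pid))
    else:
        allowed = list(range(total))  # fallback assumption: no restriction
    total = max(total, len(allowed))
    return allowed, total, len(allowed) < total


def pinned_to_same_core(mask_a, mask_b):
    """True if both processes are pinned to one and the same single CPU."""
    return len(mask_a) == 1 and set(mask_a) == set(mask_b)


if __name__ == "__main__":
    allowed, total, restricted = affinity_summary()
    print(f"pid {os.getpid()}: allowed on CPUs {allowed} ({total} total)")
    if restricted:
        print("affinity mask is restricted; compare it with the peer process")
```

Run it inside each process (or pass the pid of a running rank) and compare
the two lists by hand or with pinned_to_same_core. Identical single-CPU masks
would point at affinity rather than at the sm BTL.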

/Peter
