On Tuesday 07 April 2009, Eugene Loh wrote:
> Iain Bason wrote:
> > But maybe Steve should try 1.3.2 instead? Does that have your
> > improvements in it?
>
> 1.3.2 has the single-queue implementation and automatic sizing of the sm
> mmap file, both intended to fix problems at large np. At np=2, you
> shouldn't expect to see much difference.
>
> >> And the slowdown doesn't seem to be observed by anyone other than
> >> Steve and his colleague?
> >
> > It would be useful to know who else has compared these two revisions.
>
> I just ran Netpipe and found that it gave a comparable sm latency as
> other pingpong tests. So, in my mind, the question is why Steve sees
> latencies that are about 10 usec on a platform that can give 1 usec.
> There seems to be something tricky about reproducing that 10-usec
> slowdown. I have trouble buying that it's just, "sm latency degraded
> from 1 usec to 10 usec when we went from 1.2 to 1.3". If it were as
> simple as that, we would all have been aware of the performance
> regression. There is some other special ingredient here (other than
> OMPI rev) that we're missing.

<wild guess>
Maybe it's not btl layer related at all. Could be something completely
different like maybe messed up processor affinity.
</wild guess>

/Peter
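
One way to test the affinity guess is to have each process report which CPUs the OS allows it to run on. The following is only a minimal sketch, assuming a Linux host with Python available. The helper names allowed_cpus and describe_affinity are made up for illustration and are not part of Open MPI. It shows what the kernel permits for the calling process, not what the MPI runtime intended.

```python
import os


def allowed_cpus(pid=0):
    """Return the set of CPU ids the given process may run on.

    pid=0 means the calling process. Returns None where the OS does not
    expose os.sched_getaffinity (it is available on Linux, not everywhere).
    """
    getter = getattr(os, "sched_getaffinity", None)
    if getter is None:
        return None
    return set(getter(pid))


def describe_affinity(pid=0):
    """Build a one-line, human-readable summary of the affinity mask."""
    cpus = allowed_cpus(pid)
    if cpus is None:
        return "affinity query not supported on this platform"
    total = os.cpu_count() or len(cpus)
    # "restricted" only means the mask is smaller than the CPU count;
    # a container or batch system can cause that as well as explicit pinning.
    state = "restricted" if len(cpus) < total else "unrestricted"
    shown_pid = pid or os.getpid()
    return f"pid {shown_pid}: {state}, allowed CPUs {sorted(cpus)} of {total}"


if __name__ == "__main__":
    print(describe_affinity())
```

If both ranks of a two-process pingpong report the same single CPU, or the masks differ between the 1.2 and 1.3 runs, that would point at affinity rather than at the sm btl.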