Quoting r. Grant Grundler <[EMAIL PROTECTED]>:
> Subject: [PATCH] rdma_lat-09 and results
>
> Michael,
>
> Good news:
> My next cleanup of rdma_lat.c is working and patch is appended.
> Summary of changes below.
>
> Bad News:
> perf is about ~15 cycles slower since the last time I tested.
> (Hrm...maybe it's time to cycle power on the TS90 switch again.)
>
>
> Here's with the new rdma_lat.c:
> [EMAIL PROTECTED]:/usr/src/openib_gen2/src/userspace/perftest$ ./rdma_lat -C
> local address: LID 0x27 QPN 0x80406 PSN 0x9188f7 RKey 0x300434 VAddr 0x6000000000014001
> remote address: LID 0x25 QPN 0x70406 PSN 0x5d4824 RKey 0x2a0434 VAddr 0x6000000000014001
> Latency typical: 7140 cycles
> Latency best : 6915 cycles
> Latency worst : 52915.5 cycles
> [EMAIL PROTECTED]:/usr/src/openib_gen2/src/userspace/perftest$
>
> And the "client" side:
> [EMAIL PROTECTED]:/usr/src/openib_gen2/src/userspace/perftest$ ./rdma_lat -C 10.0.0.51
> local address: LID 0x25 QPN 0x70406 PSN 0x5d4824 RKey 0x2a0434 VAddr 0x6000000000014001
> remote address: LID 0x27 QPN 0x80406 PSN 0x9188f7 RKey 0x300434 VAddr 0x6000000000014001
> Latency typical: 7140 cycles
> Latency best : 6907 cycles
> Latency worst : 94920 cycles
>
>
> The previous set of rdma_lat results are here:
> http://openib.org/pipermail/openib-general/2005-May/006721.html
>
> I'll guess the previous SVN version was no older than r2229.
>
>
> I get 7140 to 7151 for the original rdma_lat. Usually 7147.5.
> I get 7132 to 7155 with my version of rdma_lat. Usually 7140.
> No statistically significant differences.
> Both essentially agree on the higher result.
> Using "-n 10000" gave more consistent results *

I changed the timestamping strategy. I used to:

	post tstamp poll
	post tstamp poll
	post tstamp poll
	post tstamp poll

This meant that the tstamp instruction was out of the data path, while
we did the polling. On the negative side, although the average (and
likely median) delta between tstamps was a reliable measurement of round
trip time (since there was one tstamp each roundtrip), min/max values
were not measuring anything reliably: if I start polling late, two
tstamps can be closer than what the wire allows for.

So I changed that to:

	tstamp post poll
	tstamp post poll
	tstamp post poll
	tstamp post poll

And now, on the plus side, the min/max deltas are actually pessimistic
about roundtrip times; on the minus side, we are calling tstamp on the
datapath, slowing it down.

~15 cycles is a bit high: of course tstamp needs to prevent instructions
from being reordered across it, and so it should take on the order of
the pipeline depth to perform, but then maybe it's a microcode thing.

I'm not against going back to the previous measurement, but we'd have
to give up the min/max reporting since it's an artefact.

What do you say?

--
MST
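
For illustration only, the difference between the two orderings can be
sketched as a toy model on a virtual clock. This is not code from
rdma_lat.c: the names (run_old, run_new, delta_stats), the 7000-cycle
round trip, the 15-cycle tstamp cost and the single 500-cycle hiccup are
all assumptions picked to make the effect visible, and "post" is treated
as free.

```c
#include <stdio.h>

/* All constants are made-up illustration values, not measurements. */
enum { ITERS = 6, RTT = 7000, TSTAMP_COST = 15 };

/* Hypothetical hiccup (in cycles) right after each post. */
static const long delay[ITERS] = { 0, 0, 500, 0, 0, 0 };

struct stats { long min, max; };

static long max_l(long a, long b) { return a > b ? a : b; }

/* Old ordering: post, tstamp, poll. */
static void run_old(long ts[ITERS])
{
    long now = 0;
    for (int i = 0; i < ITERS; i++) {
        long arrival = now + RTT;   /* post: reply lands RTT later      */
        now += delay[i];            /* late start after the post        */
        ts[i] = now;                /* tstamp overlaps the wire time    */
        now += TSTAMP_COST;
        now = max_l(now, arrival);  /* poll until the reply has arrived */
    }
}

/* New ordering: tstamp, post, poll. */
static void run_new(long ts[ITERS])
{
    long now = 0;
    for (int i = 0; i < ITERS; i++) {
        ts[i] = now;                /* tstamp sits on the data path     */
        now += TSTAMP_COST;
        long arrival = now + RTT;   /* post                             */
        now += delay[i];            /* same hiccup, now after the post  */
        now = max_l(now, arrival);  /* poll                             */
    }
}

/* Smallest and largest delta between consecutive timestamps. */
static struct stats delta_stats(const long ts[ITERS])
{
    struct stats s = { ts[1] - ts[0], ts[1] - ts[0] };
    for (int i = 2; i < ITERS; i++) {
        long d = ts[i] - ts[i - 1];
        if (d < s.min) s.min = d;
        if (d > s.max) s.max = d;
    }
    return s;
}
```

In this model the old ordering reports a minimum delta below RTT as soon
as one tstamp is taken late, while the new ordering never reports less
than RTT plus the tstamp cost. Printing the min and max fields returned
by delta_stats() for each run shows the two cases side by side.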
