Hi Tzachi, Leo,

I've been playing with the ndpingpong test case and noticed some unexpected behavior:
What I see is a relationship between the RQ depth and the latency: the larger the RQ depth, the lower the latency. This happens even though the program performs a pingpong: a send is only issued once a receive has completed, so there should only be a single work request in transit at a time.

I changed the unit test to use an asymmetric queue depth, keeping the SQ depth at 1 and varying only the RQ depth. Here are the reported results for the 1 byte message size, by RQ depth:

RQ 1: 7.44us
RQ 2: 4.76us
RQ 4: 3.20us
RQ 6 (default): 2.75us
RQ 8: 2.44us
RQ 16: 2.04us
RQ 32: 1.85us
RQ 64: 1.76us
RQ 128: 1.71us
RQ 256: 1.71us
RQ 512: 1.68us
RQ 1024: 1.68us
RQ 2048: 1.67us
RQ 4096: 1.67us

As you can see, latency reaches a steady state as the queue depth gets very large. But since this is a pingpong test, I would have expected the latency at the smaller queue depths to be much closer to that steady-state value.

This is with ConnectX2, QDR, FW 2.07.9110.

Any idea why the low RQ depth tests perform so poorly?

Thanks,
-Fab

_______________________________________________
ofw mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
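To put the gap in numbers, here is a minimal sketch that tabulates the reported 1 byte latencies against the deepest queue. The latency values are copied from the results above. Treating the RQ 4096 figure as the steady-state baseline is an assumption made for illustration, and the names `latency_us`, `baseline` and `slowdown` are hypothetical and not part of ndpingpong.

```python
# Reported 1 byte latencies in microseconds, keyed by RQ depth
# (values copied from the results listed in this message).
latency_us = {
    1: 7.44, 2: 4.76, 4: 3.20, 6: 2.75, 8: 2.44, 16: 2.04, 32: 1.85,
    64: 1.76, 128: 1.71, 256: 1.71, 512: 1.68, 1024: 1.68,
    2048: 1.67, 4096: 1.67,
}

# Assumption: the deepest queue measured stands in for the steady state.
baseline = latency_us[max(latency_us)]


def slowdown(depth):
    """Return the latency at `depth` as a multiple of the baseline."""
    return latency_us[depth] / baseline


# Print each depth with its latency, the extra latency over the
# baseline, and the ratio to the baseline.
for depth in sorted(latency_us):
    extra = latency_us[depth] - baseline
    print(f"RQ {depth:>4}: {latency_us[depth]:.2f}us  "
          f"+{extra:.2f}us  {slowdown(depth):.2f}x")
```

With these numbers, RQ depth 1 comes out at roughly 4.5 times the baseline, about 5.77us of extra latency per measurement, while the default depth of 6 is still about 1.6 times the baseline.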
