Hi Tzachi,

Thanks for the quick response!
Tzachi Dar wrote on Mon, 18 Oct 2010 at 15:28:53:
> Hi Fab,
>
> I'm sorry that I don't have time to look at this thoroughly this week
> (maybe someone else will).
>
> In any case, this looks to me like "End of queue" for IB.
>
> What this means is that if there are no receive WQEs, the card sends
> NAKs and stops traffic. (probably not your case)

This shouldn't be the case, since I don't send until I receive, and I
always repost before sending.

> When there is only one receive packet, IB also does not work
> efficiently, and fw flow is being used. (this is probably what you see)

Can you describe what fw flow is? Since I won't send the next message
until I receive a response, any ACK should get piggybacked on the
response, shouldn't it?

> This does not explain everything that you see, but it probably explains
> the first 3 lines.

Why the first 3? Why not just the first one? The other cases have more
than one receive posted...

> By the way, can you post more receive packets and see if this helps?

I post as many items as I have space for in my RQ, so an RQ of 8 would
have 8 receives posted. I have not looked to see what happens if I post
fewer than the limit of the RQ.

Thanks,
-Fab

> Thanks
> Tzachi
>
>> -----Original Message-----
>> From: Fab Tillier [mailto:[email protected]]
>> Sent: Monday, October 18, 2010 10:07 PM
>> To: Tzachi Dar; Leonid Keller
>> Cc: [email protected]
>> Subject: receive queue depth effect on pingpong latency
>>
>> Hi Tzachi, Leo,
>>
>> I've been playing with the ndpingpong test case, and noticed some
>> strange/unexpected behavior:
>>
>> What I see is a relationship between the RQ depth and the latency:
>> the larger the RQ depth, the lower the latency. This, despite the
>> program performing a ping-pong: a send is only issued once a receive
>> has completed, so there should only be a single work request in
>> transit at a time.
>>
>> I changed the unit test to use an asymmetric queue depth, keeping the
>> SQ depth at 1 and varying only the RQ depth.
>>
>> Here are the reported results for the 1-byte message size, by RQ depth:
>>
>> RQ 1: 7.44us
>> RQ 2: 4.76us
>> RQ 4: 3.20us
>> RQ 6 (default): 2.75us
>> RQ 8: 2.44us
>> RQ 16: 2.04us
>> RQ 32: 1.85us
>> RQ 64: 1.76us
>> RQ 128: 1.71us
>> RQ 256: 1.71us
>> RQ 512: 1.68us
>> RQ 1024: 1.68us
>> RQ 2048: 1.67us
>> RQ 4096: 1.67us
>>
>> As you can see, things reach a steady state as the queue depth gets
>> very large. But as this is a ping-pong test, I would have expected
>> performance for the small message sizes to be much closer to this
>> steady state even at the small RQ depths.
>>
>> This is with ConnectX2, QDR, FW 2.07.9110.
>>
>> Any idea why the low RQ depth tests perform so poorly?
>>
>> Thanks,
>> -Fab

_______________________________________________
ofw mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ofw
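
The receive discipline Fab describes above (prefill the RQ to its full
depth, then on each iteration repost a receive before posting the reply
send) looks roughly like the sketch below. This is only an illustration
of the pattern, not ndpingpong's actual code: the real test is written
against the NetworkDirect API, while this sketch uses the better-known
Linux libibverbs calls, and the pingpong() helper, rq_depth, MSG_SIZE,
and the assumption of an already-connected QP with a registered buffer
are all hypothetical.

    #include <stdint.h>
    #include <infiniband/verbs.h>

    #define MSG_SIZE 1  /* 1-byte pings, matching the results above */

    /* Post one receive WQE pointing at the registered buffer. */
    static int post_recv(struct ibv_qp *qp, struct ibv_mr *mr, void *buf)
    {
        struct ibv_sge sge = {
            .addr = (uintptr_t) buf, .length = MSG_SIZE, .lkey = mr->lkey,
        };
        struct ibv_recv_wr wr = { .sg_list = &sge, .num_sge = 1 }, *bad;
        return ibv_post_recv(qp, &wr, &bad);
    }

    /* Post one signaled send carrying the reply. */
    static int post_send(struct ibv_qp *qp, struct ibv_mr *mr, void *buf)
    {
        struct ibv_sge sge = {
            .addr = (uintptr_t) buf, .length = MSG_SIZE, .lkey = mr->lkey,
        };
        struct ibv_send_wr wr = {
            .opcode = IBV_WR_SEND, .send_flags = IBV_SEND_SIGNALED,
            .sg_list = &sge, .num_sge = 1,
        }, *bad;
        return ibv_post_send(qp, &wr, &bad);
    }

    /* Busy-poll the CQ for a single successful completion. */
    static int wait_one(struct ibv_cq *cq)
    {
        struct ibv_wc wc;
        int n;
        do { n = ibv_poll_cq(cq, 1, &wc); } while (n == 0);
        return (n == 1 && wc.status == IBV_WC_SUCCESS) ? 0 : -1;
    }

    /* qp, CQs, mr, and buf are assumed already created and connected. */
    int pingpong(struct ibv_qp *qp, struct ibv_cq *send_cq,
                 struct ibv_cq *recv_cq, struct ibv_mr *mr, void *buf,
                 int rq_depth, int iters)
    {
        /* Prefill: post as many receives as the RQ holds. */
        for (int i = 0; i < rq_depth; i++)
            if (post_recv(qp, mr, buf))
                return -1;

        for (int i = 0; i < iters; i++) {
            if (wait_one(recv_cq))       /* wait for the peer's message */
                return -1;
            if (post_recv(qp, mr, buf))  /* repost before replying, so  */
                return -1;               /* the RQ never runs empty     */
            if (post_send(qp, mr, buf))  /* only then send the reply;   */
                return -1;               /* SQ depth 1 always suffices  */
            if (wait_one(send_cq))
                return -1;
        }
        return 0;
    }

With this ordering the RQ can never actually run dry, which is what
makes the sensitivity to rq_depth in the table above surprising.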
