Roland,

I am trying to write a user-level application that receives multicast UD packets. I am seeing about 1-2% packet loss between the send side and the receive side, apparently independent of the packet rate at low rates. (Heavily traced sends and receives at very low rates still drop packets, even though more receive buffers are posted than packets are sent.) I have a couple of questions:

1. Are there any race issues with ibv_get_cq_event? The example code (ud_pingpong) seems to imply that the correct sequence is:

   Start:
     Call ibv_get_cq_event
     Call ibv_ack_cq_event    <- anywhere, so long as it happens before destroy_cq
     Call ibv_req_notify_cq
     Call ibv_poll_cq         <- just once, not until empty as usual, according to the example
     Goto Start

   In the old days we called request notify and then polled until the poll came back empty, on a notify thread, in order to prevent a race.

2. When I post, say, 500 receive buffers and send, say, 200 send buffers, and tag the sends with a sequence number, I often see one or two missing sequence numbers on the receive side at the poll_cq interface, having checked at the post and poll interfaces of the send side that all the correct sequence numbers went out. I am not sure how this can be possible regardless of the notification scheme used.

I would love for this to be a programming error in my code, but I can’t figure out how I can mess it up between post_send and poll_cq on the receive side. I see the same behavior between systems and with a loopback between two ports on the same HCA.

Please let me know if this rings any bells.

Bob Pearson
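
The receive-side check described in question 2 can be sketched roughly as below. This is only an illustration under stated assumptions, not the code that produced the results above: the names seq_tracker, seq_tracker_init, seq_tracker_record and seq_tracker_missing and the MAX_SEQ bound are hypothetical, sequence numbers are assumed to start at 0, and the sequence number is assumed to have already been extracted from each received datagram. No verbs calls are made here.

```c
#include <stdint.h>
#include <string.h>

#define MAX_SEQ 1024            /* hypothetical upper bound on sends per test run */

/* Hypothetical bookkeeping for one test run. */
struct seq_tracker {
    uint32_t expected;          /* number of datagrams the sender posted */
    uint8_t  seen[MAX_SEQ];     /* seen[i] is nonzero once sequence i was received */
    uint32_t duplicates;        /* in-range sequence numbers received more than once */
    uint32_t out_of_range;      /* sequence numbers >= expected */
};

/* Reset the tracker for a run of 'expected' sends; returns -1 if too large. */
static int seq_tracker_init(struct seq_tracker *t, uint32_t expected)
{
    if (expected > MAX_SEQ)
        return -1;
    memset(t, 0, sizeof(*t));
    t->expected = expected;
    return 0;
}

/* Call once per receive completion with the sequence number it carried. */
static void seq_tracker_record(struct seq_tracker *t, uint32_t seq)
{
    if (seq >= t->expected) {
        t->out_of_range++;
        return;
    }
    if (t->seen[seq])
        t->duplicates++;
    t->seen[seq] = 1;
}

/*
 * Store up to 'max' missing sequence numbers in 'out' (ascending order)
 * and return the total number missing, which may exceed 'max'.
 */
static uint32_t seq_tracker_missing(const struct seq_tracker *t,
                                    uint32_t *out, uint32_t max)
{
    uint32_t n = 0;

    for (uint32_t i = 0; i < t->expected; i++) {
        if (!t->seen[i]) {
            if (n < max)
                out[n] = i;
            n++;
        }
    }
    return n;
}
```

Calling seq_tracker_record once per receive completion and seq_tracker_missing at the end of the run lists which sequence numbers never showed up. A nonzero duplicates or out_of_range count would suggest a tagging or buffer-reuse problem rather than loss.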
