Philip Frey1 wrote:

You are right. Thanks!

I have yet another issue:

Sometimes I get the following message in /var/log/messages of the local host:

post_qp_event - AE qpid 0x4e0 opcode 3 status 0x13 type 0 wrid.hi 0x0 wrid.lo 0x65000000

I was looking for the status and opcode in the source and found that
opcode 3 means T3_SEND and status 0x13 means TPT_ERR_OUT_OF_RQE.
At the remote host I get an opcode 7 (T3_TERMINATE) and status 0x0 (SUCCESS).

Clearly someone is running out of Receive Queue Elements (RQEs). The error occurred when
doing an ibv_post_send() at the local host. Is this a coincidence, or does the local host
somehow know that there are not enough RQEs available at the remote host? In other words,
does TPT_ERR_OUT_OF_RQE refer to the local or to the remote receive queue?


You have to consider the type too. type 0 indicates ingress errors, and type 1 indicates egress.

So the host that logged opcode 3, status 0x13, type 0 received an incoming SEND but had no RECVs posted at that time. In other words, the status refers to the receive queue of the host that logged it. The result is a connection termination, which in turn produces the TERMINATE event on the peer side.
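
To make the reading of such a line concrete, here is a minimal sketch of a decoder for the message quoted above. It is not driver code: the struct and function names (ae_event, parse_ae_line, opcode_name, status_name, type_name) are made up for illustration, the parse format is only assumed from the one log line shown, and the tables contain only the values mentioned in this thread. Anything else is reported as "unknown".

```c
#include <stdio.h>
#include <string.h>

/* Fields of a post_qp_event "AE" line as they appear in the quoted message.
 * Hypothetical helper type, not taken from the driver source. */
struct ae_event {
    unsigned int qpid;
    unsigned int opcode;
    unsigned int status;
    unsigned int type;
};

/* Parse one log line. Returns 0 on success, -1 if the line does not match.
 * Assumption: qpid and status are printed in hex, opcode and type in decimal,
 * as in the example line. */
int parse_ae_line(const char *line, struct ae_event *ev)
{
    const char *p = strstr(line, "AE qpid");
    if (p == NULL)
        return -1;
    if (sscanf(p, "AE qpid %x opcode %u status %x type %u",
               &ev->qpid, &ev->opcode, &ev->status, &ev->type) != 4)
        return -1;
    return 0;
}

/* Only the opcodes named in this thread are listed. */
const char *opcode_name(unsigned int opcode)
{
    switch (opcode) {
    case 3:  return "T3_SEND";
    case 7:  return "T3_TERMINATE";
    default: return "unknown";
    }
}

/* Only the status codes named in this thread are listed. */
const char *status_name(unsigned int status)
{
    switch (status) {
    case 0x0:  return "SUCCESS";
    case 0x13: return "TPT_ERR_OUT_OF_RQE";
    default:   return "unknown";
    }
}

/* type 0 = ingress (error on incoming traffic), type 1 = egress. */
const char *type_name(unsigned int type)
{
    switch (type) {
    case 0:  return "ingress";
    case 1:  return "egress";
    default: return "unknown";
    }
}
```

Feeding it the line from /var/log/messages gives opcode T3_SEND, status TPT_ERR_OUT_OF_RQE, type ingress: an incoming SEND hit an empty receive queue on the host that logged the event.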

Steve.