Philip Frey1 wrote:
You are right. Thanks!

I have yet another issue: sometimes I get the following message in
/var/log/messages of the local host:

post_qp_event - AE qpid 0x4e0 opcode 3 status 0x13
type 0 wrid.hi 0x0 wrid.lo 0x65000000

I looked up the status and opcode in the source and found that opcode 3
means T3_SEND and status 0x13 means TPT_ERR_OUT_OF_RQE. At the remote
host I get an opcode 7 (T3_TERMINATE) and status 0x0 (SUCCESS).

Clearly someone is running out of Receive Queue Elements. The error
occurred when doing an ibv_post_send() at the local host. Is this a
coincidence, or does the local host somehow know that there are not
enough RQEs available at the remote host? In other words, does
TPT_ERR_OUT_OF_RQE refer to the local or to the remote receive queue?

You have to consider the type too: type 0 indicates ingress errors, and
type 1 indicates egress errors.

So the host that logged opcode 3, status 0x13, type 0 received an
incoming SEND while no RECVs were posted. In other words, the
TPT_ERR_OUT_OF_RQE refers to the receive queue of the host that logged
it. The result is a connection termination, which produces the
TERMINATE event on the peer side.

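
As a rough illustration of the decoding described above, here is a small Python sketch that reads a post_qp_event line and reports opcode, status and direction. The tables contain only the values mentioned in this thread (opcodes 3 and 7, statuses 0x0 and 0x13, types 0 and 1); anything else is reported as "unknown". The function name decode_ae and the regular expression are assumptions made for this example and are not part of the driver.

```python
import re

# Partial tables: only the values named in this thread are listed.
OPCODES = {3: "T3_SEND", 7: "T3_TERMINATE"}
STATUSES = {0x0: "SUCCESS", 0x13: "TPT_ERR_OUT_OF_RQE"}
DIRECTIONS = {0: "ingress", 1: "egress"}

# Pattern assumed from the single log line quoted in this thread.
AE_PATTERN = re.compile(
    r"AE\s+qpid\s+(?P<qpid>0x[0-9a-fA-F]+)\s+opcode\s+(?P<opcode>\d+)\s+"
    r"status\s+(?P<status>0x[0-9a-fA-F]+)\s+type\s+(?P<type>\d+)"
)


def decode_ae(message):
    """Decode a post_qp_event message; return None if it does not match."""
    match = AE_PATTERN.search(message)
    if match is None:
        return None
    opcode = int(match.group("opcode"))
    status = int(match.group("status"), 16)
    ae_type = int(match.group("type"))
    return {
        "qpid": match.group("qpid"),
        "opcode": OPCODES.get(opcode, "unknown"),
        "status": STATUSES.get(status, "unknown"),
        "direction": DIRECTIONS.get(ae_type, "unknown"),
    }


if __name__ == "__main__":
    line = ("post_qp_event - AE qpid 0x4e0 opcode 3 status 0x13 "
            "type 0 wrid.hi 0x0 wrid.lo 0x65000000")
    print(decode_ae(line))
```

For the message quoted above, the direction comes out as "ingress", which is the detail that places the exhausted receive queue on the host that logged the event.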
Steve.