Thanks for your reply!
I used a non-zero value for min_rnr_timer like this:
qp_attr.min_rnr_timer = /*0*/ IB_RNR_NAK_TIMER_2_56;
To keep things simple, I put the sender and the receiver in the same process rather than synchronizing two processes. The data was transferred correctly, as confirmed by printing out the contents of the buffers.
However, I have another question: the "number of bytes transferred" in the VAPI_CQE_RQ_SEND_DATA CQE matched the byte count of the Recv request, but in the VAPI_CQE_SQ_SEND_DATA CQE it was zero, which was not what I expected. What do you think is the cause?
Thanks very much!
Here is the relevant code:
=========== print the WC descriptor ===========
void print_wc_desc(VAPI_wc_desc_t *wc_desc_p)
{
    if (wc_desc_p != NULL) {
        printf("status: %d\n", wc_desc_p->status);
        /* explicit casts keep the format specifiers valid on 32- and 64-bit builds */
        printf("id: %llu\n", (unsigned long long)wc_desc_p->id);
        printf("opcode: %s\n", get_cqe_opcode_str(wc_desc_p->opcode));
        printf("Num. of bytes transferred: %u\n", (unsigned int)wc_desc_p->byte_len);
        printf("...");
    }
}
=========== wait for the Send to complete ===========
/* poll the send CQ at most 10 times */
do {
    poll_cnt++;
    res = VAPI_poll_cq(hca_hndl, s_cq_hndl, &wc_desc);
    if (res != VAPI_OK && res != VAPI_CQ_EMPTY) {
        PRINT_ERR("Poll CQ block failed\n");
        VAPIERR(res);
        return -1;
    }
} while (res == VAPI_CQ_EMPTY && poll_cnt < 10);
/* if the CQ is still empty, wc_desc was never filled in, so do not read it */
if (res == VAPI_CQ_EMPTY) {
    PRINT_ERR("No completion: CQ still empty\n");
    return -1;
}
if (wc_desc.status != VAPI_SUCCESS) {
    PRINT_ERR("Req unsuccess: %s\n", VAPI_wc_status_sym(wc_desc.status));
    print_wc_desc(&wc_desc);
    return -1;
}
PRINT_TRACE("Req success\n");
print_wc_desc(&wc_desc);
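The bounded polling pattern above can also be sketched without the VAPI headers. The following is only an illustration under assumptions: mock_cq_t, completion_t, poll_result_t, mock_poll_cq and wait_for_completion are all hypothetical names made up for this sketch and are not part of VAPI. It shows one way to tell "got a completion" apart from "gave up while the CQ was still empty", so that the completion descriptor is only read when it was actually filled in.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the VAPI return codes and WC descriptor. */
typedef enum { POLL_OK, POLL_CQ_EMPTY, POLL_ERROR } poll_result_t;
typedef struct { int status; unsigned long long id; } completion_t;

/* Mock CQ: reports "empty" for the first ready_after polls, then one completion. */
typedef struct { int polls_seen; int ready_after; } mock_cq_t;

poll_result_t mock_poll_cq(mock_cq_t *cq, completion_t *wc)
{
    assert(cq != NULL && wc != NULL);
    if (cq->polls_seen++ < cq->ready_after)
        return POLL_CQ_EMPTY;   /* nothing completed yet, wc is left untouched */
    wc->status = 0;             /* 0 stands for "success" in this mock */
    wc->id = 1;
    return POLL_OK;
}

/*
 * Poll at most max_polls times.
 * Returns  0 if a completion was written to *wc,
 *         -1 if the poll itself failed,
 *         -2 if the CQ was still empty after max_polls attempts.
 */
int wait_for_completion(mock_cq_t *cq, completion_t *wc, int max_polls)
{
    poll_result_t res = POLL_CQ_EMPTY;
    int i;

    for (i = 0; i < max_polls && res == POLL_CQ_EMPTY; i++) {
        res = mock_poll_cq(cq, wc);
        if (res == POLL_ERROR)
            return -1;
    }
    if (res == POLL_CQ_EMPTY)
        return -2;              /* timed out: *wc must not be inspected */
    return 0;
}
```

A caller would check the return value first and look at wc.status only when it is 0. With the real API the same shape would presumably apply to VAPI_poll_cq and VAPI_CQ_EMPTY, but that mapping is an assumption of this sketch.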
On 2/19/06, Dotan Barak <[EMAIL PROTECTED]> wrote:
I believe that the problem is that the min_rnr_timer value is 0 (which means infinite timeout between the rnr retries) and there is rnr nak between the two sides (because you don't sync between the sides, and this is the reason for the empty CQ …
Let me describe the problem:
The sender sent a send message which should consume an RR (Receive Request) at the receiver side, but when the message reached the receiver there wasn't any RR in the RQ, so the receiver sent an RNR NAK back to the sender. The sender got the RNR NAK and is waiting for the min_rnr_timer, which is infinite …
You should do the following things:
- Put a non zero value in the min_rnr_timer (you may get completion with bad status: rnr exceeded if the receiver won't be ready in time …)
- Post RR in the responder in the init state
- Optional: sync between the sides (post SR at the sender only when there is RR in the receiver side).
Dotan
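The ordering advice above (post the RR before the SR) can be illustrated with a toy model. This is a deliberately simplified, single-threaded sketch and not VAPI code: recv_queue_t, post_recv, try_deliver and send_msg are hypothetical names, timers are ignored entirely, and the receiver never posts a new RR between retries.

```c
#include <assert.h>
#include <stddef.h>

/* Toy receive queue: only counts how many Receive Requests are posted. */
typedef struct { int posted_rr; } recv_queue_t;

typedef enum { SEND_OK, SEND_RNR_RETRY_EXCEEDED } send_status_t;

/* Receiver side: post one Receive Request. */
void post_recv(recv_queue_t *rq)
{
    assert(rq != NULL);
    rq->posted_rr++;
}

/* One delivery attempt: consumes an RR if one is posted (returns 1),
 * otherwise models an RNR NAK (returns 0). */
int try_deliver(recv_queue_t *rq)
{
    assert(rq != NULL);
    if (rq->posted_rr > 0) {
        rq->posted_rr--;
        return 1;
    }
    return 0;
}

/* Sender side: one initial attempt plus up to rnr_retry retries. */
send_status_t send_msg(recv_queue_t *rq, int rnr_retry)
{
    int attempt;

    for (attempt = 0; attempt <= rnr_retry; attempt++) {
        if (try_deliver(rq))
            return SEND_OK;
    }
    return SEND_RNR_RETRY_EXCEEDED;   /* receiver was never ready */
}
```

In this model, calling post_recv before send_msg makes the send succeed on the first attempt, while sending into an empty queue ends with the retry-exceeded status, which corresponds to the bad completion status mentioned in the first recommendation.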
--
Ian Jiang
[EMAIL PROTECTED]
Laboratory of Spatial Information Technology
Division of System Architecture
Institute of Computing Technology
Chinese Academy of Sciences
