Hi, packet_life_time is 12 in sm.conf.
I'm debugging the polling process and see that ibv_poll_cq returns an error
when mlx4_cqe->owner_sr_opcode takes the MLX4_CQE_OPCODE_ERROR value
(in libmlx4, mlx4_poll_one). Where does this value come from?

Thanks,
Vlad

On Mon, Oct 31, 2011 at 3:00 PM, Or Gerlitz <[email protected]> wrote:
> On Mon, Oct 31, 2011 at 9:08 AM, Vlad Weinbaum
> <[email protected]> wrote:
>> [...] I found detail that I cannot explain. I query the QP after connect and
>> get timeout value 16,
>> that must be 4 us * 2^16 = 256 ms, but I get about 800 ms.
>
> As Sean indicated, the timeout is **based** on the packet_life_time,
> in case you're configuring
> your QP through the rdma-cm, the IB stack code actually adds one to
> the packet_life_time quantity
> as a rough estimate for the local hca ack delay, still maybe there is
> a hole here and the query qp verb
> isn't reporting correctly, what was the value you configured on the sm
> side, and how many retries did
> you use for the qp? each retry will double the time you should be
> observing in practice.
>
> Or.
