yanglimingcn commented on PR #3145:
URL: https://github.com/apache/brpc/pull/3145#issuecomment-3540482500
> We encountered an error: requests were not being sent, causing a large
number of client timeouts.
>
> ```shell
> [E1008]Reached timeout=60000ms @Socket{id=13 fd=1160 addr=xxx:xx}
(0x0x7f957c964ec0) rdma info={rdma_state=ON, handshake_state=ESTABLISHED,
rdma_remote_rq_window_size=63, rdma_sq_window_size=0,
rdma_local_window_capacity=125, rdma_remote_window_capacity=125,
rdma_sbuf_head=57, rdma_sbuf_tail=120, rdma_rbuf_head=36, rdma_unacked_rq_wr=0,
rdma_received_ack=0, rdma_unsolicited_sent=0, rdma_unsignaled_sq_wr=1,
rdma_new_rq_wrs=0, }
> ```
>
> From the RDMA connection information, we found that because the CQ was armed
by `ibv_req_notify_cq` in solicited-only mode, send WCs did not generate
completion events. Without recv CQEs, send WCs could not be polled, so
`_sq_window_size` remained at 0. This is likely the reason why both the client
and the server were unable to send messages.
>
> Using `ibv_req_notify_cq` with `solicited_only=0` would solve this
problem, but it would generate too many events. Therefore, we split the CQ into
`send_cq` (`solicited_only=0`) and `recv_cq` (`solicited_only=1`).

With this modification, will `send_cq` generate one CQE for every 1/4 of the
window?
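
A toy C++ model of the stall described in the quote may help frame the question. It is only a sketch under stated assumptions, not brpc or libibverbs code: `ToyCq` and `WindowAfterSendCompletions` are hypothetical names, every send is assumed to be signaled, and each polled send completion is assumed to return one slot to the send window. `Arm` stands in for `ibv_req_notify_cq(cq, solicited_only)`.

```cpp
#include <cassert>
#include <deque>

// Toy stand-in for a completion queue. Hypothetical type, not libibverbs.
struct ToyCq {
    std::deque<int> cqes;         // queued completions (payload unused)
    bool armed = false;           // true after Arm() until one event fires
    bool solicited_only = false;  // mode requested by the last Arm()
    int events = 0;               // completion events delivered so far

    // Plays the role of ibv_req_notify_cq(cq, solicited_only).
    void Arm(bool only_solicited) {
        armed = true;
        solicited_only = only_solicited;
    }

    // A completion lands in the CQ. `solicited` is true only for a receive
    // whose sender requested a solicited event; send completions pass false.
    void Push(bool solicited) {
        cqes.push_back(0);
        if (armed && (!solicited_only || solicited)) {
            armed = false;  // one event per arm
            ++events;
        }
    }
};

// Returns the send window after `completions` send completions arrive while
// no receive completion shows up. The window starts at 0, as in the log.
int WindowAfterSendCompletions(bool split_cq, int completions) {
    assert(completions >= 0);
    ToyCq shared_or_recv_cq;  // armed solicited-only in both setups
    ToyCq send_cq;            // used only in the split setup
    shared_or_recv_cq.Arm(true);
    if (split_cq) send_cq.Arm(false);

    ToyCq& cq = split_cq ? send_cq : shared_or_recv_cq;
    const bool rearm_solicited_only = !split_cq;

    int sq_window = 0;
    for (int i = 0; i < completions; ++i) {
        const int events_before = cq.events;
        cq.Push(false);                             // a send completion
        if (cq.events == events_before) continue;   // no event, nobody polls
        while (!cq.cqes.empty()) {                  // event handler drains CQ
            cq.cqes.pop_front();
            ++sq_window;                            // one slot per completion
        }
        cq.Arm(rearm_solicited_only);               // re-arm for next event
    }
    return sq_window;
}
```

In this model `WindowAfterSendCompletions(false, 4)` returns 0 (the stall) and `WindowAfterSendCompletions(true, 4)` returns 4. The split setup here fires one event per send completion, so how many events the real `send_cq` produces depends on how often sends are signaled, which is what the question above is about.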
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]