chenBright commented on PR #3145:
URL: https://github.com/apache/brpc/pull/3145#issuecomment-3540115583
We encountered an error: requests were not being sent, causing a large
number of client timeouts.
```shell
[E1008]Reached timeout=60000ms @Socket{id=13 fd=1160 addr=xxx:xx}
(0x0x7f957c964ec0) rdma info={rdma_state=ON, handshake_state=ESTABLISHED,
rdma_remote_rq_window_size=63, rdma_sq_window_size=0,
rdma_local_window_capacity=125, rdma_remote_window_capacity=125,
rdma_sbuf_head=57, rdma_sbuf_tail=120, rdma_rbuf_head=36, rdma_unacked_rq_wr=0,
rdma_received_ack=0, rdma_unsolicited_sent=0, rdma_unsignaled_sq_wr=1,
rdma_new_rq_wrs=0, }
```
From the RDMA connection information, we found that because
`ibv_req_notify_cq` was armed with `solicited_only=1`, send WCs did not
generate completion events. Without recv CQEs to trigger polling, send WCs were
never polled, so `_sq_window_size` remained at 0. This is likely why neither
the client nor the server could send messages.
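
To make the stall concrete, here is a minimal, self-contained sketch of the notification rule. It is a hypothetical model, not brpc or libibverbs code: `ModelCq`, `Cqe`, and `SendWindowAfterBurst` are names invented for illustration, and it assumes every send is signaled and that the send window is replenished only when send completions are polled.

```cpp
#include <cassert>
#include <deque>

// Hypothetical model of CQ event notification; not brpc or libibverbs code.
struct Cqe {
    bool is_send;    // true: send completion, false: receive completion
    bool solicited;  // receive completion for a message sent with the solicited flag
    bool success;    // completion status
};

class ModelCq {
public:
    // Same meaning as the solicited_only argument of ibv_req_notify_cq.
    void Arm(bool solicited_only) {
        armed_ = true;
        solicited_only_ = solicited_only;
    }

    // Queues a CQE; returns true if the armed CQ raises a completion event.
    bool Push(const Cqe& cqe) {
        queue_.push_back(cqe);
        if (!armed_) return false;
        const bool is_solicited = !cqe.success || (!cqe.is_send && cqe.solicited);
        if (solicited_only_ && !is_solicited) return false;
        armed_ = false;  // an armed CQ raises at most one event until re-armed
        return true;
    }

    // Drains the queue and returns how many send completions it held.
    int PollSendCompletions() {
        int n = 0;
        for (const Cqe& cqe : queue_) {
            if (cqe.is_send) ++n;
        }
        queue_.clear();
        return n;
    }

private:
    std::deque<Cqe> queue_;
    bool armed_ = false;
    bool solicited_only_ = false;
};

// Posts `window` sends on a CQ armed with the given mode and returns the send
// window left after the event handler (if any event fired) has polled the CQ.
int SendWindowAfterBurst(int window, bool solicited_only) {
    assert(window > 0);
    ModelCq cq;
    cq.Arm(solicited_only);
    int sq_window = window;
    bool event = false;
    for (int i = 0; i < window; ++i) {
        --sq_window;  // each posted send consumes one slot
        event = cq.Push(Cqe{true, false, true}) || event;
    }
    if (event) sq_window += cq.PollSendCompletions();  // slots return only when polled
    return sq_window;
}
```

In this model, `SendWindowAfterBurst(4, true)` leaves the window at 0 because no event ever wakes the poller, while `SendWindowAfterBurst(4, false)` restores it to 4.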
Using `ibv_req_notify_cq` with `solicited_only=0` would solve this problem,
but it would generate too many events. Therefore, we split the CQ into
`send_cq` (`solicited_only=0`) and `recv_cq` (`solicited_only=1`).
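
The trade-off between the three layouts can be sketched by counting completion events for the same traffic. This is a simplified, hypothetical model rather than the code in this PR: `Cqe`, `RaisesEvent`, `CountEvents`, and `MakeTraffic` are illustrative names, and it assumes every completion succeeds and the CQ is re-armed before each CQE arrives (the worst case for event count).

```cpp
#include <cassert>
#include <vector>

// Hypothetical model; not brpc or libibverbs code.
struct Cqe {
    bool is_send;    // true: send completion, false: receive completion
    bool solicited;  // receive completion for a message sent with the solicited flag
};

// True if a CQ armed with `solicited_only` raises an event for this CQE.
// Assumes the completion status is successful.
bool RaisesEvent(const Cqe& cqe, bool solicited_only) {
    return !solicited_only || (!cqe.is_send && cqe.solicited);
}

struct EventCounts {
    int shared_solicited;  // one CQ, solicited_only=1
    int shared_all;        // one CQ, solicited_only=0
    int split;             // send_cq (solicited_only=0) + recv_cq (solicited_only=1)
};

EventCounts CountEvents(const std::vector<Cqe>& traffic) {
    EventCounts c{0, 0, 0};
    for (const Cqe& cqe : traffic) {
        if (RaisesEvent(cqe, true)) ++c.shared_solicited;
        if (RaisesEvent(cqe, false)) ++c.shared_all;
        // Split layout: send completions land on send_cq (solicited_only=0),
        // receive completions land on recv_cq (solicited_only=1).
        if (RaisesEvent(cqe, /*solicited_only=*/!cqe.is_send)) ++c.split;
    }
    return c;
}

// Builds a traffic mix of send, unsolicited receive, and solicited receive CQEs.
std::vector<Cqe> MakeTraffic(int sends, int plain_recvs, int solicited_recvs) {
    std::vector<Cqe> t;
    for (int i = 0; i < sends; ++i) t.push_back(Cqe{true, false});
    for (int i = 0; i < plain_recvs; ++i) t.push_back(Cqe{false, false});
    for (int i = 0; i < solicited_recvs; ++i) t.push_back(Cqe{false, true});
    return t;
}
```

For a mix of 1 send, 2 unsolicited receives, and 1 solicited receive, the shared solicited-only CQ raises 1 event (and never one for the send), the shared CQ with `solicited_only=0` raises 4, and the split layout raises 2: send completions are always noticed, while unsolicited receives still stay quiet.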
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]