tbago opened a new issue #1401: URL: https://github.com/apache/incubator-brpc/issues/1401
We have multiple RDMA NICs in our server. If we use the first RDMA device via **--rdma_device=mlx5_0**, the brpc client and server work well. But if we use another RDMA device, such as **--rdma_device=mlx5_1**, the client cannot communicate with the server. It shows the following error:

<img width="827" alt="WechatIMG24" src="https://user-images.githubusercontent.com/12290248/117993016-efc0d300-b371-11eb-95f3-f8a3d18d4c20.png">

The call to [rdma_create_qp](https://github.com/apache/incubator-brpc/blob/rdma/src/brpc/rdma/rdma_communication_manager.cpp#L311) fails with errno=22 (invalid argument). The root cause seems to be that GetRdmaProtectionDomain() uses the global context. Is it that the global context does not handle this case well, or is there a bug?

I have written my own custom client that uses rdma_cm_id->verbs as the context, and with it I can connect to the brpc server through the other RDMA device. But in [rdma_helper.cpp](https://github.com/apache/incubator-brpc/blob/rdma/src/brpc/rdma/rdma_helper.cpp#L545) the pd is initialized before the rdma_cm_id is created.

So my questions are: does anyone else have the same issue as me? And how should the bug be fixed? Do we need to create the pd after the rdma_cm_id is created?
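To make the question concrete, here is a minimal sketch of one possible direction: keep one pd per device context instead of a single global pd, and look it up by rdma_cm_id->verbs once the rdma_cm_id exists. This is not brpc code. `PerDevicePdCache`, `FakeContext`, `FakePd`, `FakeAllocPd` and `PdMatchesDevice` are hypothetical names, and the fake types stand in for `ibv_context` and `ibv_pd` so the sketch compiles without libibverbs. In real code the allocator would presumably be `ibv_alloc_pd`.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <unordered_map>
#include <utility>

// Hypothetical helper, not part of brpc: caches one protection domain (pd)
// per device context instead of one global pd.
// In real code Context would be ibv_context and Pd would be ibv_pd.
template <typename Context, typename Pd>
class PerDevicePdCache {
public:
    using AllocFn = std::function<Pd*(Context*)>;

    explicit PerDevicePdCache(AllocFn alloc) : alloc_(std::move(alloc)) {}

    // Returns the pd for ctx, allocating it on first use.
    // Returns nullptr if ctx is null or the allocation fails.
    Pd* Get(Context* ctx) {
        if (ctx == nullptr) return nullptr;
        std::lock_guard<std::mutex> lock(mu_);
        auto it = pds_.find(ctx);
        if (it != pds_.end()) return it->second;
        Pd* pd = alloc_(ctx);
        if (pd != nullptr) pds_.emplace(ctx, pd);
        return pd;
    }

private:
    AllocFn alloc_;
    std::mutex mu_;
    std::unordered_map<Context*, Pd*> pds_;
};

// Stand-ins for the verbs types (assumption: only the pd -> context link matters here).
struct FakeContext { const char* name; };
struct FakePd { FakeContext* context; };

// Stand-in for ibv_alloc_pd(): ties the new pd to the given context.
inline FakePd* FakeAllocPd(FakeContext* ctx) { return new FakePd{ctx}; }

// Two devices must get two different pds, each bound to its own context.
inline bool PdMatchesDevice() {
    FakeContext mlx5_0{"mlx5_0"};
    FakeContext mlx5_1{"mlx5_1"};
    PerDevicePdCache<FakeContext, FakePd> cache(FakeAllocPd);
    FakePd* pd0 = cache.Get(&mlx5_0);
    FakePd* pd1 = cache.Get(&mlx5_1);
    assert(pd0 != nullptr && pd1 != nullptr);
    bool ok = pd0 != pd1 &&
              pd0->context == &mlx5_0 &&
              pd1->context == &mlx5_1 &&
              cache.Get(&mlx5_1) == pd1;  // second lookup reuses the cached pd
    delete pd0;
    delete pd1;
    return ok;
}
```

With the real types, the lookup would be something like `cache.Get(cm_id->verbs)` just before `rdma_create_qp(cm_id, pd, &attr)`, so the pd always belongs to the device the connection was resolved on. As far as I understand, passing a NULL pd to `rdma_create_qp` makes librdmacm use the default pd of the rdma_cm_id, which might be another option, but memory registered against the global pd would then not be usable with that QP.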
