tbago opened a new issue #1401:
URL: https://github.com/apache/incubator-brpc/issues/1401


   Our server has multiple RDMA NICs. If we use the first RDMA device with 
**--rdma_device=mlx5_0**, the brpc client and server work well. But if we use 
another RDMA device, such as **--rdma_device=mlx5_1**, the client cannot 
communicate with the server. It shows the following error:
   <img width="827" alt="WechatIMG24" 
src="https://user-images.githubusercontent.com/12290248/117993016-efc0d300-b371-11eb-95f3-f8a3d18d4c20.png">
   
   The call to 
[rdma_create_qp](https://github.com/apache/incubator-brpc/blob/rdma/src/brpc/rdma/rdma_communication_manager.cpp#L311)
 fails with errno=22 (EINVAL, invalid argument). The root cause appears to be 
that GetRdmaProtectionDomain() uses the global context. Perhaps the global 
context does not support this case well, or has a bug? I wrote a custom 
client that uses rdma_cm_id->verbs as the context, and with it I can connect 
to the brpc server through the other RDMA devices.
   
   However, in 
[rdma_helper.cpp](https://github.com/apache/incubator-brpc/blob/rdma/src/brpc/rdma/rdma_helper.cpp#L545)
 the PD is initialized before the rdma_cm_id is created.
   
   So my questions are: has anyone else hit the same issue? And how should 
this bug be fixed? Do we need to create the PD after the rdma_cm_id is created?
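
   To make the suspected mismatch concrete, here is a minimal sketch that does not use the real verbs API. `FakeContext`, `FakePd`, `FakeCmId`, `FakeCreateQp` and `PdCache` are all hypothetical stand-ins for `ibv_context`, `ibv_pd`, `rdma_cm_id`, `rdma_create_qp` and a possible per-device PD lookup; none of them exist in brpc or librdmacm. It assumes, as the errno above suggests, that QP creation is rejected when the PD was allocated on a different device context than the one the connection is bound to.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <map>
#include <memory>

// Hypothetical stand-ins for ibv_context, ibv_pd and rdma_cm_id.
// These are NOT the real verbs types, only a model of their relationship.
struct FakeContext { const char* device_name; };
struct FakePd      { const FakeContext* context; };  // a PD belongs to one device
struct FakeCmId    { const FakeContext* verbs; };    // a connection is bound to one device

// Models the assumed check behind the failure: the PD must come from the
// same device context as the cm_id. Returns 0 on success or EINVAL.
// (The real call reports errors through errno and also accepts a NULL PD;
// both details are left out of this model.)
int FakeCreateQp(const FakeCmId& id, const FakePd* pd) {
    assert(pd != nullptr);
    return pd->context == id.verbs ? 0 : EINVAL;
}

// One PD per device context, created lazily the first time a connection
// on that device needs it. Not thread-safe; a real version would need a lock.
class PdCache {
public:
    const FakePd* GetOrCreate(const FakeContext* ctx) {
        assert(ctx != nullptr);
        std::unique_ptr<FakePd>& slot = pds_[ctx];
        if (!slot) {
            slot.reset(new FakePd{ctx});
        }
        return slot.get();
    }

    std::size_t size() const { return pds_.size(); }

private:
    std::map<const FakeContext*, std::unique_ptr<FakePd>> pds_;
};
```

   In this model, a single global PD created on mlx5_0 makes `FakeCreateQp` return EINVAL for a connection bound to mlx5_1, while `cache.GetOrCreate(id.verbs)` returns a PD that matches the connection. Whether brpc should key the PD by context like this, or simply create it after the rdma_cm_id exists, is exactly the open question.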


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
