randomkang commented on PR #3255: URL: https://github.com/apache/brpc/pull/3255#issuecomment-4178795179
> > > Have you tested the ECE logic?
> > >
> > > 1. Both client and server are enabled.
> > > 2. Client is enabled, server is disabled.
> > > 3. Client is disabled, server is enabled.
> > > 4. Client is disabled, server is disabled.
> > >
> > > I tested 1 and 4, and both work well. ECE is used by the server and client to negotiate the advanced features of the RDMA network card, so 2 and 3 are illegal and I did not test those cases.
>
> > So for cases 2 and 3, we should get an error. Could you try running it with this configuration and see what the result is?

I tested cases 2 and 3 on a model training task, and the task did not report any errors.

The RDMA NIC I used is a Mellanox Technologies MT2910 Family (ConnectX-7). The ECE option for this NIC is 0011 0000 0000 0000 0000 0000 0000 0010. In my opinion, only 3 bits of the ECE option are set, and these 3 advanced RDMA features can take effect on one end only. For other RDMA NICs, cases 2 and 3 may report errors.
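To double-check the bit count quoted above, here is a minimal sketch that parses the binary string and lists which bits are set. The helper names `ParseBinary32` and `SetBitPositions` are hypothetical and are not part of brpc or libibverbs. The sketch only does the arithmetic and makes no claim about what each bit means on a given NIC.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: parse a binary string such as
// "0011 0000 0000 0000 0000 0000 0000 0010" into a 32-bit value.
// Spaces are ignored; any other non-binary character is rejected.
inline uint32_t ParseBinary32(const std::string& text) {
    uint32_t value = 0;
    int digits = 0;
    for (char c : text) {
        if (c == ' ') continue;
        if (c != '0' && c != '1') {
            throw std::invalid_argument("not a binary digit");
        }
        if (++digits > 32) {
            throw std::invalid_argument("more than 32 bits");
        }
        value = (value << 1) | static_cast<uint32_t>(c - '0');
    }
    return value;
}

// Hypothetical helper: return the positions of the set bits,
// counting from bit 0 (least significant) upwards.
inline std::vector<int> SetBitPositions(uint32_t value) {
    std::vector<int> positions;
    for (int bit = 0; bit < 32; ++bit) {
        if (value & (uint32_t{1} << bit)) {
            positions.push_back(bit);
        }
    }
    return positions;
}
```

For the value quoted above, `ParseBinary32` yields 0x30000002 and `SetBitPositions` reports bits 1, 28, and 29, which matches the count of 3 set bits.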
