rhizoma-atractylodis opened a new pull request, #5544:
URL: https://github.com/apache/inlong/pull/5544

   ### Motivation
   Based on my reading of the source code, the DataProxy SDK currently selects 
DataProxy nodes by polling (when sending messages in TCP mode) and by random 
selection (when sending messages in HTTP mode). Polling is not efficient 
enough, and random selection makes load balancing hard to achieve.
   
   ### Changes
   Use a consistent hashing algorithm in place of the original polling and 
random selection.
   
   ### Mechanism Options
   Consistent hashing algorithm with a virtual node mechanism.
   [Refer to the article for details on the algorithm](https://blog.csdn.net/gonghaiyu/article/details/108375298)
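
To make the mechanism concrete, here is a minimal sketch of a consistent hash ring with virtual nodes. It is not the code in this pull request: the class name `ConsistentHashRing`, the `host:port` node strings, the MD5-based hash, and the choice of `groupId` plus `streamId` as the routing key are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the InLong SDK: a consistent hash ring
// that places several virtual nodes on the ring for every real node.
class ConsistentHashRing {
    // Ring position -> real node that owns the virtual node at that position.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(List<String> nodes, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String node : nodes) {
            addNode(node);
        }
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            // Remove the position only if it still belongs to this node.
            ring.remove(hash(node + "#" + i), node);
        }
    }

    // Walk clockwise from the key's position to the first virtual node,
    // wrapping around to the start of the ring if none is found.
    String select(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return entry != null ? entry.getValue() : ring.firstEntry().getValue();
    }

    // First 8 bytes of the MD5 digest, used as a position on the ring.
    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    public static void main(String[] args) {
        // Node addresses and the routing key are made up for this demo.
        ConsistentHashRing ring = new ConsistentHashRing(
                Arrays.asList("10.0.0.1:46801", "10.0.0.2:46801", "10.0.0.3:46801"), 100);
        System.out.println(ring.select("test_group|test_stream"));
    }
}
```

With this layout, removing a node only remaps the keys that node owned, and the virtual nodes spread each real node around the ring so the load is divided more evenly.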
   
   ### Design
   Based on my reading of the source code, the following functions need to be 
modified:
   
   org.apache.inlong.sdk.dataproxy.network.ClientMgr.getClientByRoundRobin(): 
this function obtains the DataProxy node by polling.
   
   org.apache.inlong.sdk.dataproxy.http.InternalHttpSender.sendMessageWithHostInfo(List 
bodies, String groupId, String streamId, long dt, long timeout, TimeUnit 
timeUnit): this function selects a DataProxy node by randomly choosing a 
HostInfo.
   
   The fields of the DataProxy node class need to be updated to carry 
information about virtual nodes.
   
   The hash ring and virtual nodes need to be implemented on the DataProxy side, 
and the strategy for acquiring DataProxy nodes on the SDK side must be updated 
at the same time.

