rhizoma-atractylodis commented on issue #5101:
URL: https://github.com/apache/inlong/issues/5101#issuecomment-1198096296

   ## Motivation
   From my reading of the source code, the DataProxy SDK currently selects 
DataProxy nodes by polling (round-robin) when sending messages in TCP mode and 
by random selection when sending messages in HTTP mode. The polling method is 
not efficient enough, and random selection does not easily achieve load 
balancing.
   ## Changes
   Use a consistent hashing algorithm in place of the original polling and 
random selection.
   ## Mechanism Options
   Consistent hashing algorithm with a virtual node mechanism.
   [Refer to this article for details on the algorithm](https://blog.csdn.net/gonghaiyu/article/details/108375298)
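To make the mechanism concrete, here is a minimal sketch of a consistent hash ring with virtual nodes. It is only an illustration under stated assumptions, not code from the InLong SDK: the class name `ConsistentHashRing`, the `host:port` node strings, the `#i` virtual-node suffix, and the use of the first 8 bytes of an MD5 digest as the hash are all hypothetical choices.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a consistent hash ring; not part of the InLong SDK. */
final class ConsistentHashRing {
    // Sorted map from ring position to the physical node that owns it.
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    ConsistentHashRing(int virtualNodes) {
        if (virtualNodes <= 0) {
            throw new IllegalArgumentException("virtualNodes must be positive");
        }
        this.virtualNodes = virtualNodes;
    }

    /** Places virtualNodes positions on the ring for one physical node. */
    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    /** Removes the positions that belong to the given physical node. */
    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            // Two-argument remove only deletes the entry if it maps to this node.
            ring.remove(hash(node + "#" + i), node);
        }
    }

    /** Returns the node owning the first position at or after the key's hash. */
    String getNode(String key) {
        if (ring.isEmpty()) {
            return null;
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        // Wrap around to the first position when the key hashes past the last one.
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Assumed hash function: first 8 bytes of the MD5 digest as a long. */
    static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

With a ring like this, the sender would call `getNode(...)` with some routing key (which key to use, for example a groupId or a message id, is an open design choice here) instead of advancing a round-robin index or drawing a random number. When a node is removed, only the keys that mapped to that node move to other nodes.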
   ## Design
   Based on my reading of the source code, the following need to be modified:
   - `org.apache.inlong.sdk.dataproxy.network.ClientMgr.getClientByRoundRobin()`: 
this function obtains the DataProxy node by polling.
   - `org.apache.inlong.sdk.dataproxy.http.InternalHttpSender.sendMessageWithHostInfo(List<String> bodies, String groupId, String streamId, long dt, long timeout, TimeUnit timeUnit)`: 
this function selects a DataProxy node by randomly picking a HostInfo.
   - The DataProxy node class: its fields need to be updated to carry 
information about virtual nodes.
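As a rough illustration of the last item, a node descriptor could carry a virtual-node count and derive its ring keys from it. Everything in this sketch is hypothetical: `ProxyNode`, its fields, and the `host:port#i` key format are assumptions for illustration, not the existing DataProxy node class or its API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node descriptor; not the actual DataProxy node class. */
final class ProxyNode {
    private final String host;
    private final int port;
    // Number of positions this node should occupy on the hash ring.
    private final int virtualNodeCount;

    ProxyNode(String host, int port, int virtualNodeCount) {
        if (virtualNodeCount <= 0) {
            throw new IllegalArgumentException("virtualNodeCount must be positive");
        }
        this.host = host;
        this.port = port;
        this.virtualNodeCount = virtualNodeCount;
    }

    /** Returns one string key per virtual node, e.g. "host:port#0". */
    List<String> virtualNodeKeys() {
        List<String> keys = new ArrayList<>(virtualNodeCount);
        for (int i = 0; i < virtualNodeCount; i++) {
            keys.add(host + ":" + port + "#" + i);
        }
        return keys;
    }
}
```

Each key would then be hashed onto the ring. Keeping the count per node, rather than as a global constant, leaves room to give more capable nodes more virtual nodes, though whether that is needed is a separate decision.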
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

