[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586961#comment-13586961
 ] 

Jiannan Wang commented on BOOKKEEPER-508:
-----------------------------------------

Thanks Sijie for the new algorithm. The idea of assigning topic ownership in 
bunches via consistent hashing looks great. My understanding is that the hash 
acts as a cache that avoids unnecessary query operations, and I believe this 
could improve things a lot. But I have some comments:
   * The algorithm may have a load-balancing issue: a newly added Hub has the 
same probability as an existing Hub of owning new topics (the original 
algorithm chooses the least-loaded Hub as leader). But this might not be a big 
problem given BOOKKEEPER-363, which suggests periodically releasing topics to 
balance the load.
   * The algorithm actually implicitly requires Hubs to release topics 
periodically to keep the cache hit rate up: once a Hub becomes unavailable 
(crash, zk session timeout, Hub restart for deployment, etc.) or a new Hub is 
added, the hash algorithm may fail to predict the owner of some topics. As 
time passes, the hash algorithm may rarely hit the owner if ownership does not 
change. So periodically releasing topics is required to keep the cache hit 
rate up; however, releasing a topic is a heavy operation in current Hedwig.
   * In section 3.1, item 3, the "round-robin way for easy" may not work well 
in this case, since it degrades the consistent hash algorithm to a simple hash 
algorithm. In the original consistent hash algorithm, redundant/virtual nodes 
are introduced to balance the load; round-robin breaks this idea.
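
As a rough illustration of the last two points, here is a minimal Python sketch 
of a consistent-hash ring with virtual nodes. It is not Hedwig code and does 
not reproduce the scheme in the attached design document; the names HashRing, 
max_load_factor, stale_fraction, and the hub-N / topic-N identifiers are all 
made up for this example. It measures how evenly topics spread across hubs, 
and what fraction of hash predictions go stale when a hub is added while the 
existing owners keep their topics.

```python
import bisect
import hashlib
from collections import Counter


def _hash(key: str) -> int:
    """Stable 64-bit hash (md5-based, so results do not vary between runs)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring; `vnodes` is the number of points per hub."""

    def __init__(self, hubs, vnodes=1):
        # Each hub contributes `vnodes` points placed pseudo-randomly on the ring.
        self._points = sorted(
            (_hash(f"{hub}#{i}"), hub) for hub in hubs for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._points]

    def owner(self, topic: str) -> str:
        # First ring point clockwise from the topic's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, _hash(topic)) % len(self._keys)
        return self._points[idx][1]


def max_load_factor(ring, hubs, topics):
    """Busiest hub's topic count divided by the ideal even share (1.0 is perfect)."""
    counts = Counter(ring.owner(t) for t in topics)
    ideal = len(topics) / len(hubs)
    return max(counts.values()) / ideal


def stale_fraction(old_ring, new_ring, topics):
    """Fraction of topics whose predicted owner differs between two hub sets.

    If ownership was established under `old_ring` and is never released,
    these are the topics for which the new prediction misses the real owner.
    """
    moved = sum(old_ring.owner(t) != new_ring.owner(t) for t in topics)
    return moved / len(topics)


if __name__ == "__main__":
    topics = [f"topic-{i}" for i in range(10000)]
    hubs = [f"hub-{i}" for i in range(5)]
    for vnodes in (1, 100):
        old = HashRing(hubs, vnodes)
        new = HashRing(hubs + ["hub-5"], vnodes)
        print(
            f"vnodes={vnodes}",
            f"max_load_factor={max_load_factor(old, hubs, topics):.2f}",
            f"stale_after_add={stale_fraction(old, new, topics):.3f}",
        )
```

With a single point per hub the load factor depends entirely on where the few 
points happen to land, whereas many virtual nodes per hub usually bring it 
much closer to 1. The stale fraction is the share of topics whose owners would 
have to release them before the hash prediction hits again.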
                
> Better topic assignment algorithm
> ---------------------------------
>
>                 Key: BOOKKEEPER-508
>                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-508
>             Project: Bookkeeper
>          Issue Type: Sub-task
>          Components: hedwig-server
>            Reporter: Sijie Guo
>             Fix For: 4.3.0
>
>         Attachments: topicassignment.pdf
>
>
> Currently each hub server caches only the topics it owns. For topics it does 
> not own, the hub server has to query the metadata store to find the topic 
> owner.
> The bad thing is that clients access hub servers through a VIP, which is 
> round-robin, so many requests miss the owner, 
> causing lots of traffic to the metadata store.
> We need to provide a better algorithm to avoid unnecessary metadata traffic.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
