[ https://issues.apache.org/jira/browse/BOOKKEEPER-508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586961#comment-13586961 ]
Jiannan Wang commented on BOOKKEEPER-508:
-----------------------------------------
Thanks Sijie for the new algorithm. The idea of bulk topic ownership
assignment based on consistent hashing looks great. My understanding is
that the hash acts as a cache to reduce unnecessary query operations, and I
believe this could help a lot. But I have some comments:
* The algorithm may have a load-balancing issue: a newly added Hub has the same
probability as an old Hub of owning new topics (the original algorithm chooses
the least-loaded Hub as leader). But this might not be a big problem given
BOOKKEEPER-363, which suggests periodically releasing topics to balance the
load.
* The algorithm actually implicitly requires Hubs to release topics
periodically to keep the cache hit rate up: once a Hub becomes unavailable
(crash, zk session timeout, Hub restart for deployment, etc.) or a new Hub is
added, the hash algorithm may fail to predict the owner of some topics. As time
passes, the hash algorithm may rarely hit the owner if ownership does not
change. So periodically releasing topics is required to keep the cache hit rate
up; however, releasing a topic is a heavy operation in current Hedwig.
* In section 3.1, item 3, the "round-robin way for easy" may not work well in
this case, since it degrades the consistent hash algorithm to a simple hash
algorithm. In the original consistent hash algorithm, redundant/virtual nodes
are introduced to balance the load; round-robin breaks this idea.
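
To make the last two points concrete, here is a minimal sketch of a consistent
hash ring with virtual nodes. It is not Hedwig code and not the algorithm from
the attached document: the names HashRing, predict_owner and vnodes, the hub
and topic names, and the choice of MD5 are all assumptions made for
illustration. It shows that when a Hub is added, the predicted owner changes
only for the topics that now map to the new Hub. Those are exactly the topics
for which the prediction would miss as long as the old Hub keeps ownership.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring; each hub is placed at `vnodes` ring positions."""

    def __init__(self, hubs, vnodes=100):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> hub name
        for hub in hubs:
            self.add_hub(hub)

    def add_hub(self, hub):
        # Virtual nodes: one hub occupies many positions to even out the load.
        for i in range(self.vnodes):
            point = _hash(f"{hub}#{i}")
            self._owner[point] = hub
            bisect.insort(self._points, point)

    def predict_owner(self, topic):
        """Return the hub at the first ring position at or after the topic's hash."""
        idx = bisect.bisect_left(self._points, _hash(topic))
        return self._owner[self._points[idx % len(self._points)]]


topics = [f"topic-{i}" for i in range(10000)]
ring = HashRing(["hub-a", "hub-b", "hub-c"])
before = {t: ring.predict_owner(t) for t in topics}

ring.add_hub("hub-d")
moved = [t for t in topics if ring.predict_owner(t) != before[t]]

# Fraction of topics whose predicted owner changed after adding one hub.
print(len(moved) / len(topics))
```

With a single position per Hub (vnodes=1) the ring is still consistent, but the
share of topics per Hub can become very uneven, which is the load-balancing
role that virtual nodes play.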
> Better topic assignment algorithm
> ---------------------------------
>
> Key: BOOKKEEPER-508
> URL: https://issues.apache.org/jira/browse/BOOKKEEPER-508
> Project: Bookkeeper
> Issue Type: Sub-task
> Components: hedwig-server
> Reporter: Sijie Guo
> Fix For: 4.3.0
>
> Attachments: topicassignment.pdf
>
>
> Currently each hub server caches only the topics it owns. For topics it does
> not own, the hub server has to query the metadata store to find the topic
> owner.
> The problem is that clients access hub servers through a VIP that is
> round-robin, so many requests land on a hub that is not the owner,
> causing a lot of traffic to the metadata store.
> We need a better algorithm to avoid unnecessary metadata traffic.