[GitHub] [helix] jiajunwang commented on pull request #1307: Add TrieClusterTopology for retrieving hierarchical topology

GitBox Tue, 25 Aug 2020 13:09:24 -0700


jiajunwang commented on pull request #1307:
URL: https://github.com/apache/helix/pull/1307#issuecomment-680243825



   > > Seems to be similar to the current Topology.java design. Why we need a 
new class?
   > > If anything missed in Topology.java, we can add new methods there to 
read the tree and return the desired information.
   > 
   > This class is purely to collect the instance configs and return to the 
users in the format they want. Helix has no logic involved in this part and we 
do not do any calculation. We considered topology, and the conclusion is that 
it will make both more complicated for now. Topology is mainly used for 
rebalancing purpose, and Helix adds default value if there is anything missing, 
and also Topology actually only supports two levels, as shown here:
   > 
   > public enum Types {
   > ROOT,
   > ZONE,
   > INSTANCE
   > }
   > 
   > Although other levels keys are inserted in the tree, they're ignored 
during further calculation. This guideline is hard coded everywhere through 
that class. If we're going to change that, it's basically rewriting it.
   > Another big issue is that Topology is tightly coupled with `faultZoneType` 
in different logic. But the new API totally ignore that field.
   > Although in the long run we may consolidate two classes, the prerequisite 
will be topology can support real multiple level (more than 2) hierarchy 
without hardcoding anything.
   > Trie data structure is more suitable for this use case regarding 
simpleness and efficiency. This class will only be used when users query domain 
information for the cluster or for any specific domain.
   
   As we discussed, let's define the detailed requirement first and then refine 
the java class design.
   But to answer some of you comment here, my premature reply would be,
   1. I think adding default value logic needs to be delivered to the user as 
well. They need to understand how Helix locates an instance. It is a Helix API 
eventually. If the rest API tells the user that this node has no definition, 
but internally, it is rebalanced as in the default fault zone, then it would be 
very confusing.
   2. Topology.java supports all possible Domain definition, not only 2 levels. 
Although the other layers are not used by the rebalancer, you can get the full 
tree structure from Topology.java if simply adding a method that traverses the 
tree like you what you are doing in this new class. I'm not aware of the 
necessity of rewriting.
   3. faultZoneType is purely optional.
   4. I agree that the trie data structure is more preferred. But it seems not 
a must for this requirement. If it is a normal tree, the get methods can still 
be developed, right?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [helix] jiajunwang commented on pull request #1307: Add TrieClusterTopology for retrieving hierarchical topology

Reply via email to