jiajunwang commented on pull request #1307:
URL: https://github.com/apache/helix/pull/1307#issuecomment-680243825
> > Seems to be similar to the current Topology.java design. Why we need a
new class?
> > If anything missed in Topology.java, we can add new methods there to
read the tree and return the desired information.
>
> This class is purely to collect the instance configs and return to the
users in the format they want. Helix has no logic involved in this part and we
do not do any calculation. We considered topology, and the conclusion is that
it will make both more complicated for now. Topology is mainly used for
rebalancing purpose, and Helix adds default value if there is anything missing,
and also Topology actually only supports two levels, as shown here:
>
> public enum Types {
> ROOT,
> ZONE,
> INSTANCE
> }
>
> Although other levels keys are inserted in the tree, they're ignored
during further calculation. This guideline is hard coded everywhere through
that class. If we're going to change that, it's basically rewriting it.
> Another big issue is that Topology is tightly coupled with `faultZoneType`
in different logic. But the new API totally ignore that field.
> Although in the long run we may consolidate two classes, the prerequisite
will be topology can support real multiple level (more than 2) hierarchy
without hardcoding anything.
> Trie data structure is more suitable for this use case regarding
simpleness and efficiency. This class will only be used when users query domain
information for the cluster or for any specific domain.
As we discussed, let's define the detailed requirement first and then refine
the java class design.
But to answer some of you comment here, my premature reply would be,
1. I think adding default value logic needs to be delivered to the user as
well. They need to understand how Helix locates an instance. It is a Helix API
eventually. If the rest API tells the user that this node has no definition,
but internally, it is rebalanced as in the default fault zone, then it would be
very confusing.
2. Topology.java supports all possible Domain definition, not only 2 levels.
Although the other layers are not used by the rebalancer, you can get the full
tree structure from Topology.java if simply adding a method that traverses the
tree like you what you are doing in this new class. I'm not aware of the
necessity of rewriting.
3. faultZoneType is purely optional.
4. I agree that the trie data structure is more preferred. But it seems not
a must for this requirement. If it is a normal tree, the get methods can still
be developed, right?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]