Hi Kishore,
Thanks for creating the JIRA. I will try to respond to this mail here,
but please let me know if you would like to continue further discussion
on the issue in the JIRA going forward.
> The reasoning behind having a consistent naming scheme is to provide a
> consistent mechanism of assigning partitions to nodes even after
> restarts. This is important for stateful systems where we don't want to
> move the data
I see the need for stability in naming instances to avoid a complete
reshuffle on cluster restart. However, IMO this is a consequence of
Helix's design of having ZooKeeper be the single source of truth when
the cluster is not running.
Let's say Helix had an alternate approach: while the cluster is
running, ZooKeeper is used as the source of truth regarding the
locations of resource partitions. When the cluster starts up, however,
ZK starts with a clean slate that is incrementally populated as
instances join the cluster, based on the partitions each instance
reports during the "join" process. Beyond that point, Helix continues
doing what it does today.
With this approach, instance names matter only while the cluster is
running and have no stability requirements across restarts. However,
this is a huge change for Helix, and I am sure you have probably
thought about this as a possible direction; I would like to hear your
thoughts on this topic.
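To make the alternate approach concrete, here is a minimal Java sketch of the join-time reporting I have in mind. ClusterRegistry, onInstanceJoin, and ownerOf are all hypothetical names standing in for the ZK-backed state; none of this is a real Helix API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: ZK starts empty on cluster startup and is
// populated from the partitions each instance reports when it joins.
class ClusterRegistry {
    // partition -> owning instance, rebuilt from scratch on each cluster start
    private final Map<String, String> partitionOwners = new HashMap<>();

    // Called during the "join" process; the instance reports the
    // partitions it holds locally and the registry records them.
    void onInstanceJoin(String instanceId, List<String> localPartitions) {
        for (String p : localPartitions) {
            // first reporter wins; later claims on the same partition are ignored
            partitionOwners.putIfAbsent(p, instanceId);
        }
    }

    String ownerOf(String partition) {
        return partitionOwners.get(partition);
    }
}
```

Since data stays on the instance that reports it, a full-cluster restart rebuilds the same mapping without moving any partitions, whatever names the instances happen to come back with.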
> on restarts. Another (not really technical but more practical) reason
> is to avoid rogue instances connecting to the cluster with random ids
> due to code bugs or misconfiguration.
I completely agree with the need to handle the rogue/misconfigured
instances case.
> This requirement has come up multiple times at LinkedIn and on other
> threads. Would a feature like auto-create instance on join and delete
> on leave be helpful? We can have this flag set at the cluster level
> when the cluster is created, so we can throw an exception if the flag
> is set to false and the node is not already created.
While the above feature would be great for adding new instances with
little configuration (and for zero-configuration while testing), there
still needs to be a way to handle a loaded cluster restart without leading to
a massive reshuffle.
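To illustrate the flag you describe, here is a rough Java sketch of a cluster-level auto-join policy. JoinPolicy, register, and join are hypothetical names, not actual Helix configuration or APIs.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of a cluster-level auto-join flag: when the flag
// is off, only pre-registered instances may connect, which keeps
// rogue/misconfigured instances out of the cluster.
class JoinPolicy {
    private final boolean autoCreateOnJoin;            // set when the cluster is created
    private final Set<String> registeredInstances = new HashSet<>();

    JoinPolicy(boolean autoCreateOnJoin) {
        this.autoCreateOnJoin = autoCreateOnJoin;
    }

    // Pre-register an instance explicitly (the current, configured path).
    void register(String instanceId) {
        registeredInstances.add(instanceId);
    }

    // Throws if the flag is false and the instance was never registered.
    void join(String instanceId) {
        if (!autoCreateOnJoin && !registeredInstances.contains(instanceId)) {
            throw new IllegalStateException(
                "Unknown instance " + instanceId + " and auto-join is disabled");
        }
        registeredInstances.add(instanceId);
    }
}
```

The flag gives zero-configuration joins for testing while still letting a production cluster lock the membership down.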
Thanks,
Vinayak