RE: API Review: Helix Admin - Feedback

Kanak Biscuitwala Tue, 25 Feb 2014 11:08:04 -0800

Alright, here's where I think we agree so far:

- Single class for building common rebalancer configs
- Some form of factory for getting a HelixConnection
- Keeping building constructors empty, but validating on build
- Having an admin scoped to a single cluster is probably cleaner


Here's what we still need to discuss:

- The complexity of the helix connection factory:

I'm not sure what benefit to the user (or developer) this buys us. I mean 
theoretically we could reuse open HelixConnections, but I don't think that's 
what you're suggesting. Right now, it seems like more code for us and the user 
to do the same thing.

- How to let applications gracefully shut down:

We provide FINALIZE callbacks to users so they can do what is necessary before 
we shut down.

- The necessity of a multi-cluster admin:

There's only one method that we need to support that is beyond the scope of a 
single cluster: listClusters(). The rest can and should be scoped to a single 
cluster; it's just a question of where we stick this one method.

- The role of an admin in the cluster, and whether or not it should be the 
source of all APIs or if the connection should be:

Just so we're on the same page, let me briefly describe how administration 
typically works in systems that use Helix:

1. There are tools that change the cluster state that will exclusively work 
through an administrator.
2. Participants and spectators don't change the cluster state most of the time, 
but at some point may need to save a config or store a property, and thus need 
administration support.

Generally Helix systems are driven through participant state transition 
callbacks and are generally low-touch with respect to configurations over time. 
I would recommend looking at the other wiki pages first for some context. So I 
disagree that everything should hang off of HelixAdmin.

By the way, being able to share a connection is something that was actually 
requested by a user. There are legitimate reasons for wanting a single live 
connection across multiple roles on a single node.

- Whether or not multiple admins attached to a single connection should be 
legal:

The administrator is passive. It doesn't run things in the background and 
absolutely has nothing to clean up.

Kanak

----------------------------------------
> Date: Tue, 25 Feb 2014 08:24:30 -0800
> Subject: Re: API Review: Helix Admin - Feedback
> From: [email protected]
> To: [email protected]
>
> Actually coming to think of it I am starting to think why even open up
> the HelixConnection? Why cant someone create a HelixAdmin instance
> directly and we manage the connections internally. The lesser the
> objects exposed through API the better.
>
> Thanks,
>
> Sandeep
>
> On Tue, Feb 25, 2014 at 12:27 AM, Sandeep Nayak <[email protected]> wrote:
>> [Sandeep] See responses inline.
>>
>> On Mon, Feb 24, 2014 at 11:05 PM, Kanak Biscuitwala <[email protected]> 
>> wrote:
>>> My responses are inline. Others please feel free to chime in.
>>>
>>> General comments:
>>>
>>> - I think HelixDataAccessor serves as a good basis for an SPI (which I have 
>>> largely ignored in the wiki pages I posted). It exposes leaf nodes as 
>>> HelixProperty instances and is entirely decoupled from ZooKeeper.
>>> - Things I want to avoid: class/interface/code-size explosion and 
>>> overabstraction. I'm still not sure what the right balance is.
>>
>>>
>>> ----------------------------------------
>>>> Date: Mon, 24 Feb 2014 22:02:29 -0800
>>>> Subject: API Review: Helix Admin - Feedback
>>>> From: [email protected]
>>>> To: [email protected]
>>>>
>>>> Hey guys,
>>>>
>>>> Here is my first set of feedback, thought I should start this in
>>>> chunks rather than all in one go, so picking 3 items at a time.
>>>>
>>>> Connection Management
>>>>
>>>> * HelixConnection seems like it belongs to API
>>>
>>> Could you clarify what you mean by "belongs to API?"
>>
>> [Sandeep] So my notion of API is all interfaces/contracts which a
>> Helix user can use safely. API is our contract to Helix users, as long
>> as they stick to the API we make sure the contract is retained. In
>> that regard ZKHelixAdmin should not be in API, it should belong
>> somewhere in a core component. API is what we can hand over to a Helix
>> user and say here, adhere to this and these contracts will be
>> respected in future versions, we will only remove them through a
>> deprecation process and you will get enough time to migrate.
>>
>>>
>>>> * ZKHelixConnection is a specialization specific to Zookeeper
>>>> * We need an factory pattern which allows creating Connections based
>>>> on a provider-type. A provider type could be an enum which has
>>>> Zookeeper to begin with and we can add other types later. The factory
>>>> should be a part of SPI, one implementation of SPI is Zookeeper.
>>>
>>> Do you mean:
>>>
>>> HelixConnection connection = 
>>> HelixConnectionFactory.getConnection(ConnectionType.ZOOKEEPER, zkAddress);
>>>
>>> Can you write example code for this?
>>
>> [Sandeep] This is not API so not visible to the end user, this is
>> internal to Helix core.
>>
>> HelixConnectionFactory becomes an interface which connection providers
>> need to implement. One provider is zookeeper.
>>
>> enum HelixStoreProvider{ //Did not figure a better name, but since we
>> store data in zookeeper thought of it as store
>> ZOOKEEPER,
>>
>> IN_MEMORY; //Just a sample not that someone will write one, but
>> demonstrates multiplicity of providers
>> }
>>
>> interface HelixConnectionFactory{
>> getConnection(String address);
>>
>> String getName(); //you can make the return type an enum if we
>> control all providers added
>> OR
>> HelixStoreProvider getProvider(); //if you use an enum then it
>> would be as such
>>
>> }
>>
>> ZookeeperConnectionFactory implements HelixConnectionFactory{
>> getConnection(String address){
>>
>> }
>>
>> String getName(){
>> return "zookeeper";
>> }
>> OR
>>
>> HelixStoreProvider getProvider(){
>> return HelixStoreProvider.ZOOKEEPER;
>> }
>> }
>>
>> HelixConnectionFactoryRegistry contains all registered connection
>> factories, its a map retained
>> Map<String, HelixConnectionFactory> connectionFactories or
>> Map<HelixStoreProvider, HelixConnectionFactory>
>>
>> We provider a mechanism for connection factories to be registered in
>> the registries, a common model is like the java service discovery
>> model. The registry would load all configured connection factories we
>> place as a comma class names in
>> META-INF/services/org.apache.helix.spi.HelixConnectionfactory. The
>> file contains org.apache.helix.spi.impl.ZookeeperHelixConnectionFactory
>>
>>
>> So in the call would look like such.
>>
>> new 
>> HelixConnectionBuilder.usingProvider(HelixStoreProvider.ZOOKEEPER).toAddress(zkAddress).build();
>>
>>>
>>>> * Should the zkAddress be a String? Should it be a special type like
>>>> HelixProviderURL? Having something like String requires us to document
>>>> it clearly but having a a specialization like a HelixProviderURL
>>>> allows a guided building of such URL. Not sure if its an overkill but
>>>> If we use a provider URL we should allow creating them through a
>>>> builder pattern.
>>>
>>> I would say this is overkill. Each layer of abstraction is more code for 
>>> the users. Most deployments know where ZK is and it's provided to them as a 
>>> string.
>>
>> [Sandeep] Yes, I thought so as much but was not sure if we need to
>> safeguard against invalid urls, seems like its well understood
>> pattern.
>>>
>>>> * We need to declare exceptions when connect fails
>>>
>>> I've ignored failure cases entirely for now.
>>
>> [Sandeep] Lets make sure we dont lose track of exceptions, when we get
>> to some level of completion with the API lets iterate through
>> exceptions.
>>
>>>
>>>> * We need to allow for a time delay for the shutdown, something like a
>>>> future.get(timeout)
>>>>
>>>
>>> I'm not super experienced with this, so could you give an example of why 
>>> this is preferred to alternative methods of preemption?
>>
>> [Sandeep] The reason for this is allow for a shutdown API which does
>> not shutdown immediately but allows the users to trigger shutdown with
>> a grace period. The delay just allows them to ask helix for a grace
>> period incase they need to do some cleanup on their side.
>>
>> The flip side we provide interceptors so they can inject logic before 
>> shutdown.
>>
>>>
>>>>
>>>> Get an Administrator
>>>>
>>>> * HelixAdministrator seems too generic, it should be
>>>> HelixClusterAdministrator given that its an admin for that cluster?
>>>> * We should have a HelixAdministrator which can be an administrator to
>>>> all clusters. How should we retrieve such an administrator?
>>>
>>> I would rather have one administrator that manages all clusters than two 
>>> administrator classes.
>>
>> [Sandeep] I think we should simplify and say 1 admin manages 1
>> cluster, if we say 1 admin manages multiple clusters we then need to
>> identify the cluster in each API when the user adds participant,
>> shutsdown cluster etc. If an admin wants to manage multiple clusters
>> then we should allow disconnect from one cluster and connect to
>> another and manage that one. It just keeps the context clear, if we do
>> 1 to many it just seems confusing.
>>
>>>
>>>> * Having to get a connection first and then getting admin object seems
>>>> inverted. Why not allow
>>>> HelixAdmin.create(HelixProviderURL).getConnection(). It should return
>>>> a specialization of the HelixConnection i.e. HelixAdminConnection. We
>>>> could then hang specific privileges on the HelixAdminConnection and
>>>> not have them on the HelixConnection.
>>>
>>> It's in this direction so that user can share one connection while managing 
>>> different roles. For instance, a single node may be a participant and 
>>> administrator for one cluster, and a spectator for another.
>>
>> [Sandeep] Look at it like this, should a participant have all APIs on
>> a connection that an admin has? But let me take a step back from your
>> examples it seems like a connection can only be used to get an admin
>> and everything else hangs off the admin, could there be other uses of
>> a connection or is it only a means to get an admin handle?
>>
>>>
>>>> * What happens if an administrator is already created? Should we allow
>>>> two administrators?
>>>
>>> I think this is acceptable. I would be more hesitant if a new connection 
>>> was opened for each administrator, but if they're sharing the same 
>>> connection, I don't think it's a big deal.
>>
>> [Sandeep] So what happens if I do this
>> T1 -> Admin 1 using Connection 1 on Thread 1
>> T2 -> Admin 2 using Connection 1 on Thread 2
>> T3 -> Admin 2 closes Connection 1 on Thread 2
>> T3 -> Admin 1 tries to use Connection 1 on Thread 1
>>
>> We would need to track all admin objects and invalidate them when a
>> connection is discarded. Its doable not overly difficult but we are
>> undertaking complexity here.
>>
>>>
>>>>
>>>> Resource Create
>>>>
>>>> * ResourceConfig seems like API
>>>> * Should we go all the way ResourceConfiguration instead of ResourceConfig?
>>>
>>> It means more typing, but I'm fine either way.
>>>
>>>> * ResourceConfigBuilder should have
>>>> ResourceConfig.Builder().withId("string") and the remaining should
>>>> follow like in the example
>>>
>>> Sure. We'll need to validate on build() that an id was provided.
>>>
>>>> * RebalancerConfig seems like API
>>>> * Can we figure RebalancerConfig with types? So we dont have three
>>>> versions we leverage a type to generate/assemble a builder?
>>>
>>> I don't follow, can you give an example?
>>>
>>> We want a RebalancerConfig to be an arbitrary string-serializable class. We 
>>> can provide provide three special RebalancerConfigs for the three built-in 
>>> rebalancing modes. These (and their builders) all derive from 
>>> PartitionedRebalancerConfig.
>>
>> [Sandeep] So I was thinking something like
>> RebalancerConfig.Builder(resourceId).usingStrategy(RebalancerStrategy.FullAuto)....
>>
>> RebalancerStrategy then becomes an enum which internally allows the
>> builder to add a pre-baked recipe for FullAuto strategy with defaults,
>> the defaults can be overridden by the user ofcourse if necessary. But
>> imagine how simple it would be for a majority of new users who could
>> do with baked in recipes while they onboard.
>>
>>>
>>>> * How do we determine if the resource is already added? That should
>>>> dictate the exception.
>>>
>>> A resource is already added if its IdealState is already added.
>>>
>>>>
>>>> I think I will pause and wait for some responses/feedback on how this
>>>> medium of feedback feels before I move forward.
>>>>
>>>> Thanks,
>>>>
>>>> Sandeep
>>>

RE: API Review: Helix Admin - Feedback

Reply via email to