Hi Leonard,

We are actively developing a way for the Helix controller to do exactly
that. Working with YARN, Mesos, EC2, or another provisioner, the Helix
controller will look at your service's constraints on how many machines
should be up and ensure that that many containers are always running for
your service. We are also looking at ways to monitor CPU and memory usage
dynamically to determine how many containers should be running. We have a
working implementation with YARN in the helix-provisioning branch of our
source code. See http://helix.apache.org/sources.html for instructions on
how to access the source code.

Historically, Helix's goal has been to take the machines that are active and
intelligently distribute your service across them. When a service goes down,
Helix makes sure that the cluster remains in a good state. Without
provisioning, Helix can only reallocate your resources; it has no ability to
start or stop containers. With provisioning, Helix can now tell the
provisioner to bring up or take down containers in addition to reassigning.
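
As a rough illustration of the idea only (this is not Helix code and not the
helix-provisioning API), the behavior described above can be sketched as a
small reconcile step followed by a reassignment step. Every name here
(InMemoryProvisioner, reconcile, assign_partitions) is hypothetical, and the
in-memory provisioner merely stands in for YARN, Mesos, or EC2.

```python
from typing import Dict, List, Set


class InMemoryProvisioner:
    """Hypothetical stand-in for YARN, Mesos, or EC2; not a Helix API."""

    def __init__(self) -> None:
        self.running: Set[str] = set()
        self._next_id = 0

    def start_container(self) -> str:
        # Allocate a fresh container id and mark it as running.
        container_id = f"container-{self._next_id}"
        self._next_id += 1
        self.running.add(container_id)
        return container_id

    def stop_container(self, container_id: str) -> None:
        self.running.discard(container_id)


def reconcile(provisioner: InMemoryProvisioner, target_count: int) -> List[str]:
    """Start or stop containers until exactly target_count are running."""
    while len(provisioner.running) < target_count:
        provisioner.start_container()
    while len(provisioner.running) > target_count:
        provisioner.stop_container(sorted(provisioner.running)[-1])
    return sorted(provisioner.running)


def assign_partitions(partitions: List[str], containers: List[str]) -> Dict[str, List[str]]:
    """Spread partitions round-robin over the live containers."""
    if not containers:
        return {}
    assignment: Dict[str, List[str]] = {c: [] for c in containers}
    for i, partition in enumerate(partitions):
        assignment[containers[i % len(containers)]].append(partition)
    return assignment
```

Without a provisioner, only the assign_partitions step is available: a failed
container simply leaves fewer places to put partitions. With one, reconcile
runs first and restores the target count before partitions are reassigned.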

Kanak
________________________________
> Date: Thu, 6 Mar 2014 13:53:15 -0800 
> Subject: Re: Apache Helix 
> From: [email protected] 
> To: [email protected]; [email protected] 
> 
> + user 
> 
> 
> 
> 
> On Thu, Mar 6, 2014 at 5:24 AM, Leonard Kramer <[email protected]> wrote:
> Hi Mr. Kishore, 
> 
> my name is Leonard Kramer and I've found your mail-address in the 
> Apache Helix mailing list. I'm currently evaluating Apache Helix for 
> the cluster-management of ZooKeeper itself. The ultimate goal is to 
> create an autonomous zookeeper-service with its own 
> migration-strategies and self repair functions. 
> 
> For monitoring and reacting to node outages I want to use
> Apache Helix, because it already uses ZooKeeper as its primary
> coordination and data store. On the mailing list you answered a
> thread about the differences between Norbert and Helix and mentioned that
> "Failure: When a node fails, you have multiple options [...] #3 Start a 
> new node and assign the partitions to that new node. [...] #3 feature 
> is work in progress and is possible if the deployment system is 
> flexible and allows starting up process dynamically". 
> While studying Helix I couldn't find any more information regarding 
> this feature. All I could find was the blog post 
> "http://engineering.linkedin.com/cluster-management/auto-scaling-apache-helix-and-apache-yarn",
> which I think is too complex for my specific use-case.
> 
> Can you please provide me with more information regarding the self 
> repair of a helix cluster? Is the helix-controller capable of invoking 
> logic for starting additional nodes? My plan is to use Jclouds in 
> addition to the cloud-controller and start a new node when the 
> helix-controller notices a node's failure. Is this possible? 
> 
> Thank you for your help and have a nice day.
> Greetings from Germany 
> 
> Leo 
> 
                                          
