[ 
https://issues.apache.org/jira/browse/STORM-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959739#comment-13959739
 ] 

ASF GitHub Bot commented on STORM-166:
--------------------------------------

GitHub user yveschina opened a pull request:

    https://github.com/apache/incubator-storm/pull/61

    nimbus ha solution for issue STORM-166

    The Nimbus HA feature is quite important for our application running on the 
storm cluster. We have been working on the problem for some time, and we now 
have a solution that is not perfect but is good enough to use.
    
    1. Nimbus servers can now register themselves in Zookeeper. They perform a 
leader election, using "InterProcessMutex" to interact with Zookeeper, to 
ensure that only one Nimbus is responsible for launching and monitoring 
topologies.
    
    2. Every Nimbus server runs a timer that checks for topology code that 
does not exist on its local disk. It downloads the locally missing topology 
code from the Nimbus leader through Thrift RPC, just as Supervisors do. With 
this feature, any number of Nimbus servers can be launched throughout the 
cluster.
    
    3. StormSubmitter, Supervisor, non-leader Nimbus and Storm UI are now able 
to find and connect to the Nimbus leader via Zookeeper. A Nimbus leadership 
table has also been added to the Storm UI main page to show each Nimbus's 
leader-election state and its host.
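
As a rough illustration of the periodic sync in item 2, the sketch below computes which topology ids the leader has that the local node lacks, then fetches them. It is not the code from the pull request: the names `missing_topology_ids`, `sync_once` and the `download` callable are hypothetical, and a plain dict stands in for both the local disk and the Thrift download call.

```python
from typing import Callable, Dict, Iterable, List, Set


def missing_topology_ids(leader_ids: Iterable[str], local_ids: Iterable[str]) -> Set[str]:
    """Return ids known to the leader that are absent locally."""
    return set(leader_ids) - set(local_ids)


def sync_once(
    leader_ids: Iterable[str],
    local_store: Dict[str, bytes],
    download: Callable[[str], bytes],
) -> List[str]:
    """One timer tick: download every locally missing topology.

    `download` is a stand-in for the RPC to the Nimbus leader (assumption).
    """
    fetched = []
    for topo_id in sorted(missing_topology_ids(leader_ids, local_store)):
        local_store[topo_id] = download(topo_id)
        fetched.append(topo_id)
    return fetched


# Toy data: the leader holds two topologies, the local node only one.
leader_code = {"wordcount-1": b"jar-a", "clicks-2": b"jar-b"}
local_code = {"wordcount-1": b"jar-a"}
fetched = sync_once(leader_code.keys(), local_code, leader_code.__getitem__)
```

Running `sync_once` a second time with the same inputs fetches nothing, which is the property a repeating timer relies on.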
    
    PS: Parts of the Nimbus election implementation are based on @Frostman's 
solution (link: https://github.com/nathanmarz/storm/pull/422).
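
The mutex-based election in item 1 can be pictured with a toy, single-process model. This is only a sketch under stated assumptions: `threading.Lock` stands in for the Zookeeper-backed "InterProcessMutex", and the `Candidate` class and host names are hypothetical, not taken from the pull request. Whoever holds the lock is the leader; everyone else stays a standby until the lock is released.

```python
import threading


class Candidate:
    """A Nimbus-like node competing for a single shared lock (toy model)."""

    def __init__(self, host: str, election_lock: threading.Lock):
        self.host = host
        self._lock = election_lock
        self.is_leader = False

    def try_lead(self) -> bool:
        # Non-blocking attempt: only one candidate can hold the lock at a time.
        if not self.is_leader:
            self.is_leader = self._lock.acquire(blocking=False)
        return self.is_leader

    def step_down(self) -> None:
        # Models the lock being given up, e.g. on shutdown (assumption).
        if self.is_leader:
            self.is_leader = False
            self._lock.release()


lock = threading.Lock()  # stand-in for the distributed mutex
a = Candidate("nimbus-a", lock)
b = Candidate("nimbus-b", lock)

first = a.try_lead()     # nimbus-a takes the lock and becomes leader
second = b.try_lead()    # nimbus-b cannot take it and stays standby
a.step_down()            # leader releases the lock
takeover = b.try_lead()  # nimbus-b can now become leader
```

A real deployment would need a lock that is visible across machines and released when its holder dies, which an in-process lock cannot model.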


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yveschina/incubator-storm master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-storm/pull/61.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #61
    
----
commit 4f12373e77e925b82e1903bd2eda31176871834d
Author: YuFeng <[email protected]>
Date:   2014-04-04T07:03:56Z

    nimbus ha solution for issue STORM-166

----


> Highly available Nimbus
> -----------------------
>
>                 Key: STORM-166
>                 URL: https://issues.apache.org/jira/browse/STORM-166
>             Project: Apache Storm (Incubating)
>          Issue Type: New Feature
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/360
> The goal of this feature is to be able to run multiple Nimbus servers so that 
> if one goes down another one will transparently take over. Here's what needs 
> to happen to implement this:
> 1. Everything currently stored on local disk on Nimbus needs to be stored in 
> a distributed and reliable fashion. A DFS is perfect for this. However, as we 
> do not want to make a DFS a mandatory requirement to run Storm, the storage 
> of these artifacts should be pluggable (default to local filesystem, but the 
> interface should support DFS). You would only be able to run multiple Nimbus 
> if you use the right storage, and the storage interface chosen should have a 
> flag indicating whether it's suitable for HA mode or not. If you choose local 
> storage and try to run multiple Nimbus, one of the Nimbus's should fail to 
> launch.
> 2. Nimbus's should register themselves in Zookeeper. They should use a leader 
> election protocol to decide which one is currently responsible for launching 
> and monitoring topologies.
> 3. StormSubmitter should find the Nimbus to connect to via Zookeeper. In case 
> the leader changes during submission, it should use a retry protocol to try 
> reconnecting to the new leader and attempting submission again.
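
The retry protocol in item 3 of the issue could look roughly like the sketch below. It is a minimal illustration, not Storm's API: `submit_with_retry`, `lookup_leader`, `submit` and `NotLeaderError` are hypothetical names, and the leader lookup (which would read from Zookeeper) is passed in as a plain callable. The key point is that the leader is looked up again before every attempt.

```python
import time


class NotLeaderError(Exception):
    """Raised by a node that is asked to accept a submission but is not leader."""


def submit_with_retry(lookup_leader, submit, topology, attempts=3, delay_s=0.0):
    """Submit `topology`, re-resolving the leader before each attempt."""
    last_error = None
    for _ in range(attempts):
        leader = lookup_leader()  # may return a new leader after failover
        try:
            return submit(leader, topology)
        except (NotLeaderError, ConnectionError) as err:
            last_error = err
            time.sleep(delay_s)  # fixed pause between attempts (assumption)
    raise RuntimeError("submission failed after %d attempts" % attempts) from last_error


# Toy scenario: the first lookup returns a dead node, the second the new leader.
leaders = iter(["nimbus-a", "nimbus-b"])


def fake_submit(leader, topology):
    if leader == "nimbus-a":
        raise ConnectionError("nimbus-a is down")
    return "%s accepted %s" % (leader, topology)


result = submit_with_retry(lambda: next(leaders), fake_submit, "wordcount")
```

If every attempt fails, the function raises `RuntimeError` with the last underlying error attached, so the caller can see why submission gave up.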



--
This message was sent by Atlassian JIRA
(v6.2#6252)
