[
https://issues.apache.org/jira/browse/STORM-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13965453#comment-13965453
]
ASF GitHub Bot commented on STORM-166:
--------------------------------------
Github user ptgoetz commented on the pull request:
https://github.com/apache/incubator-storm/pull/61#issuecomment-40100129
As @revans2 alluded to, some of the challenges with code distribution
(supervisors downloading from Nimbus) will be alleviated by using BitTorrent for
topology distribution.
I have a branch that switches to using BitTorrent for code distribution,
but I've held off on submitting a pull request because there's a bug with
multi-lang topologies that I haven't had time to track down yet (the resource
directory gets deleted).
I'll submit the pull request for reference with the caveat that it
shouldn't be merged until that bug is fixed.
Here's the original pull request:
https://github.com/nathanmarz/storm/pull/629
> Highly available Nimbus
> -----------------------
>
> Key: STORM-166
> URL: https://issues.apache.org/jira/browse/STORM-166
> Project: Apache Storm (Incubating)
> Issue Type: New Feature
> Reporter: James Xu
> Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/360
> The goal of this feature is to be able to run multiple Nimbus servers so that
> if one goes down another one will transparently take over. Here's what needs
> to happen to implement this:
> 1. Everything currently stored on local disk on Nimbus needs to be stored in
> a distributed and reliable fashion. A DFS is perfect for this. However, as we
> do not want to make a DFS a mandatory requirement to run Storm, the storage
> of these artifacts should be pluggable (default to local filesystem, but the
> interface should support DFS). You would only be able to run multiple Nimbus
> instances if you use the right storage, and the storage interface chosen should
> have a flag indicating whether it's suitable for HA mode or not. If you choose
> local storage and try to run multiple Nimbus instances, one of them should fail
> to launch.
> 2. Nimbus instances should register themselves in Zookeeper. They should use a leader
> election protocol to decide which one is currently responsible for launching
> and monitoring topologies.
> 3. StormSubmitter should find the Nimbus to connect to via Zookeeper. In case
> the leader changes during submission, it should use a retry protocol to try
> reconnecting to the new leader and attempting submission again.
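For reference, items 1 and 3 above could be sketched roughly as follows. This is only an illustration of the idea, not code from Storm or from either pull request: the names CodeStore, supportsHa, checkLaunch and submitWithRetry are all hypothetical, the Zookeeper leader lookup is stood in for by a plain Supplier, and the retry loop has no backoff.

```java
import java.util.function.Function;
import java.util.function.Supplier;

public class NimbusHaSketch {

    /** Hypothetical pluggable storage for the artifacts Nimbus keeps on disk (item 1). */
    public interface CodeStore {
        /** The "suitable for HA mode" flag: true only if all Nimbus instances share the data. */
        boolean supportsHa();
    }

    /** Default storage: local filesystem, visible to a single Nimbus only. */
    public static final class LocalFsStore implements CodeStore {
        public boolean supportsHa() { return false; }
    }

    /** Storage backed by a distributed filesystem (details omitted). */
    public static final class DfsStore implements CodeStore {
        public boolean supportsHa() { return true; }
    }

    /**
     * Launch check: if another Nimbus is already registered and the configured
     * storage is not HA-capable, this instance must fail to launch.
     */
    public static void checkLaunch(CodeStore store, int alreadyRegistered) {
        if (alreadyRegistered > 0 && !store.supportsHa()) {
            throw new IllegalStateException(
                "storage is not suitable for HA mode and another Nimbus is already registered");
        }
    }

    /**
     * Submission retry (item 3): look up the current leader before every attempt,
     * so a leader change between attempts is picked up automatically.
     * leaderLookup is an assumed stand-in for reading the leader from Zookeeper.
     */
    public static <T> T submitWithRetry(Supplier<String> leaderLookup,
                                        Function<String, T> submit,
                                        int maxAttempts) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String leader = leaderLookup.get();
            try {
                return submit.apply(leader);
            } catch (RuntimeException e) {
                last = e; // leader may have changed; resolve it again and retry
            }
        }
        throw new IllegalStateException(
            "submission failed after " + maxAttempts + " attempts", last);
    }
}
```

Resolving the leader inside the loop rather than once up front is the point of the retry protocol: a submission that fails because the old leader died gets pointed at the new one on the next attempt.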