[
https://issues.apache.org/jira/browse/STORM-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14118645#comment-14118645
]
Parth Brahmbhatt commented on STORM-166:
----------------------------------------
I see 2 issues with the current implementation:
1) The code assumes that once acquire() succeeds, nimbus remains the leader for
as long as the process is up. I don't think this is a valid assumption, and I
would like to move to one of the following two models (a sketch of the
listener-based approach follows these two options):
Option 1: Before performing any leader-only operation, the code checks whether
it still holds the leadership lock. This is not a zk call but a check against
internal in-memory state. All tasks that require leadership must finish within
min(zkSessionTimeout, zkConnectionTimeout), as that represents the nimbus
lock's leasing period. We need to implement state listeners for zookeeper that
update this in-memory leadership state based on connection and session state
changes.
Option 2: We don't check for leadership at each step, but still implement
state listeners and simply restart nimbus any time a zk connection/session
loss does not resolve itself within the leasing timeout period.
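A minimal sketch of the listener-based approach, assuming Curator is used for
the lock: a ConnectionStateListener drops an in-memory flag on SUSPENDED/LOST,
and every leader-only code path checks that flag before acting. The class and
method names here are illustrative, not existing Storm APIs.
{code:java}
import java.util.concurrent.atomic.AtomicBoolean;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.framework.state.ConnectionStateListener;

// Illustrative only: tracks leadership in memory based on zk connection events.
public class LeadershipTracker implements ConnectionStateListener {

    private final AtomicBoolean isLeader = new AtomicBoolean(false);

    // Call this right after acquire() succeeds on the nimbus lock.
    public void onLockAcquired() {
        isLeader.set(true);
    }

    @Override
    public void stateChanged(CuratorFramework client, ConnectionState newState) {
        if (newState == ConnectionState.SUSPENDED || newState == ConnectionState.LOST) {
            // The lease can no longer be trusted; drop leadership immediately.
            isLeader.set(false);
        }
        // On RECONNECTED a conservative implementation would re-verify the lock
        // node in zk before setting the flag back to true (option 2 would instead
        // restart nimbus if this does not happen within the leasing timeout).
    }

    // Guard to run at the start of every leader-only operation (scheduling, cleanup, ...).
    public void assertLeadership() {
        if (!isLeader.get()) {
            throw new IllegalStateException("This nimbus is no longer the leader");
        }
    }
}
{code}
The listener would be registered via
client.getConnectionStateListenable().addListener(tracker) on the same
CuratorFramework instance that holds the lock.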
2) As @revans2 pointed out, if nimbus schedules a topology before a backup can
download the code and nimbus then dies, that topology will not be scheduled
after failover. Again, there are multiple options:
Option 1: Use a distributed file system implementation as mentioned in the
jira description.
Option 2: Do not return from submitTopology until "n" backups also have a copy
of the topology (a rough sketch of this wait follows the options). This means
any new nimbus leader contender will only start contending for the leader lock
once it is completely caught up, and any new leader will have to validate that
it is actually caught up before it can accept any leadership task. This option
requires more state management and is harder to test for correctness.
Option 3: Live with the issue and hope that bittorrent will solve this.
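A rough sketch of what option 2 could look like inside submitTopology, assuming
some way to ask the code store how many hosts currently have the topology
artifacts; the Supplier below stands in for that, and none of these names exist
in Storm today.
{code:java}
import java.util.function.Supplier;

// Illustrative only: block topology submission until "n" backups have the code.
public final class ReplicationWait {

    public static void awaitReplication(Supplier<Integer> replicationCount,
                                        int minReplication,
                                        long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (replicationCount.get() < minReplication) {
            if (System.currentTimeMillis() > deadline) {
                throw new IllegalStateException("Timed out waiting for "
                        + minReplication + " backups to download the topology code");
            }
            Thread.sleep(100); // poll until enough backups report the artifacts
        }
    }
}
{code}
submitTopology would call this before returning, and a new leader contender
could use the same count to decide whether it is caught up enough to join the
election.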
My favourite is option 1.
Let me know if there are any other issues/options that need to be covered.
> Highly available Nimbus
> -----------------------
>
> Key: STORM-166
> URL: https://issues.apache.org/jira/browse/STORM-166
> Project: Apache Storm (Incubating)
> Issue Type: New Feature
> Reporter: James Xu
> Assignee: Parth Brahmbhatt
> Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/360
> The goal of this feature is to be able to run multiple Nimbus servers so that
> if one goes down another one will transparently take over. Here's what needs
> to happen to implement this:
> 1. Everything currently stored on local disk on Nimbus needs to be stored in
> a distributed and reliable fashion. A DFS is perfect for this. However, as we
> do not want to make a DFS a mandatory requirement to run Storm, the storage
> of these artifacts should be pluggable (default to local filesystem, but the
> interface should support DFS). You would only be able to run multiple Nimbus
> instances if you use the right storage, and the storage interface chosen
> should have a flag indicating whether it's suitable for HA mode or not. If
> you choose local storage and try to run multiple Nimbus instances, one of
> them should fail to launch.
> 2. Nimbus instances should register themselves in Zookeeper. They should use
> a leader election protocol to decide which one is currently responsible for
> launching and monitoring topologies.
> 3. StormSubmitter should find the Nimbus to connect to via Zookeeper. In case
> the leader changes during submission, it should use a retry protocol to try
> reconnecting to the new leader and attempting submission again.
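For point 1 of the description above, a hedged sketch of what the pluggable
artifact storage might look like; the interface name, methods, and HA flag are
assumptions, not existing Storm code.
{code:java}
import java.io.InputStream;

// Illustrative only: pluggable storage for topology jars, configs, and
// serialized topology objects. A local-filesystem implementation would return
// false from supportsHa(), a DFS-backed one would return true.
public interface TopologyArtifactStore {

    void put(String key, InputStream data) throws Exception;

    InputStream get(String key) throws Exception;

    void delete(String key) throws Exception;

    // Used at startup: if this returns false and another nimbus is already
    // registered, the second nimbus should refuse to launch.
    boolean supportsHa();
}
{code}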
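Point 2 of the description maps fairly directly onto Curator's LeaderLatch; a
minimal sketch, with the znode path and participant id made up for
illustration.
{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Illustrative only: each nimbus registers in zk and competes for leadership.
public class NimbusLeaderElection {

    public static LeaderLatch start(String zkConnect, String nimbusHostPort) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                zkConnect, new ExponentialBackoffRetry(1000, 3));
        client.start();

        // The latch id carries host:port so clients can discover the current leader.
        LeaderLatch latch = new LeaderLatch(client, "/storm/nimbus-leader", nimbusHostPort);
        latch.start();
        return latch;
    }
}
{code}
latch.hasLeadership() (or the LeaderLatchListener callbacks) would then gate
the scheduling loop, and latch.getLeader().getId() exposes the current leader
to clients.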
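And a sketch of the retry protocol from point 3, assuming some hypothetical
way for StormSubmitter to look up the current leader's host:port from
Zookeeper; neither interface below is an existing API.
{code:java}
// Illustrative only: retry submission against whichever nimbus is currently leader.
public final class SubmitWithRetry {

    interface LeaderLookup { String currentLeaderHostPort() throws Exception; }
    interface Submission { void submitTo(String hostPort) throws Exception; }

    public static void submit(LeaderLookup lookup, Submission submission, int maxAttempts)
            throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String leader = lookup.currentLeaderHostPort(); // re-resolve on every attempt
            try {
                submission.submitTo(leader);
                return; // success
            } catch (Exception e) {
                last = e;           // the leader may have changed mid-submission
                Thread.sleep(1000); // back off, then look up the (possibly new) leader
            }
        }
        throw new IllegalStateException("Submission failed after " + maxAttempts + " attempts", last);
    }
}
{code}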