[
https://issues.apache.org/jira/browse/STORM-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962001#comment-13962001
]
ASF GitHub Bot commented on STORM-166:
--------------------------------------
Github user revans2 commented on a diff in the pull request:
https://github.com/apache/incubator-storm/pull/61#discussion_r11352925
--- Diff: storm-core/src/clj/backtype/storm/daemon/nimbus.clj ---
@@ -894,10 +895,47 @@
)
)
+(defn- sync-storm-code-from-leader [nimbus]
+ (let [conf (:conf nimbus)
+ storm-cluster-state (:storm-cluster-state nimbus)
+ storm-ids (.assignments storm-cluster-state nil)
+ storm-code-map (->> (dofor [sid storm-ids] {sid (.assignment-info storm-cluster-state sid nil)})
+ (apply merge)
+ (filter-val not-nil?)
+ (map-val :master-code-dir)
+ )
+ downloaded-storm-ids (set (map #(java.net.URLDecoder/decode %) (read-dir-contents (master-stormdist-root conf))))
+ tmproot (str (master-tmp-dir conf) file-path-separator (uuid))]
+ (doseq [[storm-id master-code-dir] storm-code-map]
+ (when (not (downloaded-storm-ids storm-id))
+ (log-message "Downloading code for storm id " storm-id " from " master-code-dir)
+
+ (FileUtils/forceMkdir (File. tmproot))
+ (Utils/downloadFromMaster conf (master-stormjar-path master-code-dir) (master-stormjar-path tmproot))
+ (Utils/downloadFromMaster conf (master-stormcode-path master-code-dir) (master-stormcode-path tmproot))
+ (Utils/downloadFromMaster conf (master-stormconf-path master-code-dir) (master-stormconf-path tmproot))
+ (FileUtils/moveDirectory (File. tmproot) (File. (master-stormdist-root conf storm-id)))
+
+ (log-message "Finished downloading code for storm id " storm-id " from " master-code-dir)
+ )
+ )
+ )
+)
+
(defserverfn service-handler [conf inimbus]
(.prepare inimbus conf (master-inimbus-dir conf))
(log-message "Starting Nimbus with conf " conf)
- (let [nimbus (nimbus-data conf inimbus)]
+ (let [nimbus (nimbus-data conf inimbus)
+ nimbus-leadership (nimbus-leadership conf)]
--- End diff --
Because nimbus-leadership opens a new connection to ZK, is there ever a
possibility that the nimbus-leadership connection will be lost (e.g. a
networking glitch) while the other ZK connection is not? This could result in
two nimbus instances both being active at the same time.
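
To make the concern concrete, here is a minimal sketch of one possible
mitigation: treat loss of the election connection as loss of leadership, and
gate every leader-only action on that flag, regardless of the state of the
other ZK connection. The class and method names (LeadershipGuard,
onElectionConnectionLost, runIfLeader) are hypothetical and are not part of
this pull request or of Storm. It also does not close the race between the
check and the action, so it only narrows the window rather than removing it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical guard; an illustration only, not Storm or pull-request code. */
class LeadershipGuard {
    // True only while the leader-election connection is known to be healthy
    // and this instance currently holds leadership.
    private final AtomicBoolean leader = new AtomicBoolean(false);

    /** To be called when the election mechanism reports that this instance won. */
    void onLeadershipAcquired() {
        leader.set(true);
    }

    /** To be called when the election connection is suspended or lost. */
    void onElectionConnectionLost() {
        leader.set(false);
    }

    boolean isLeader() {
        return leader.get();
    }

    /** Runs the action only if this instance is currently leader; returns whether it ran. */
    boolean runIfLeader(Runnable action) {
        if (!leader.get()) {
            return false;
        }
        action.run();
        return true;
    }
}
```

With something like this, assignment and monitoring work would go through
runIfLeader, so an instance that lost only its election connection would stop
acting even though its cluster-state connection is still alive.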
> Highly available Nimbus
> -----------------------
>
> Key: STORM-166
> URL: https://issues.apache.org/jira/browse/STORM-166
> Project: Apache Storm (Incubating)
> Issue Type: New Feature
> Reporter: James Xu
> Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/360
> The goal of this feature is to be able to run multiple Nimbus servers so that
> if one goes down another one will transparently take over. Here's what needs
> to happen to implement this:
> 1. Everything currently stored on local disk on Nimbus needs to be stored in
> a distributed and reliable fashion. A DFS is perfect for this. However, as we
> do not want to make a DFS a mandatory requirement to run Storm, the storage
> of these artifacts should be pluggable (default to local filesystem, but the
> interface should support DFS). You would only be able to run multiple Nimbus
> instances if you use the right storage, and the storage interface chosen should
> have a flag indicating whether it's suitable for HA mode or not. If you choose
> local storage and try to run multiple Nimbus instances, one of them should fail
> to launch.
> 2. Nimbus instances should register themselves in Zookeeper. They should use a leader
> election protocol to decide which one is currently responsible for launching
> and monitoring topologies.
> 3. StormSubmitter should find the Nimbus to connect to via Zookeeper. In case
> the leader changes during submission, it should use a retry protocol to try
> reconnecting to the new leader and attempting submission again.
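
Point 1 of the description above could be sketched roughly as follows: a
pluggable storage interface that carries an HA-suitability flag, plus a launch
check that refuses to start a second Nimbus on storage that is not shared. All
names here (CodeStore, supportsHa, NimbusLaunchCheck) are assumptions made for
illustration and do not come from Storm or from the pull request. How the
count of already registered instances is obtained (presumably from Zookeeper)
is left out.

```java
/** Hypothetical storage abstraction; names are illustrative, not Storm API. */
interface CodeStore {
    /** True if every Nimbus instance can read what any other instance wrote. */
    boolean supportsHa();
}

/** Default storage: the local filesystem, visible to one machine only. */
class LocalFsCodeStore implements CodeStore {
    public boolean supportsHa() {
        return false;
    }
}

/** Storage backed by a distributed filesystem shared by all instances. */
class DfsCodeStore implements CodeStore {
    public boolean supportsHa() {
        return true;
    }
}

class NimbusLaunchCheck {
    /**
     * Fails the launch when another Nimbus is already registered and the
     * configured storage is not suitable for HA mode.
     *
     * @param store           the configured storage implementation
     * @param otherRegistered number of other Nimbus instances already registered
     */
    static void checkCanLaunch(CodeStore store, int otherRegistered) {
        if (otherRegistered > 0 && !store.supportsHa()) {
            throw new IllegalStateException(
                "Storage is not suitable for HA mode; refusing to start a second Nimbus");
        }
    }
}
```

A single Nimbus on local storage still launches; only an additional instance
on non-HA storage is rejected.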