[ 
https://issues.apache.org/jira/browse/STORM-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962001#comment-13962001
 ] 

ASF GitHub Bot commented on STORM-166:
--------------------------------------

Github user revans2 commented on a diff in the pull request:

    https://github.com/apache/incubator-storm/pull/61#discussion_r11352925
  
    --- Diff: storm-core/src/clj/backtype/storm/daemon/nimbus.clj ---
    @@ -894,10 +895,47 @@
       )
     )
     
    +(defn- sync-storm-code-from-leader [nimbus]
    +  (let [conf (:conf nimbus)
    +        storm-cluster-state (:storm-cluster-state nimbus)
    +        storm-ids (.assignments storm-cluster-state nil)
    +        storm-code-map (->> (dofor [sid storm-ids] {sid (.assignment-info storm-cluster-state sid nil)})
    +                            (apply merge)
    +                            (filter-val not-nil?)
    +                            (map-val :master-code-dir)
    +                            )
    +        downloaded-storm-ids (set (map #(java.net.URLDecoder/decode %) (read-dir-contents (master-stormdist-root conf))))
    +        tmproot (str (master-tmp-dir conf) file-path-separator (uuid))]
    +    (doseq [[storm-id master-code-dir] storm-code-map]
    +        (when (not (downloaded-storm-ids storm-id))
    +          (log-message "Downloading code for storm id " storm-id " from " master-code-dir)
    +          
    +          (FileUtils/forceMkdir (File. tmproot))
    +          (Utils/downloadFromMaster conf (master-stormjar-path master-code-dir) (master-stormjar-path tmproot))
    +          (Utils/downloadFromMaster conf (master-stormcode-path master-code-dir) (master-stormcode-path tmproot))
    +          (Utils/downloadFromMaster conf (master-stormconf-path master-code-dir) (master-stormconf-path tmproot))
    +          (FileUtils/moveDirectory (File. tmproot) (File. (master-stormdist-root conf storm-id)))
    +          
    +          (log-message "Finished downloading code for storm id " storm-id " from " master-code-dir)
    +         )
    +     )
    +  )
    +)
    +
     (defserverfn service-handler [conf inimbus]
       (.prepare inimbus conf (master-inimbus-dir conf))
       (log-message "Starting Nimbus with conf " conf)
    -  (let [nimbus (nimbus-data conf inimbus)]
    +  (let [nimbus (nimbus-data conf inimbus)
    +        nimbus-leadership (nimbus-leadership conf)]
    --- End diff --
    
    Because nimbus-leadership opens a new connection to ZK, is it possible for
    the nimbus-leadership connection to be lost (for example, through a
    networking glitch) while the other ZK connection stays up? That could result
    in two Nimbus instances running at the same time.
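
    To make the concern concrete, here is a minimal sketch, not code from the
    pull request. It models each ZK connection as an atom and gates leader-only
    work on the state of the leadership connection. Every name in it
    (make-conn, connected?, may-act-as-leader?, run-if-leader) is hypothetical,
    and a check like this only narrows the window in which two instances act as
    leader; it does not close it.

```clojure
;; Hypothetical sketch: none of these names come from the pull request.
;; Each ZK connection is modeled as an atom holding :connected or :lost.
(defn make-conn [] (atom :connected))

(defn connected? [conn] (= :connected @conn))

;; Leadership is only trusted while the leadership connection is alive.
(defn may-act-as-leader? [leader-conn elected?]
  (and elected? (connected? leader-conn)))

;; Run f only if this instance may still act as leader; otherwise skip it.
(defn run-if-leader [leader-conn elected? f]
  (if (may-act-as-leader? leader-conn elected?)
    (f)
    :skipped))

;; Simulate the glitch: the leadership connection drops while the
;; cluster-state connection stays up.
(def state-conn (make-conn))
(def leader-conn (make-conn))
(reset! leader-conn :lost)
```

    With leader-conn lost, (run-if-leader leader-conn true f) skips f even
    though state-conn is still connected, which is the situation described
    above.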


> Highly available Nimbus
> -----------------------
>
>                 Key: STORM-166
>                 URL: https://issues.apache.org/jira/browse/STORM-166
>             Project: Apache Storm (Incubating)
>          Issue Type: New Feature
>            Reporter: James Xu
>            Priority: Minor
>
> https://github.com/nathanmarz/storm/issues/360
> The goal of this feature is to be able to run multiple Nimbus servers so that 
> if one goes down another one will transparently take over. Here's what needs 
> to happen to implement this:
> 1. Everything currently stored on local disk on Nimbus needs to be stored in 
> a distributed and reliable fashion. A DFS is perfect for this. However, as we 
> do not want to make a DFS a mandatory requirement to run Storm, the storage 
> of these artifacts should be pluggable (default to local filesystem, but the 
> interface should support DFS). You would only be able to run multiple Nimbus
> instances if you use the right storage, and the storage interface chosen
> should have a flag indicating whether it's suitable for HA mode or not. If you
> choose local storage and try to run multiple Nimbus instances, one of them
> should fail to launch.
> 2. Nimbus instances should register themselves in Zookeeper. They should use a leader 
> election protocol to decide which one is currently responsible for launching 
> and monitoring topologies.
> 3. StormSubmitter should find the Nimbus to connect to via Zookeeper. In case 
> the leader changes during submission, it should use a retry protocol to try 
> reconnecting to the new leader and attempting submission again.
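
The storage flag from step 1 could be sketched roughly as follows. This is an illustration under assumptions, not an interface defined by the issue: the protocol CodeStorage, the records LocalStorage and DfsStorage, and the function check-launch are all hypothetical names.

```clojure
;; Hypothetical names throughout; the issue does not define this interface.
(defprotocol CodeStorage
  (ha-suitable? [this] "True if several Nimbus instances can share this storage."))

;; Local filesystem storage: the default, but not shareable across hosts.
(defrecord LocalStorage []
  CodeStorage
  (ha-suitable? [_] false))

;; Distributed filesystem storage: shareable, so usable in HA mode.
(defrecord DfsStorage []
  CodeStorage
  (ha-suitable? [_] true))

;; Refuse to launch an additional Nimbus on storage that is not HA-suitable.
(defn check-launch [storage other-nimbus-count]
  (if (and (pos? other-nimbus-count) (not (ha-suitable? storage)))
    (throw (IllegalStateException. "storage is not suitable for HA mode"))
    :ok))
```

Here a single Nimbus on LocalStorage launches normally, while a second one against LocalStorage fails, matching the behavior the description asks for.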



--
This message was sent by Atlassian JIRA
(v6.2#6252)
