Parth Brahmbhatt created STORM-654:
--------------------------------------

             Summary: Create a thrift API to discover nimbus so all the clients are not forced to contact zookeeper.
                 Key: STORM-654
                 URL: https://issues.apache.org/jira/browse/STORM-654
             Project: Apache Storm
          Issue Type: Sub-task
            Reporter: Parth Brahmbhatt
            Assignee: Parth Brahmbhatt


The current implementation of Nimbus HA requires each nimbus client to 
discover nimbus hosts by contacting zookeeper. To reduce the load on zookeeper, 
we could expose a thrift API, as described in the future improvements section 
of the Nimbus HA design doc. 

We will add an extra field called nimbuses to the ClusterSummary structure.

struct ClusterSummary {
  1: required list<SupervisorSummary> supervisors;
  2: required i32 nimbus_uptime_secs;
  3: required list<TopologySummary> topologies;
  4: required list<NimbusSummary> nimbuses;
}

struct NimbusSummary {
    1: required string host;
    2: required i32 port;
    3: required i32 uptimeSecs;
    4: required bool isLeader;
    5: required string version;
    // Needs a better name: the list of storm-ids for which this nimbus host
    // has the code available locally.
    6: optional list<string> local_storm_ids;
}

We will create a nimbus.hosts configuration that serves as the seed list of 
nimbus hosts. Any nimbus host can serve read requests, so a client can issue a 
getClusterSummary call against any of them and extract the leader nimbus 
summary from the list of nimbuses. All nimbus hosts will cache this information 
to reduce the load on zookeeper. 
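
The leader lookup described above could be sketched on the client side roughly 
as follows. This is only an illustration under stated assumptions, not the 
actual Storm client: the NimbusSummary class is a hypothetical, reduced mirror 
of the proposed struct, and fetchNimbuses is a hypothetical stand-in for a 
getClusterSummary call against one seed host from nimbus.hosts.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class LeaderDiscoverySketch {

    /** Hypothetical mirror of the proposed NimbusSummary struct (subset of fields). */
    static final class NimbusSummary {
        final String host;
        final int port;
        final boolean isLeader;

        NimbusSummary(String host, int port, boolean isLeader) {
            this.host = host;
            this.port = port;
            this.isLeader = isLeader;
        }
    }

    /**
     * Walks the seed list in order and returns the first leader reported by
     * any reachable seed. fetchNimbuses is an assumed placeholder for
     * "call getClusterSummary on this host and return its nimbuses list".
     */
    static Optional<NimbusSummary> findLeader(
            List<String> seedHosts,
            Function<String, List<NimbusSummary>> fetchNimbuses) {
        for (String seed : seedHosts) {
            List<NimbusSummary> nimbuses;
            try {
                nimbuses = fetchNimbuses.apply(seed);
            } catch (RuntimeException e) {
                continue; // this seed is unreachable; try the next one
            }
            for (NimbusSummary n : nimbuses) {
                if (n.isLeader) {
                    return Optional.of(n);
                }
            }
        }
        return Optional.empty(); // no seed reported a leader
    }

    public static void main(String[] args) {
        // Example values only: two nimbus hosts, the second one is the leader.
        List<NimbusSummary> clusterView = Arrays.asList(
                new NimbusSummary("nimbus1", 6627, false),
                new NimbusSummary("nimbus2", 6627, true));
        Optional<NimbusSummary> leader =
                findLeader(Arrays.asList("nimbus1", "nimbus2"), seed -> clusterView);
        System.out.println(leader.map(n -> n.host + ":" + n.port).orElse("no leader"));
    }
}
```

Because every nimbus host answers the read from its cache, the client needs 
only one reachable seed to learn the leader and never talks to zookeeper. 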

In addition, we can add a RedirectException. When a request that can only be 
served by the leader nimbus (i.e. submit, kill, rebalance, deactivate, 
activate) is issued against a non-leader nimbus, that nimbus will throw a 
RedirectException. The client will handle the exception by refreshing its 
leader nimbus host and contacting that host as part of a retry. 
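
A minimal sketch of how a client might handle the redirect is shown here. The 
proposal does not specify the exception's fields or a retry limit, so the 
RedirectException class, the callLeader helper, the refreshLeader supplier, and 
the maxAttempts parameter are all hypothetical names chosen for illustration.

```java
import java.util.function.Function;
import java.util.function.Supplier;

final class RedirectRetrySketch {

    /** Hypothetical stand-in for the proposed RedirectException. */
    static final class RedirectException extends RuntimeException {
        RedirectException(String message) {
            super(message);
        }
    }

    /**
     * Sends a leader-only request to the cached leader host. If that host
     * answers with RedirectException, refreshes the leader and retries,
     * up to maxAttempts requests in total.
     */
    static <T> T callLeader(
            String cachedLeader,
            Supplier<String> refreshLeader,
            Function<String, T> request,
            int maxAttempts) {
        String leader = cachedLeader;
        RedirectException last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return request.apply(leader);
            } catch (RedirectException e) {
                last = e;
                leader = refreshLeader.get(); // look up the current leader again
            }
        }
        throw new IllegalStateException(
                "no leader accepted the request after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        // Example values only: the cached host "nimbus1" has lost leadership to "nimbus2".
        String result = callLeader(
                "nimbus1",
                () -> "nimbus2",
                host -> {
                    if (!host.equals("nimbus2")) {
                        throw new RedirectException("not the leader: " + host);
                    }
                    return "submitted via " + host;
                },
                3);
        System.out.println(result);
    }
}
```

Bounding the number of attempts keeps the client from looping indefinitely 
while a leader election is still in progress. 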



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
