Parth Brahmbhatt created STORM-654:
--------------------------------------
Summary: Create a thrift API to discover nimbus so all the clients
are not forced to contact zookeeper.
Key: STORM-654
URL: https://issues.apache.org/jira/browse/STORM-654
Project: Apache Storm
Issue Type: Sub-task
Reporter: Parth Brahmbhatt
Assignee: Parth Brahmbhatt
The current implementation of Nimbus HA requires each nimbus client to discover
nimbus hosts by contacting ZooKeeper. To reduce the load on ZooKeeper, we could
expose a Thrift API as described in the future improvement section of the
Nimbus HA design doc.
We will add an extra field called nimbuses to the ClusterSummary structure:
struct ClusterSummary {
  1: required list<SupervisorSummary> supervisors;
  2: required i32 nimbus_uptime_secs;
  3: required list<TopologySummary> topologies;
  4: required list<NimbusSummary> nimbuses;
}
struct NimbusSummary {
  1: required string host;
  2: required i32 port;
  3: required i32 uptimeSecs;
  4: required bool isLeader;
  5: required string version;
  // Needs a better name: the list of storm-ids for which this nimbus host
  // has the code available locally.
  6: optional list<string> local_storm_ids;
}
We will create a nimbus.hosts configuration that serves as the seed list of
nimbus hosts. Any nimbus host can serve read requests, so a client can issue a
getClusterSummary call against any of them and extract the leader nimbus
summary from the list of nimbuses. All nimbus hosts will cache this information
to reduce the load on ZooKeeper.
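The discovery flow described above could look roughly like the following Python
sketch. It uses an in-memory stand-in instead of a real Thrift client. The
dataclasses mirror the proposed structs, while find_leader, fetch_summary,
fake_fetch and the host names and values are hypothetical and only serve as
illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class NimbusSummary:
    # Mirrors the proposed Thrift struct (names adapted to Python style).
    host: str
    port: int
    uptime_secs: int
    is_leader: bool
    version: str
    local_storm_ids: List[str] = field(default_factory=list)


@dataclass
class ClusterSummary:
    # Only the proposed new field is modelled in this sketch.
    nimbuses: List[NimbusSummary]


def find_leader(
    seeds: List[str],
    fetch_summary: Callable[[str], ClusterSummary],
) -> Optional[NimbusSummary]:
    """Ask each seed host in turn and return the leader's summary.

    fetch_summary is a hypothetical stand-in for a getClusterSummary call.
    Returns None if no seed answers or no nimbus is marked as leader.
    """
    for host in seeds:
        try:
            summary = fetch_summary(host)
        except ConnectionError:
            continue  # this seed is unreachable, try the next one
        for nimbus in summary.nimbuses:
            if nimbus.is_leader:
                return nimbus
    return None


# Illustrative two-node cluster; all values are made up for the example.
cluster = ClusterSummary(nimbuses=[
    NimbusSummary("nimbus1.example.com", 6627, 10, False, "x.y.z"),
    NimbusSummary("nimbus2.example.com", 6627, 500, True, "x.y.z"),
])


def fake_fetch(host: str) -> ClusterSummary:
    # Pretend the first seed is down; any other host answers.
    if host == "nimbus1.example.com":
        raise ConnectionError(host)
    return cluster


leader = find_leader(
    ["nimbus1.example.com", "nimbus2.example.com"], fake_fetch
)
print(leader.host)  # prints "nimbus2.example.com"
```

Because any nimbus host can answer the read request, the client only needs one
reachable seed to learn who the leader is.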
In addition, we can add a RedirectException. When a request that can only be
served by the leader nimbus (i.e. submit, kill, rebalance, deactivate, activate)
is issued against a non-leader nimbus, that nimbus will throw a
RedirectException. The client will handle the exception by refreshing its
leader nimbus host and retrying the request against that host.
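A minimal sketch of the client-side retry described above, again in Python with
in-memory stand-ins. RedirectException here is a plain Python class standing in
for the proposed Thrift exception. call_with_redirect, submit, the max_attempts
limit and the host names are assumptions made for the example, not part of the
proposal.

```python
class RedirectException(Exception):
    """Stand-in for the proposed exception thrown by a non-leader nimbus."""


def call_with_redirect(operation, cached_leader, discover_leader, max_attempts=3):
    """Run operation(host) against the leader nimbus.

    Starts with the client's cached leader host. On RedirectException the
    leader is refreshed via discover_leader() and the call is retried,
    up to max_attempts calls in total (the limit is an assumption).
    """
    host = cached_leader
    for _ in range(max_attempts):
        try:
            return operation(host)
        except RedirectException:
            host = discover_leader()  # refresh the leader, then retry
    raise RuntimeError("no leader nimbus accepted the request")


# Illustrative cluster state: the client's cached leader is stale.
state = {"leader": "nimbus2.example.com"}


def submit(host):
    # Hypothetical leader-only operation.
    if host != state["leader"]:
        raise RedirectException(host + " is not the leader")
    return "submitted via " + host


result = call_with_redirect(
    submit, "nimbus1.example.com", lambda: state["leader"]
)
print(result)  # prints "submitted via nimbus2.example.com"
```

Bounding the number of attempts keeps a client from looping forever if
leadership keeps changing while it retries.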