Yuzhao Chen created STORM-3112:
----------------------------------
Summary: Incremental scheduling support
Key: STORM-3112
URL: https://issues.apache.org/jira/browse/STORM-3112
Project: Apache Storm
Issue Type: Improvement
Components: storm-server
Affects Versions: 2.0.0
Reporter: Yuzhao Chen
Assignee: Yuzhao Chen
Fix For: 2.0.0
As described in https://issues.apache.org/jira/browse/STORM-3093, each
scheduling round currently performs a complete scan and computation over all
the topologies on the cluster, which becomes very expensive once the number of
topologies grows into the hundreds.
So this JIRA refactors the scheduling logic so that it only considers the
topologies that actually need scheduling.
Proposed improvements:
1. Cache the id -> storm base mapping, which reduces the pressure on ZooKeeper.
2. Only schedule the topologies that need it: those with dead executors or not
enough running workers.
3. Some schedulers, e.g. IsolationScheduler, still need a full scheduling round.
4. Cache the scheduling resources across scheduling rounds, i.e. nodeId
-> used slots, nodeId -> used resources, nodeId -> totalResource.
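Items 2 and 3 can be illustrated with a minimal sketch. It is not the Storm
implementation: TopologyState, needsScheduling and select are hypothetical
names invented for this example, and the per-topology counters are assumed to
come from the cached storm bases and heartbeats.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch, not a Storm class.
final class NeedsSchedulingSketch {

    // Hypothetical per-topology snapshot built from cached state.
    static final class TopologyState {
        final String id;
        final int requestedWorkers;
        final int runningWorkers;
        final int deadExecutors;

        TopologyState(String id, int requestedWorkers, int runningWorkers, int deadExecutors) {
            this.id = id;
            this.requestedWorkers = requestedWorkers;
            this.runningWorkers = runningWorkers;
            this.deadExecutors = deadExecutors;
        }
    }

    // Item 2: a topology needs scheduling if it has dead executors
    // or fewer running workers than it asked for.
    static boolean needsScheduling(TopologyState t) {
        return t.deadExecutors > 0 || t.runningWorkers < t.requestedWorkers;
    }

    // Item 3: when the scheduler requires a full round (assumed to be
    // signalled by a flag here), every topology is selected.
    static List<String> select(Collection<TopologyState> all, boolean fullSchedulingRequired) {
        List<String> selected = new ArrayList<>();
        for (TopologyState t : all) {
            if (fullSchedulingRequired || needsScheduling(t)) {
                selected.add(t.id);
            }
        }
        return selected;
    }
}
```

With an incremental scheduler only the unhealthy topologies are returned; with
the flag set the result is the same as a full scan.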
Because I already cache the storm-id -> executors mapping in
https://issues.apache.org/jira/browse/STORM-3093, a scheduling round will now
do the following:
1. Scan all the active storm bases (cached) and the local
storm-conf/storm-topology, then refresh the heartbeats cache, which tells us
which topologies need scheduling.
2. Compute scheduleAssignment only for the topologies that need scheduling.
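The round described above could look roughly like the following sketch. It
assumes a simplified model in which an assignment is just a nodeId -> slot
count map; IncrementalRoundSketch, runRound and the scheduler function are
hypothetical and do not correspond to Storm classes. Topologies outside the
selected set keep their cached assignment, and the nodeId -> used slots cache
is adjusted incrementally rather than rebuilt.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch, not a Storm class.
final class IncrementalRoundSketch {
    // topologyId -> (nodeId -> slots used by that topology); kept across rounds.
    private final Map<String, Map<String, Integer>> assignments = new HashMap<>();
    // nodeId -> total used slots; kept across rounds and updated incrementally.
    private final Map<String, Integer> usedSlotsByNode = new HashMap<>();

    // Reschedules only the given topologies; all others are left untouched.
    void runRound(Set<String> needScheduling,
                  Function<String, Map<String, Integer>> scheduler) {
        for (String topologyId : needScheduling) {
            // Give back the slots held by the previous assignment, if any.
            Map<String, Integer> old = assignments.get(topologyId);
            if (old != null) {
                old.forEach((node, n) -> usedSlotsByNode.merge(node, -n, Integer::sum));
            }
            // Compute and record the new assignment for this topology only.
            Map<String, Integer> fresh = new HashMap<>(scheduler.apply(topologyId));
            fresh.forEach((node, n) -> usedSlotsByNode.merge(node, n, Integer::sum));
            assignments.put(topologyId, fresh);
        }
    }

    int usedSlots(String nodeId) {
        return usedSlotsByNode.getOrDefault(nodeId, 0);
    }

    Map<String, Integer> assignmentOf(String topologyId) {
        return assignments.getOrDefault(topologyId, new HashMap<>());
    }
}
```

The cost of a round in this model is proportional to the number of topologies
that need scheduling, not to the number of topologies on the cluster.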
Robustness when nimbus restarts:
1. The cached storm bases are taken care of by ILocalAssignmentsBackend.
2. The scheduling cache will be refreshed on the first scheduling round through
a full scheduling of all topologies.
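A minimal sketch of point 2, assuming the scheduling cache lives only in
memory and is therefore empty after a restart. RestartAwareSelectorSketch and
its cacheWarm flag are hypothetical names for this example, not part of Storm.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not a Storm class.
final class RestartAwareSelectorSketch {
    // False after every (re)start, because the in-memory cache is empty.
    private boolean cacheWarm = false;

    // Returns the topologies to schedule in this round.
    Set<String> select(Set<String> allActive, Set<String> needScheduling) {
        if (!cacheWarm) {
            // First round after a restart: schedule everything so the
            // scheduling cache is rebuilt from a full pass.
            cacheWarm = true;
            return new TreeSet<>(allActive);
        }
        // Later rounds: only active topologies that need scheduling.
        Set<String> selected = new TreeSet<>(needScheduling);
        selected.retainAll(allActive);
        return selected;
    }
}
```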