Yuzhao Chen created STORM-3112:
----------------------------------

             Summary: Incremental scheduling supports
                 Key: STORM-3112
                 URL: https://issues.apache.org/jira/browse/STORM-3112
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-server
    Affects Versions: 2.0.0
            Reporter: Yuzhao Chen
            Assignee: Yuzhao Chen
             Fix For: 2.0.0


As described in https://issues.apache.org/jira/browse/STORM-3093, each 
scheduling round currently does a complete scan and recomputation for all the 
topologies on the cluster, which becomes very heavy once the number of 
topologies grows into the hundreds.

So this JIRA refactors the scheduling logic so that it only handles the 
topologies that actually need scheduling.

Improvements:
1. Cache the id -> storm base mapping, which reduces the pressure on ZooKeeper.
2. Only schedule the topologies that need it: those with dead executors or not 
enough running workers.
3. Some schedulers still need a full scheduling pass, e.g. IsolationScheduler.
4. Cache the scheduling resources across scheduling rounds, e.g. nodeId -> used 
slots, nodeId -> used resources, nodeId -> totalResource.
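To illustrate items 2 and 3, here is a minimal sketch of the selection rule. 
It is not the actual Nimbus code: TopologyStatus, needsScheduling and 
selectForScheduling are hypothetical names, and the flag 
fullSchedulingRequired only stands in for schedulers such as 
IsolationScheduler that still need every topology.

```java
import java.util.ArrayList;
import java.util.List;

class TopologySelectionSketch {

    /** Hypothetical per-topology snapshot; not a Storm class. */
    static final class TopologyStatus {
        final String id;
        final int requestedWorkers;
        final int runningWorkers;
        final int deadExecutors;

        TopologyStatus(String id, int requestedWorkers, int runningWorkers, int deadExecutors) {
            this.id = id;
            this.requestedWorkers = requestedWorkers;
            this.runningWorkers = runningWorkers;
            this.deadExecutors = deadExecutors;
        }
    }

    /** A topology needs scheduling if it has dead executors or too few running workers. */
    static boolean needsScheduling(TopologyStatus t) {
        return t.deadExecutors > 0 || t.runningWorkers < t.requestedWorkers;
    }

    /** Returns the ids to schedule this round; everything if a full pass is required. */
    static List<String> selectForScheduling(List<TopologyStatus> all, boolean fullSchedulingRequired) {
        List<String> selected = new ArrayList<>();
        for (TopologyStatus t : all) {
            if (fullSchedulingRequired || needsScheduling(t)) {
                selected.add(t.id);
            }
        }
        return selected;
    }
}
```

With a healthy topology in the input, selectForScheduling skips it unless 
fullSchedulingRequired is true.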

Because I already cache the storm-id -> executors mapping in 
https://issues.apache.org/jira/browse/STORM-3093, a scheduling round now does 
the following:
1. Scan all the active storm bases (cached) and the local 
storm-conf/storm-topology, then refresh the heartbeats cache, which tells us 
which topologies need scheduling.
2. Compute scheduleAssignment only for the topologies that need scheduling.
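The two steps of a round can be sketched roughly as follows. This is a 
simplified illustration, not the real implementation: SchedulingRoundSketch 
and its methods are hypothetical, the cached storm bases are reduced to a 
requested worker count per topology, heartbeats are reduced to an alive 
worker count, and dead executors are ignored for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

class SchedulingRoundSketch {
    // Stand-in for the cached storm bases: topology id -> requested workers.
    private final Map<String, Integer> requestedWorkers = new HashMap<>();
    // Stand-in for the heartbeats cache: topology id -> alive workers.
    private final Map<String, Integer> aliveWorkers = new HashMap<>();

    void registerTopology(String topologyId, int workers) {
        requestedWorkers.put(topologyId, workers);
    }

    /** Step 1: refresh the heartbeats cache and find topologies short of workers. */
    Set<String> refreshAndSelect(Map<String, Integer> heartbeats) {
        aliveWorkers.clear();
        aliveWorkers.putAll(heartbeats);
        Set<String> selected = new TreeSet<>();
        for (Map.Entry<String, Integer> e : requestedWorkers.entrySet()) {
            int alive = aliveWorkers.getOrDefault(e.getKey(), 0);
            if (alive < e.getValue()) {
                selected.add(e.getKey());
            }
        }
        return selected;
    }

    /** Step 2: compute assignments only for the selected topologies. */
    <A> Map<String, A> runRound(Map<String, Integer> heartbeats,
                                Function<String, A> computeAssignment) {
        Map<String, A> assignments = new HashMap<>();
        for (String id : refreshAndSelect(heartbeats)) {
            assignments.put(id, computeAssignment.apply(id));
        }
        return assignments;
    }
}
```

The computeAssignment function is only invoked for topologies selected in 
step 1, so the cost of a round follows the number of topologies that need 
work rather than the cluster size.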

About robustness when nimbus restarts:
1. The cached storm-bases are taken care of by ILocalAssignmentsBackend.
2. The scheduling cache will be refreshed on the first scheduling round, which 
schedules all topologies.
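A rough sketch of how a nodeId -> used slots cache could behave across a 
restart is shown below. NodeSlotCacheSketch and its methods are hypothetical 
and only illustrate the idea: the cache starts cold, a full round rebuilds it 
from all assignments, and later rounds apply per-topology changes.

```java
import java.util.HashMap;
import java.util.Map;

class NodeSlotCacheSketch {
    // nodeId -> number of used slots, kept across scheduling rounds.
    private final Map<String, Integer> usedSlotsByNode = new HashMap<>();
    // False after a (re)start, so the first round must be a full one.
    private boolean warmedUp = false;

    boolean isFullRoundRequired() {
        return !warmedUp;
    }

    /** Full round: rebuild from all assignments (topologyId -> nodeId -> slots). */
    void rebuild(Map<String, Map<String, Integer>> allAssignments) {
        usedSlotsByNode.clear();
        for (Map<String, Integer> perNode : allAssignments.values()) {
            for (Map.Entry<String, Integer> e : perNode.entrySet()) {
                usedSlotsByNode.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        warmedUp = true;
    }

    /** Incremental round: replace one topology's old slot usage with its new usage. */
    void applyChange(Map<String, Integer> oldSlots, Map<String, Integer> newSlots) {
        if (!warmedUp) {
            throw new IllegalStateException("cache is cold; run a full round first");
        }
        for (Map.Entry<String, Integer> e : oldSlots.entrySet()) {
            usedSlotsByNode.merge(e.getKey(), -e.getValue(), Integer::sum);
        }
        for (Map.Entry<String, Integer> e : newSlots.entrySet()) {
            usedSlotsByNode.merge(e.getKey(), e.getValue(), Integer::sum);
        }
    }

    int usedSlots(String nodeId) {
        return usedSlotsByNode.getOrDefault(nodeId, 0);
    }
}
```

The same pattern would apply to the nodeId -> used resource and nodeId -> 
totalResource mappings.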




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
