GitHub user danny0405 opened a pull request:

    https://github.com/apache/storm/pull/2700

    [STORM-3093] Cache the storm id to executors mapping on master to avo…

    # What this patch is for
    Nimbus currently collects the conf/topology-ser/storm-base of every 
topology in each scheduling round, which is very heavy work. Scheduling can 
still take minutes even now that we have switched to RPC heartbeats and 
assignment distribution.
    
    So I decided to redesign the scheduler so that we only schedule the 
topologies that need it: those that have dead workers or too few workers.
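    
    As a rough illustration of that selection rule, here is a minimal sketch. 
The class, field, and method names are hypothetical and are not Storm's 
scheduler API; it only assumes that a per-topology count of requested, alive, 
and dead workers is available.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every name here is an assumption, not Storm's real API.
public class DeltaScheduleSketch {

    /** Minimal per-topology view assumed by this sketch. */
    public static final class TopologyState {
        final String id;
        final int requestedWorkers;
        final int aliveWorkers;
        final int deadWorkers;

        public TopologyState(String id, int requestedWorkers, int aliveWorkers, int deadWorkers) {
            this.id = id;
            this.requestedWorkers = requestedWorkers;
            this.aliveWorkers = aliveWorkers;
            this.deadWorkers = deadWorkers;
        }
    }

    /** A topology needs scheduling if it has dead workers or fewer workers than requested. */
    public static boolean needsScheduling(TopologyState t) {
        return t.deadWorkers > 0 || t.aliveWorkers < t.requestedWorkers;
    }

    /** Returns the ids of the topologies that should enter the scheduling round, in input order. */
    public static List<String> topologiesToSchedule(List<TopologyState> all) {
        List<String> result = new ArrayList<>();
        for (TopologyState t : all) {
            if (needsScheduling(t)) {
                result.add(t.id);
            }
        }
        return result;
    }
}
```

    Healthy topologies are skipped entirely, so the cost of a round would 
depend on how many topologies changed rather than on how many exist.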
    
    I checked out the code and found that the id->executors mapping is 
recomputed every time for every topology, which is a heavy and unnecessary 
computation, because this mapping is fixed for a topology unless we rebalance 
or kill it.
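    
    The caching idea can be sketched roughly as follows. This is not the code 
in the patch: the class name, the loader function, and the representation of 
an executor as a list of integers are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of a topology-id -> executors cache; not the patch's actual code.
public class ExecutorMappingCacheSketch {

    // Assumption: an executor is modeled as a list of integers (for example a task range).
    private final Map<String, Set<List<Integer>>> idToExecutors = new ConcurrentHashMap<>();

    // The expensive computation that would otherwise run in every scheduling round.
    private final Function<String, Set<List<Integer>>> loader;

    public ExecutorMappingCacheSketch(Function<String, Set<List<Integer>>> loader) {
        this.loader = loader;
    }

    /** Computes the mapping on first access and serves the cached value afterwards. */
    public Set<List<Integer>> executorsFor(String topologyId) {
        return idToExecutors.computeIfAbsent(topologyId, loader);
    }

    /** Drops the cached entry; meant to be called when a topology is rebalanced or killed. */
    public void invalidate(String topologyId) {
        idToExecutors.remove(topologyId);
    }
}
```

    Invalidation is tied to the only two events named above as changing the 
mapping, rebalance and kill, so entries need no time-based expiry.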
    
    So I refactored the code a little here. This will pay off even more once 
the scheduler is redesigned for delta-scheduling (which is very lightweight 
even when there are thousands of topologies on one cluster).
    
    For now this is enough for us.
    
    This is the JIRA: 
[STORM-3093](https://issues.apache.org/jira/browse/STORM-3093).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/danny0405/storm delta-schedule

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/2700.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2700
    
----
commit 85fc30f4f5970ccc6aa67a4e27090b2fd29cd76b
Author: chenyuzhao <chenyuzhao@...>
Date:   2018-06-02T09:46:58Z

    [STORM-3093] Cache the storm id to executors mapping on master to avoid 
repeat computation

----

