GitHub user danny0405 opened a pull request:
https://github.com/apache/storm/pull/2700
[STORM-3093] Cache the storm id to executors mapping on master to avo…
# What this patch is for
Nimbus currently collects every topology's conf/topology-ser/storm-base
to compute a scheduling round, which is very heavy work. Scheduling can
still take minutes even now that we have switched to RPC heartbeats and
assignment distribution.
So I decided to redesign the scheduler so that we only schedule the
topologies that need it: those that have dead workers or too few workers.
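As a rough illustration of that selection rule, here is a minimal sketch. It is not code from this patch: `DeltaScheduleFilter`, `TopologyState`, and its fields are hypothetical names, and it assumes nimbus already knows the requested, assigned, and alive worker counts for each topology.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Storm: picks the topologies that need scheduling.
final class DeltaScheduleFilter {

    // Hypothetical per-topology summary of worker counts.
    static final class TopologyState {
        final String id;
        final int requestedWorkers; // workers the topology asked for
        final int assignedWorkers;  // workers currently assigned
        final int aliveWorkers;     // assigned workers that still heartbeat

        TopologyState(String id, int requestedWorkers, int assignedWorkers, int aliveWorkers) {
            this.id = id;
            this.requestedWorkers = requestedWorkers;
            this.assignedWorkers = assignedWorkers;
            this.aliveWorkers = aliveWorkers;
        }
    }

    // A topology needs scheduling if some assigned workers are dead
    // or if it has fewer assigned workers than it requested.
    static boolean needsScheduling(TopologyState t) {
        boolean hasDeadWorkers = t.aliveWorkers < t.assignedWorkers;
        boolean tooFewWorkers = t.assignedWorkers < t.requestedWorkers;
        return hasDeadWorkers || tooFewWorkers;
    }

    // Returns the ids of the topologies to include in this scheduling round.
    static List<String> select(List<TopologyState> all) {
        List<String> result = new ArrayList<>();
        for (TopologyState t : all) {
            if (needsScheduling(t)) {
                result.add(t.id);
            }
        }
        return result;
    }
}
```

Healthy topologies are skipped entirely, so the cost of a round would depend on how many topologies changed rather than on how many are running.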
I checked out the code and found that the id->executors mapping is
recomputed every time for every topology. That is a heavy computation and
unnecessary, because this mapping is fixed for a topology unless we
rebalance or kill it.
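The caching idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in this patch: `ExecutorCache` is a hypothetical class, executors are modeled as `[startTask, endTask]` pairs, and the expensive computation is passed in as a function.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache of topology id -> executors; not the class used in the patch.
final class ExecutorCache {
    // Each executor is modeled as a [startTask, endTask] pair (an assumption for this sketch).
    private final Map<String, List<List<Integer>>> idToExecutors = new ConcurrentHashMap<>();
    // The expensive computation that would otherwise run in every scheduling round.
    private final Function<String, List<List<Integer>>> computeExecutors;

    ExecutorCache(Function<String, List<List<Integer>>> computeExecutors) {
        this.computeExecutors = computeExecutors;
    }

    // Returns the cached executors, computing them only on the first request.
    List<List<Integer>> get(String topologyId) {
        return idToExecutors.computeIfAbsent(topologyId, computeExecutors);
    }

    // Call on rebalance or kill, the only events that change the mapping.
    void invalidate(String topologyId) {
        idToExecutors.remove(topologyId);
    }
}
```

With this shape, repeated scheduling rounds reuse the stored mapping, and only a rebalance or kill forces a recomputation on the next lookup.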
So I refactored the code a little here. This will be more powerful once the
scheduler is redesigned for delta-scheduling, which is very lightweight even
when there are thousands of topologies on one cluster.
For now this is enough for us.
This is the JIRA:
[STORM-3093](https://issues.apache.org/jira/browse/STORM-3093).
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/danny0405/storm delta-schedule
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/2700.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2700
----
commit 85fc30f4f5970ccc6aa67a4e27090b2fd29cd76b
Author: chenyuzhao <chenyuzhao@...>
Date: 2018-06-02T09:46:58Z
[STORM-3093] Cache the storm id to executors mapping on master to avoid
repeat computation
----
---