[ 
https://issues.apache.org/jira/browse/STORM-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuzhao Chen updated STORM-2693:
-------------------------------
    Description: 
Currently, for a Storm cluster with 40 hosts [32 cores/128G memory each] and hundreds 
of topologies, nimbus submission and killing take minutes to finish. 
For example, in a cluster with 300 topologies, it takes about 8 
minutes to submit a topology, which seriously hurts our efficiency.

So I checked the nimbus code and found two factors that affect nimbus 
submission/killing time in a scheduling round:
* reading existing assignments from ZooKeeper for every topology [takes about 
4 seconds for a 300-topology cluster]
* reading all the workers' heartbeats and updating the state in the nimbus cache 
[takes about 30 seconds for a 300-topology cluster]

The key point here is that Storm currently uses ZooKeeper to collect heartbeats [not RPC], 
and also keeps the physical plan [assignments] in ZooKeeper, although it could be kept 
entirely local to nimbus.

So I think we should make some changes to Storm's heartbeat and assignment 
management.

For assignment promotion:
1. nimbus will keep the assignments in memory and also cache a copy to 
local disk
2. on restart or an HA leader trigger, nimbus will recover the 
assignments from disk
3. nimbus will tell each supervisor its assignment on every scheduling round
4. supervisors will sync assignments at a fixed interval
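The in-memory/on-disk caching in steps 1 and 2 could be sketched roughly as below. This is only an illustrative sketch, not Storm code: the class and field names (AssignmentCache, backingDir) are hypothetical, and assignments are modeled as plain strings rather than Storm's real Assignment structures.

```java
import java.io.*;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: nimbus keeps assignments in memory and mirrors each
// one to local disk, so a restarted or newly elected leader can recover
// them without reading every topology's assignment from ZooKeeper.
public class AssignmentCache {
    private final Map<String, String> assignments = new ConcurrentHashMap<>();
    private final File backingDir;

    public AssignmentCache(File backingDir) {
        this.backingDir = backingDir;
        backingDir.mkdirs();
    }

    // Step 1: update memory and cache a copy to local disk.
    public void setAssignment(String topologyId, String serializedAssignment)
            throws IOException {
        assignments.put(topologyId, serializedAssignment);
        try (Writer w = new FileWriter(new File(backingDir, topologyId))) {
            w.write(serializedAssignment);
        }
    }

    public String getAssignment(String topologyId) {
        return assignments.get(topologyId);
    }

    // Step 2: on restart or HA leader election, rebuild the in-memory
    // map from the on-disk copies.
    public void recoverFromDisk() throws IOException {
        File[] files = backingDir.listFiles();
        if (files == null) {
            return;
        }
        for (File f : files) {
            String body = new String(java.nio.file.Files.readAllBytes(f.toPath()));
            assignments.put(f.getName(), body);
        }
    }
}
```

With something like this, a ZooKeeper read per topology on every scheduling round is replaced by a local map lookup, and disk is touched only on write and on recovery.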


For heartbeat promotion:
1. workers will report their executors as ok or failed to the supervisor at a fixed interval
2. supervisors will report workers' heartbeats to nimbus at a fixed interval
3. if a supervisor dies, it will tell nimbus through a runtime hook,
    or nimbus will detect it through supervisor awareness if it survives
4. the supervisor will decide whether a worker is running ok or is invalid, and will 
tell nimbus which executors of every topology are ok
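The aggregation in steps 1 and 4 above could be sketched as follows. Again a hypothetical sketch, not Storm code: the class SupervisorHeartbeats and its method names are invented for illustration, and the summary it produces stands in for what the supervisor would push to nimbus over RPC in step 2.

```java
import java.util.*;

// Hypothetical sketch of the proposed heartbeat flow: workers report each
// executor's health to their local supervisor; the supervisor decides which
// executors are ok and periodically sends one aggregate per topology to
// nimbus, instead of every worker writing its heartbeat to ZooKeeper.
public class SupervisorHeartbeats {
    // topologyId -> executorId -> whether the last report was healthy
    private final Map<String, Map<Integer, Boolean>> reports = new HashMap<>();

    // Step 1: a worker reports an executor as ok or failed.
    public void workerReport(String topologyId, int executorId, boolean ok) {
        reports.computeIfAbsent(topologyId, k -> new HashMap<>())
               .put(executorId, ok);
    }

    // Step 4: the supervisor decides which executors of every topology are
    // ok; this summary is what it would report to nimbus (step 2).
    public Map<String, List<Integer>> healthyExecutors() {
        Map<String, List<Integer>> summary = new HashMap<>();
        for (Map.Entry<String, Map<Integer, Boolean>> topo : reports.entrySet()) {
            List<Integer> ok = new ArrayList<>();
            for (Map.Entry<Integer, Boolean> exec : topo.getValue().entrySet()) {
                if (exec.getValue()) {
                    ok.add(exec.getKey());
                }
            }
            Collections.sort(ok);
            summary.put(topo.getKey(), ok);
        }
        return summary;
    }
}
```

The design point is that nimbus then receives one aggregated report per supervisor at a fixed interval, rather than reading every worker's heartbeat znode itself.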





> Topology submission or kill takes too much time when topologies grow to a few 
> hundred
> -------------------------------------------------------------------------------------
>
>                 Key: STORM-2693
>                 URL: https://issues.apache.org/jira/browse/STORM-2693
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-core
>    Affects Versions: 0.9.6, 1.0.2, 1.1.0, 1.0.3
>            Reporter: Yuzhao Chen
>             Fix For: 1.1.0
>
>         Attachments: 2FA30CD8-AF15-4352-992D-A67BD724E7FB.png, 
> D4A30D40-25D5-4ACF-9A96-252EBA9E6EF6.png
>



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
