[
https://issues.apache.org/jira/browse/STORM-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
StaticMian updated STORM-3016:
------------------------------
Description:
When a job whose components have a large total parallelism (for example, 5000)
is submitted to a Storm cluster, Nimbus might crash. The workflow is as
follows:
1) Nimbus computes the assignment
2) Nimbus sends the assignment to zk
{color:#ff0000}3) When the assignment mapping info string is too long because
the job's total parallelism is too large, sending this info to zk fails (the
default zNode data length limit is 1M){color}
{color:#333333}4) Nimbus keeps retrying to send this assignment info; after
several attempts it gives up and crashes, which greatly impacts the stability
of the cluster{color}
was:
When a job whose components have a large total parallelism (for example, 5000)
is submitted to a Storm cluster, Nimbus might crash. The workflow is as
follows:
1) Nimbus computes the assignment
2) Nimbus sends the assignment to zk
{color:#ff0000}3) When the assignment mapping info string is too long because
the job's total parallelism is too large, sending this info to zk fails (the
default zNode data length limit is 1M){color}
{color:#333333}4) Nimbus fails to get the assignment for this job from zk, then
gives up and crashes, which greatly impacts the stability of the
cluster{color}
> Nimbus gets down when job has large amount of parallelism components
> --------------------------------------------------------------------
>
> Key: STORM-3016
> URL: https://issues.apache.org/jira/browse/STORM-3016
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-core
> Affects Versions: 2.0.0
> Reporter: StaticMian
> Priority: Major
> Labels: security
> Fix For: 2.0.0
>
> Attachments: nimbus.log
>
> Original Estimate: 96h
> Remaining Estimate: 96h
>
> When a job whose components have a large total parallelism (for example,
> 5000) is submitted to a Storm cluster, Nimbus might crash. The workflow is as
> follows:
> 1) Nimbus computes the assignment
> 2) Nimbus sends the assignment to zk
> {color:#ff0000}3) When the assignment mapping info string is too long because
> the job's total parallelism is too large, sending this info to zk fails (the
> default zNode data length limit is 1M){color}
> {color:#333333}4) Nimbus keeps retrying to send this assignment info; after
> several attempts it gives up and crashes, which greatly impacts the stability
> of the cluster{color}
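The failure condition in step 3 can be illustrated with a minimal Java sketch. It is not Nimbus code: the class `AssignmentSizeCheck` and its methods are hypothetical names introduced here, and the check is only approximate, since a real ZooKeeper request also carries the path and protocol headers. The one external fact it relies on is that ZooKeeper reads its size limit from the `jute.maxbuffer` system property, which defaults to 0xfffff bytes (just under 1M).

```java
public class AssignmentSizeCheck {
    // ZooKeeper's default jute.maxbuffer: 0xfffff bytes, just under 1 MiB.
    static final int DEFAULT_ZNODE_LIMIT = 0xfffff;

    // Read the limit the same way ZooKeeper does: from the jute.maxbuffer
    // system property, falling back to the default when it is unset.
    static int znodeLimit() {
        return Integer.getInteger("jute.maxbuffer", DEFAULT_ZNODE_LIMIT);
    }

    // Hypothetical pre-flight check: true if the serialized payload is
    // smaller than the limit. Approximate, because the actual request adds
    // path and header bytes on top of the payload.
    static boolean fitsInZnode(byte[] payload, int limit) {
        return payload.length < limit;
    }

    public static void main(String[] args) {
        int limit = znodeLimit();
        byte[] small = new byte[64 * 1024];        // stand-in for a small assignment
        byte[] large = new byte[2 * 1024 * 1024];  // stand-in for an oversized one
        System.out.println("limit=" + limit
                + " small fits=" + fitsInZnode(small, limit)
                + " large fits=" + fitsInZnode(large, limit));
    }
}
```

A check of this kind would let the caller fail fast with a clear error instead of retrying a write that cannot succeed. Whether the actual fix should reject the job, split the data, or raise `jute.maxbuffer` is a separate question that this sketch does not answer.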
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)