Surajeet created STORM-3773:
-------------------------------
Summary: Worker Reassignment - Difference between Storm 2.x and
Storm 1.x
Key: STORM-3773
URL: https://issues.apache.org/jira/browse/STORM-3773
Project: Apache Storm
Issue Type: Bug
Affects Versions: 2.2.0
Reporter: Surajeet
We are currently on Storm 1.2.1 and are in the process of upgrading to Storm
2.2.0.
We observed the following while upgrading to 2.2.0:
1) In a Storm cluster (4 nodes) with 8 topologies running (with a 1-to-1
mapping between workers and topologies), when I bring down nimbus and the
supervisor on one of the nodes (say Node 1, which is not the nimbus leader),
the workers running on that node get reassigned to the other 3 nodes, even
though they are still running on Node 1. So I have 2 worker processes for the
same topology running at the same time (I saw this behaviour with and without
Pacemaker). The worker process does get killed when nimbus and the supervisor
are brought back up on Node 1.
2) Observed from the worker logs that the worker sends heartbeats to the local
supervisor and the nimbus leader, which in 1.2.1 used to happen through
ZooKeeper (I saw this behaviour in 2.2.0 with and without Pacemaker).
If I bring down nimbus and the supervisor on the node where nimbus is the
leader, the worker processes are reassigned, and in some cases this leads to
zombie worker processes (they are not killed when storm kill is executed).
The behaviour above (reassignment of workers) doesn't happen with Storm 1.2.1.
Since this is a fundamental design change between 1.x and 2.x, is there any
documentation that describes it in detail? (I couldn't find any in the Release
Notes.)
(I am raising this as a bug because it's preventing us from moving to 2.2.0
due to the issue mentioned in 2).)
--
This message was sent by Atlassian Jira
(v8.3.4#803005)