[
https://issues.apache.org/jira/browse/STORM-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875537#comment-17875537
]
Pedro Azevedo commented on STORM-3773:
--------------------------------------
Hey guys, I'm using Storm v2.6.1, and every time I restart the Nimbus leader
(I currently run 3 Nimbus instances for high availability) the workers get
reassigned. I'm using ZooKeeper instead of Pacemaker.
> Worker Reassignment - Difference between Storm 2.x and Storm 1.x
> -----------------------------------------------------------------
>
> Key: STORM-3773
> URL: https://issues.apache.org/jira/browse/STORM-3773
> Project: Apache Storm
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Surajeet
> Priority: Major
>
> We are currently on Storm 1.2.1 and are in the process of upgrading to
> Storm 2.2.0.
> We observed the following while upgrading to 2.2.0:
> 1) In a Storm cluster (4 nodes) with 8 topologies running (with a 1-to-1
> mapping between workers and topologies), when I bring down Nimbus and the
> Supervisor on one of the nodes (let's say Node 1, which is not the Nimbus
> leader), the workers running on that node get reassigned to the other 3
> nodes, even though they are still running on Node 1. So I have 2 worker
> processes for the same topology running at the same time (I saw this
> behaviour with and without Pacemaker). The worker process does get killed
> once Nimbus and the Supervisor are brought back up on Node 1.
> 2) I observed from the worker logs that the worker sends heartbeats to the
> local Supervisor and the Nimbus leader, which in 1.2.1 used to happen
> through ZooKeeper (I saw this behaviour in 2.2.0 with and without
> Pacemaker).
> If I bring down Nimbus and the Supervisor on the node where Nimbus is the
> leader, the worker processes are reassigned, and in some cases this leads
> to zombie worker processes (they are not killed when storm kill is
> executed).
> The above behaviour (reassignment of workers) doesn't happen with Storm
> 1.2.1.
> Since this is a fundamental design change between 1.x and 2.x, is there any
> documentation that describes it in detail? (I couldn't find any in the
> Release Notes.)
> (I am raising this as a bug because it is preventing us from moving to 2.2.0
> due to the issue mentioned in 2).)
>