Surajeet created STORM-3773:
-------------------------------

             Summary: Worker Reassignment - Difference between Storm 2.x and 
Storm 1.x
                 Key: STORM-3773
                 URL: https://issues.apache.org/jira/browse/STORM-3773
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 2.2.0
            Reporter: Surajeet


We are currently on Storm 1.2.1 and are in the process of upgrading to Storm
2.2.0. We observed the following during the upgrade:

1) In a 4-node Storm cluster running 8 topologies (one worker per topology),
when I bring down nimbus and the supervisor on one node (say Node 1, which is
not the nimbus leader), the workers running on that node get reassigned to the
other 3 nodes, even though the original workers are still running on Node 1.
So I end up with 2 worker processes for the same topology running at the same
time (I saw this behaviour with and without Pacemaker). The duplicate worker
process does get killed once nimbus and the supervisor are brought back up on
Node 1.

2) I observed from the worker logs that a worker sends heartbeats to its local
supervisor and to the nimbus leader, which in 1.2.1 used to happen through
ZooKeeper (I saw this behaviour in 2.2.0 with and without Pacemaker).
If I bring down nimbus and the supervisor on the node where nimbus is the
leader, the worker processes are reassigned, and in some cases this leaves
zombie worker processes (they are not killed when storm kill is executed).

The behaviour described above (reassignment of workers) doesn't happen with
Storm 1.2.1.

Since this is a fundamental design change between 1.x and 2.x, is there any
documentation that describes it in detail? (I couldn't find it in the Release
Notes.)

(I am raising this as a bug because it's preventing us from moving to 2.2.0,
due to the issue mentioned in 2).)

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
