[ 
https://issues.apache.org/jira/browse/STORM-261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945545#comment-13945545
 ] 

Robert Joseph Evans commented on STORM-261:
-------------------------------------------

bq. I haven't actually verified this, but when it's scheduled out of the 
topology, shouldn't it stop receiving new data?

We ran into this issue while running some tests. Someone had accidentally 
brought up a supervisor on the same node as nimbus.  They took the supervisor 
back down, and then noticed that things were a bit out of whack.  A word count 
topology was running entirely on a node that was no longer part of the 
cluster.  Out of curiosity I rebalanced the topology to see what would happen, 
and afterwards there were two copies of the topology running.  Looking at the 
logs, both appeared to be processing data. 

I marked this as minor because, as [~jmlogan] stated before, I didn't see much 
of a way this would cause problems in the real world.  Thinking about it 
further, I can see some use cases where a spout that is left active could 
cause problems, such as consuming data that is never fully processed, or 
continuing to process data after the topology has been killed.
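
A minimal sketch of the self-check this issue asks for, under stated 
assumptions: WorkerSelfCheck, its fields, and the node-to-ports map are 
hypothetical names for illustration, not Storm's actual API.  The idea is that 
the worker polls the current assignment and decides to exit once it has been 
missing from it for several consecutive polls, so a single transient read does 
not kill a healthy worker.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: tracks whether this worker is still scheduled. */
public class WorkerSelfCheck {
    private final String nodeId;   // node this worker runs on (assumed identifier)
    private final int port;        // worker slot port (assumed identifier)
    private final int maxMisses;   // consecutive unscheduled polls tolerated
    private int misses = 0;

    public WorkerSelfCheck(String nodeId, int port, int maxMisses) {
        this.nodeId = nodeId;
        this.port = port;
        this.maxMisses = maxMisses;
    }

    /**
     * Call once per poll with a snapshot of the assignment
     * (node id -> ports that have a worker scheduled).
     * Returns true when the worker should shut itself down.
     */
    public boolean shouldExit(Map<String, Set<Integer>> assignment) {
        Set<Integer> ports = assignment.get(nodeId);
        boolean scheduled = ports != null && ports.contains(port);
        // Reset the counter whenever the worker is seen in the assignment.
        misses = scheduled ? 0 : misses + 1;
        return misses >= maxMisses;
    }
}
```

A worker's heartbeat loop could call shouldExit on each tick and terminate the 
process when it returns true; how the assignment snapshot is obtained is left 
open here.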

> Workers should commit suicide if not scheduled any more.
> --------------------------------------------------------
>
>                 Key: STORM-261
>                 URL: https://issues.apache.org/jira/browse/STORM-261
>             Project: Apache Storm (Incubating)
>          Issue Type: Bug
>    Affects Versions: 0.9.2-incubating
>            Reporter: Robert Joseph Evans
>            Priority: Minor
>
> I know this is a bit far-fetched.
> Suppose a supervisor dies and does not come back up again (a dead HDD, for 
> example) but its workers remain up.  If the scheduler then decides to move a 
> worker to a new host, during a rebalance for instance, the old workers will 
> never go away.  Ideally the worker should know that it is not running in the 
> correct place any more and die instead of waiting for the supervisor to kill 
> it.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
