Scott Bessler created STORM-1041:
------------------------------------

             Summary: Topology with kafka spout stops processing
                 Key: STORM-1041
                 URL: https://issues.apache.org/jira/browse/STORM-1041
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 0.9.5
            Reporter: Scott Bessler
            Priority: Critical


Topology:
 KafkaSpout (1 task/executor) -> bolt that does grouping (1 task/executor) -> 
bolt that does processing (176 tasks/executors)
 8 workers
 Using Netty

Sometimes when a worker dies (we've seen it happen due to an OOM or load from a 
co-located worker), it tries to restart on the same node, then 20s later shuts 
down and starts on another node.

While the worker was dead and then killed, Netty on the other workers dropped 
messages. In theory these messages should time out and be replayed. Our message 
timeout is 30s.

However, these messages never time out, and MAX_SPOUT_PENDING has been reached, 
so no more tuples are emitted or processed.
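
To illustrate why the topology stalls, here is a minimal, self-contained sketch 
of the interaction between the pending limit and the message timeout. It is not 
Storm code: PendingTracker, emittedAfterLoss, the 1s tick, and the pending limit 
of 4 are all hypothetical, chosen only for illustration. Only the 30s timeout 
comes from our setup.

```java
import java.util.HashMap;
import java.util.Map;

public class PendingStallSketch {

    /** Hypothetical stand-in for a spout's pending-tuple bookkeeping; not Storm code. */
    static final class PendingTracker {
        private final int maxPending;
        private final long timeoutMs;
        private final boolean expiryWorks;
        // message id -> time the tuple was emitted
        private final Map<Long, Long> emitTimeById = new HashMap<>();
        private long nextId = 0;

        PendingTracker(int maxPending, long timeoutMs, boolean expiryWorks) {
            this.maxPending = maxPending;
            this.timeoutMs = timeoutMs;
            this.expiryWorks = expiryWorks;
        }

        /** Emits a new tuple only while the pending count is below the limit. */
        boolean tryEmit(long nowMs) {
            if (emitTimeById.size() >= maxPending) {
                return false;
            }
            emitTimeById.put(nextId++, nowMs);
            return true;
        }

        /** Drops pending entries older than the timeout, freeing slots for new tuples. */
        void expire(long nowMs) {
            if (!expiryWorks) {
                return; // models the observed behaviour: lost tuples never time out
            }
            emitTimeById.values().removeIf(t -> nowMs - t >= timeoutMs);
        }
    }

    /** Returns how many new tuples are emitted in the 60s after all pending tuples are lost. */
    static int emittedAfterLoss(boolean expiryWorks) {
        PendingTracker tracker = new PendingTracker(4, 30_000L, expiryWorks);
        // Fill the pending window; assume every one of these tuples is dropped in transit,
        // so none of them is ever acked or failed.
        while (tracker.tryEmit(0L)) {
            // keep emitting until the limit is reached
        }
        int emitted = 0;
        for (long now = 1_000L; now <= 60_000L; now += 1_000L) {
            tracker.expire(now);
            if (tracker.tryEmit(now)) {
                emitted++;
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        System.out.println("timeouts firing:     " + emittedAfterLoss(true));
        System.out.println("timeouts not firing: " + emittedAfterLoss(false));
    }
}
```

With timeouts firing, the lost tuples free their slots after 30s and emission 
resumes. With timeouts not firing, the pending window stays full and nothing 
more is emitted, which matches what we observe.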




