[ 
https://issues.apache.org/jira/browse/STORM-323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rick Kellogg updated STORM-323:
-------------------------------
    Component/s: storm-core

> Unacknowledged __tick and __metrics_tick tuples hang worker processes
> ---------------------------------------------------------------------
>
>                 Key: STORM-323
>                 URL: https://issues.apache.org/jira/browse/STORM-323
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 0.9.1-incubating
>         Environment: Storm:
> Nimbus, Supervisor and ZooKeeper running on CentOS 6.2 on m1.small
> instances (1.7 GB memory, 1 CPU, 1 core)
> Netty as the transport
> Topology:
> 2 worker processes on the same supervisor instance, each allocated 512 MB of
> heap
> Each worker process has around 30 executors running around 112
> tasks.
>            Reporter: Srinath
>            Priority: Critical
>
> Symptoms observed:
> 1. One of the bolts stops being executed after about 5 days of running
> 2. The spout gradually slows down and finally stops calling nextTuple()
> 3. The topology is non-functional since no tuples are exchanged across
> worker processes
> Notes from troubleshooting:
> 1. Data is transferred across worker processes, but the bolt does not
> receive the tuples
> 2. backtype.storm.messaging.netty.Server#message_queue is not being
> consumed.
> 3. I later found that several __tick and __metrics_tick tuples
> pile up in memory over time. The build-up is gradual, which is
> probably why it takes so long to cause any visible problems.
> I have shared the thread dumps and topology layout at
> https://drive.google.com/folderview?id=0B2F_3UACQZNESXpwZlA4MFlqSVU&usp=drive_web
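
The gradual build-up described in troubleshooting note 3 can be illustrated with a toy model. This is only a sketch and not Storm code: the class TickBacklogSketch, the method pendingAfter, and the interval values are hypothetical, and it assumes ticks are emitted at a fixed interval into an unbounded queue that is either drained every second or not drained at all.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Toy model only: this is NOT Storm's implementation. All names are hypothetical.
public class TickBacklogSketch {

    // Simulates `seconds` of run time. One "__tick" message is enqueued every
    // `tickIntervalSecs` seconds; the queue is drained once per second only if
    // `consumerRunning` is true. Returns the number of messages still queued.
    public static int pendingAfter(int seconds, int tickIntervalSecs, boolean consumerRunning) {
        Queue<String> queue = new ArrayDeque<>();
        for (int t = 1; t <= seconds; t++) {
            if (t % tickIntervalSecs == 0) {
                queue.add("__tick");
            }
            if (consumerRunning) {
                queue.clear(); // a healthy consumer keeps the queue empty
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        // Assumed values: one hour of run time, one tick every 10 seconds.
        System.out.println("healthy consumer: " + pendingAfter(3600, 10, true));
        System.out.println("stalled consumer: " + pendingAfter(3600, 10, false));
    }
}
```

In this model the backlog of a stalled consumer grows linearly with run time, which is consistent with a problem that only becomes visible after several days.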



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
