Yeah, I also think you hit STORM-1928
<https://issues.apache.org/jira/browse/STORM-1928>. ShellSpout should not
check the heartbeat while there's no interaction between the ShellSpout and
its subprocess (e.g. when max spout pending throttles emits).
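
For anyone curious what that failure mode looks like, here is a minimal
sketch of the watchdog pattern involved. This is NOT Storm's actual code;
the class and method names are invented for illustration:

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of the watchdog pattern behind STORM-1928 (illustrative
// only, not Storm's implementation). A timer task calls isTimedOut()
// periodically and halts the worker when it returns true. The bug: the
// pre-1.0.2 check runs even while the spout is idle (e.g. throttled by
// topology.max.spout.pending), so a healthy subprocess can be declared
// dead. The fix is to enforce the timeout only while an interaction with
// the subprocess is actually outstanding.
public class HeartbeatWatchdog {
    private final AtomicLong lastHeartbeatMs = new AtomicLong(System.currentTimeMillis());
    private volatile boolean waitingOnSubprocess = false;
    private final long timeoutMs;

    public HeartbeatWatchdog(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Called whenever the subprocess sends a heartbeat or any other message.
    public void onSubprocessHeartbeat() {
        lastHeartbeatMs.set(System.currentTimeMillis());
    }

    // Flipped on when work is handed to the subprocess, off when it replies.
    public void setWaiting(boolean waiting) {
        this.waitingOnSubprocess = waiting;
    }

    // The scheduled timer task would call this; true means "halt the
    // process: subprocess heartbeat timeout".
    public boolean isTimedOut() {
        return waitingOnSubprocess
                && System.currentTimeMillis() - lastHeartbeatMs.get() > timeoutMs;
    }
}
```

With the `waitingOnSubprocess` guard removed, a spout throttled by max
spout pending would stop exchanging messages, the heartbeat would go stale,
and the timer would kill the worker, which matches the symptom below.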

Btw, the release vote for Storm 1.0.2 RC4 is now live. I encourage you to
participate in the vote so that we can make 1.0.2 more stable.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Sat, Jul 30, 2016 at 4:43 AM, Andrew Montalenti <[email protected]> wrote:

> Hi Tim,
>
> You're actually hitting a ShellSpout death failure that we also
> identified in production at Parse.ly using streamparse. It has to do with
> the heartbeat check in the ShellSpout implementation.
>
> If a ShellBolt's subprocess misses its heartbeat, Storm automatically
> restarts the faulty component. But if a ShellSpout's subprocess does, the
> topology hangs in just the way you're describing.
>
> There is actually a fix for this in Storm 1.0.2 that (we think) addresses
> the issue, described in STORM-1928
> <https://issues.apache.org/jira/browse/STORM-1928>. Since this release is now
> available on GitHub <https://github.com/apache/storm/releases/tag/v1.0.2>,
> you may want to give it a try and see if the issue goes away.
>
> It would be good for the community to know whether this fix actually
> resolves the issue in 1.0.x. We are testing some patches that do the same
> against the 0.9.x line over at Parse.ly.
>
> --
> Andrew Montalenti | CTO, Parse.ly
>
> On Thu, Jul 28, 2016 at 1:30 PM, Tim Hopper <[email protected]
> > wrote:
>
>> I’m running streamparse 3-based topologies on Storm 1.0.1.
>>
>> I’m able to improve my throughput by increasing the max pending tuples.
>> However, the topology runs for a while and then dies. I get this message in
>> the logs:
>>
>>
>> 2016-07-28 16:21:21.946 o.a.s.s.ShellSpout [ERROR] Halting process: ShellSpout died.
>> java.lang.RuntimeException: subprocess heartbeat timeout
>>     at org.apache.storm.spout.ShellSpout$SpoutHeartbeatTimerTask.run(ShellSpout.java:275) [storm-core-1.0.1.jar:1.0.1]
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
>>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_91]
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_91]
>>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_91]
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
>>     at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
>> 2016-07-28 16:21:21.955 o.a.s.d.executor [ERROR]
>>
>>
>>
>> The bizarre thing to me is that the Storm UI gives no indication of
>> what’s going on. No tuples fail. No errors appear. The Storm metrics
>> just stop changing. The worker processes aren’t restarted. All my statsd
>> metrics flatline. It just dies.
>>
>> Can anyone help me troubleshoot this? From stats, I can see that I’m not
>> running out of system memory. Perhaps the heap is filling up?
>>
>
>
