I’m running streamparse3-based topologies on Storm 1.0.1. I’m able to improve my throughput by increasing the max pending tuples. However, the topology runs for a while and then dies. I get this message in the logs:
2016-07-28 16:21:21.946 o.a.s.s.ShellSpout [ERROR] Halting process: ShellSpout died.
java.lang.RuntimeException: subprocess heartbeat timeout
    at org.apache.storm.spout.ShellSpout$SpoutHeartbeatTimerTask.run(ShellSpout.java:275) [storm-core-1.0.1.jar:1.0.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_91]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_91]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
2016-07-28 16:21:21.955 o.a.s.d.executor [ERROR]

The bizarre thing to me is that the Storm UI gives no indication of what’s going on. No tuples fail. No errors appear. The Storm metrics just stop changing. The worker processes aren’t restarted. All my statsd metrics flatline. It just dies.

Can anyone help me troubleshoot this? From system stats, I can see that I’m not running out of system memory. Perhaps the heap is filling up?
