In 0.4.0 we never recover from this until restarting, I believe. There are two issues: why we don’t recover (which has a temporary fix in https://github.com/apache/metron/pull/741) and why the connection is closed in the first place. I don’t think the HDFS streams report *why* the connection is closed, so the root cause is difficult to track down. Recovery is therefore the first and more important step, I think.
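
To make the recovery idea concrete, here is a minimal sketch of "reopen and retry once when the stream reports it is closed". It is only an illustration under assumptions: it is not the Metron SourceHandler code and not necessarily what the PR above does. `ReopeningWriter` and `StreamFactory` are hypothetical names, and a plain `java.io.OutputStream` stands in for the HDFS output stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.ClosedChannelException;

/** Hypothetical sketch; not the Metron SourceHandler and not the change in the PR. */
class ReopeningWriter {

  /** Hypothetical factory for a fresh stream, e.g. one that reopens the output file. */
  interface StreamFactory {
    OutputStream open() throws IOException;
  }

  private final StreamFactory factory;
  private OutputStream out;
  private int reopenCount = 0;

  ReopeningWriter(StreamFactory factory) throws IOException {
    this.factory = factory;
    this.out = factory.open();
  }

  /** Writes the bytes; if the stream reports it is closed, reopens once and retries. */
  void write(byte[] bytes) throws IOException {
    try {
      out.write(bytes);
    } catch (ClosedChannelException e) {
      // The old stream is unusable: drop it and retry on a fresh one.
      try {
        out.close();
      } catch (IOException ignored) {
        // Nothing useful can be done if closing a dead stream fails.
      }
      out = factory.open();
      reopenCount++;
      out.write(bytes); // a second failure propagates to the caller
    }
  }

  /** Number of times the stream had to be reopened. */
  int getReopenCount() {
    return reopenCount;
  }
}
```

This only addresses recovery. It does nothing to explain why the stream was closed, so logging each reopen would still be worthwhile.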

On September 15, 2017 at 08:53:59, Nick Allen ([email protected]) wrote:

Hi Laurens - Sorry for such a delayed response, but are you still seeing this issue? Does data stop being indexed when this happens? Any other clues you can offer?

On Thu, Aug 17, 2017 at 11:58 AM Laurens Vets <[email protected]> wrote:

> Hello,
>
> I suddenly receive the following error messages:
>
> java.nio.channels.ClosedChannelException
>     at org.apache.hadoop.hdfs.DFSOutputStream.checkClosed(DFSOutputStream.java:1521)
>     at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:104)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
>     at java.io.DataOutputStream.write(DataOutputStream.java:107)
>     at java.io.FilterOutputStream.write(FilterOutputStream.java:97)
>     at org.apache.metron.writer.hdfs.SourceHandler.handle(SourceHandler.java:71)
>     at org.apache.metron.writer.hdfs.HdfsWriter.write(HdfsWriter.java:116)
>     at org.apache.metron.writer.BulkWriterComponent.write(BulkWriterComponent.java:138)
>     at org.apache.metron.writer.bolt.BulkMessageWriterBolt.execute(BulkMessageWriterBolt.java:117)
>     at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734)
>     at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466)
>     at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40)
>     at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451)
>     at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430)
>     at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73)
>     at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853)
>     at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484)
>     at clojure.lang.AFn.run(AFn.java:22)
>     at java.lang.Thread.run(Thread.java:745)
>
> This is on Metron 0.4.0-release. I don't immediately see anything wrong
> on the OS or anywhere else... Any idea what might be going on?
