I'm trying to stop the history server from spamming my logs with EOFExceptions when it tries to read a history file from HDFS that is both LZ4-compressed and incomplete. The actual exception is:
java.io.EOFException: Stream ended prematurely
    at net.jpountz.lz4.LZ4BlockInputStream.readFully(LZ4BlockInputStream.java:218)
    at net.jpountz.lz4.LZ4BlockInputStream.refill(LZ4BlockInputStream.java:192)
    at net.jpountz.lz4.LZ4BlockInputStream.read(LZ4BlockInputStream.java:117)
    at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
    at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
    at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
    at java.io.InputStreamReader.read(InputStreamReader.java:184)
    at java.io.BufferedReader.fill(BufferedReader.java:161)
    at java.io.BufferedReader.readLine(BufferedReader.java:324)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at scala.io.BufferedSource$BufferedLineIterator.hasNext(BufferedSource.scala:67)
    at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:55)
    at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$replay(FsHistoryProvider.scala:443)
    at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:278)
    at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$10.apply(FsHistoryProvider.scala:275)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
    at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
    at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$mergeApplicationListing(FsHistoryProvider.scala:275)
    at org.apache.spark.deploy.history.FsHistoryProvider$$anonfun$checkForLogs$1$$anon$2.run(FsHistoryProvider.scala:209)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

The bit I'm struggling with is handling this in ReplayListenerBus.scala. I tried adding the following to the try/catch:

    case eof: java.io.EOFException =>
      logWarning(s"EOFException (probably due to incomplete lz4) at $sourceName", eof)

but this never seems to get triggered: it still dumps the whole exception out to the log. I feel like there's something basic I'm missing for the exception not to be caught by the try/catch in ReplayListenerBus. Can anyone point me in the right direction?

Thanks,
Andrew
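
For reference, here is a minimal standalone sketch of two general ways a Scala catch case for EOFException can fail to fire. It does not reproduce the actual ReplayListenerBus code, and whether either pattern applies there is an assumption that would need checking against the Spark source. The object CatchOrderSketch, the helper truncatedLines and the returned strings are all hypothetical names made up for illustration. The first pattern is case ordering: EOFException is a subclass of IOException, and catch cases are tried top to bottom. The second is scope: the stack trace above shows the exception coming out of hasNext, so a handler wrapped only around the per-line processing would never see it.

```scala
import java.io.{EOFException, IOException}

// Hypothetical illustration only; not the Spark implementation.
object CatchOrderSketch {

  // Stand-in for a line iterator over a truncated stream:
  // it yields one line, then hasNext throws EOFException.
  def truncatedLines(): Iterator[String] = new Iterator[String] {
    private var served = false
    def hasNext: Boolean =
      if (!served) true else throw new EOFException("Stream ended prematurely")
    def next(): String = { served = true; "line 1" }
  }

  // Pattern 1a: the broader IOException case is listed first, so it wins.
  // The EOFException case below it can never match (the compiler may
  // warn that it is unreachable).
  def broadFirst(): String =
    try throw new EOFException("Stream ended prematurely")
    catch {
      case _: IOException  => "handled as IOException"
      case _: EOFException => "handled as EOFException"
    }

  // Pattern 1b: the more specific case is listed first and is matched.
  def specificFirst(): String =
    try throw new EOFException("Stream ended prematurely")
    catch {
      case _: EOFException => "handled as EOFException"
      case _: IOException  => "handled as IOException"
    }

  // Pattern 2: the EOFException handler only wraps the per-line work,
  // but the exception is thrown by hasNext in the loop condition,
  // so it skips the inner handler and reaches the outer one.
  def innerHandlerOnly(): String = {
    val lines = truncatedLines()
    var processed = 0
    try {
      while (lines.hasNext) {
        val line = lines.next()
        try { processed += line.length }
        catch { case _: EOFException => () }
      }
      "finished"
    } catch {
      case _: IOException => "escaped the inner handler"
    }
  }

  def main(args: Array[String]): Unit = {
    println(broadFirst())        // handled as IOException
    println(specificFirst())     // handled as EOFException
    println(innerHandlerOnly())  // escaped the inner handler
  }
}
```

If either pattern matches the real code, the usual remedy would be to put the EOFException case ahead of any broader IOException or Exception case, in the try/catch that encloses the hasNext call.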