Here is an example of a full stack trace:

java.lang.RuntimeException: java.io.EOFException: seek beyond EOF: pos=3080604 vs length=533151: RAMInputStream(name=RAMInputStream(name=_cw4e_Lucene54_0.dvd) [slice=randomaccess])
    at org.apache.lucene.util.packed.DirectReader$DirectPackedReader48.get(DirectReader.java:307)
    at org.apache.lucene.codecs.lucene54.Lucene54DocValuesProducer$2.get(Lucene54DocValuesProducer.java:501)
    at org.apache.lucene.util.LongValues.get(LongValues.java:45)
    at com.uber.pindrop.lib.lucene.index.IndexSearcher.LuceneIndex.getNumericDocValues(LuceneIndex.java:318)
    at com.uber.pindrop.lib.lucene.index.IndexSearcher.LuceneIndex.extractPlacesFromDocs(LuceneIndex.java:403)
    at com.uber.pindrop.lib.lucene.index.IndexSearcher.LuceneIndex.search(LuceneIndex.java:185)
    at com.uber.pindrop.spark.SessionEvaluation.SessionEvaluation$$anonfun$9$$anonfun$11.apply(SessionEvaluation.scala:334)
    at com.uber.pindrop.spark.SessionEvaluation.SessionEvaluation$$anonfun$9$$anonfun$11.apply(SessionEvaluation.scala:324)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
    at scala.collection.immutable.List.foreach(List.scala:381)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.collection.mutable.ListBuffer.foreach(ListBuffer.scala:45)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at com.uber.pindrop.spark.SessionEvaluation.SessionEvaluation$$anonfun$9.apply(SessionEvaluation.scala:324)
    at com.uber.pindrop.spark.SessionEvaluation.SessionEvaluation$$anonfun$9.apply(SessionEvaluation.scala:306)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:789)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:789)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:332)
    at org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:330)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:958)
    at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:949)
    at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:889)
    at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:949)
    at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:694)
    at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:330)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:281)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:99)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.EOFException: seek beyond EOF: pos=3080604 vs length=533151: RAMInputStream(name=RAMInputStream(name=_cw4e_Lucene54_0.dvd) [slice=randomaccess])
    at org.apache.lucene.store.RAMInputStream.seek(RAMInputStream.java:109)
    at org.apache.lucene.store.RAMInputStream$1.seek(RAMInputStream.java:150)
    at org.apache.lucene.store.IndexInput$1.readLong(IndexInput.java:149)
    at org.apache.lucene.util.packed.DirectReader$DirectPackedReader48.get(DirectReader.java:305)
    ... 35 more
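
To make the numbers in these messages easier to compare, here is a minimal Java sketch that pulls `pos` and `length` out of a "seek beyond EOF" message and reports how far past the end of the file the read was attempted. It is not part of the failing job, and the names `SeekBeyondEof`, `parse`, and `overshoot` are made up for illustration. It only reads the message text and says nothing about the cause.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SeekBeyondEof {
    // Matches the "pos=<n> vs length=<m>" fragment of the exception message.
    private static final Pattern POS_LEN =
            Pattern.compile("pos=(\\d+) vs length=(\\d+)");

    /** Returns {pos, length}, or null if the message has no such fragment. */
    static long[] parse(String message) {
        Matcher m = POS_LEN.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    /** Bytes by which the requested position exceeds the file length. */
    static long overshoot(String message) {
        long[] v = parse(message);
        if (v == null) {
            throw new IllegalArgumentException("no pos/length in: " + message);
        }
        return v[0] - v[1];
    }

    public static void main(String[] args) {
        // The three messages reported in this thread, shortened to the relevant part.
        String[] messages = {
            "seek beyond EOF: pos=3080604 vs length=533151",
            "seek beyond EOF: pos=69377 vs length=53924",
            "seek beyond EOF: pos=98833 vs length=48835",
        };
        for (String msg : messages) {
            long[] v = parse(msg);
            System.out.printf("pos=%d length=%d overshoot=%d%n",
                    v[0], v[1], v[0] - v[1]);
        }
    }
}
```

For the three messages in this thread, the requested offset is past the end of the .dvd file by thousands of bytes or more in every case, not by a byte or two.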
On Fri, May 11, 2018 at 1:15 AM, Adrien Grand <jpou...@gmail.com> wrote:

> Can you share the full stack trace?
>
> On Fri, May 11, 2018 at 04:19, Tom Hirschfeld <tomhirschf...@gmail.com> wrote:
>
> > Hey All,
> >
> > I have a fun issue I'm dealing with at the junction of Lucene and Spark.
> >
> > I have an RDD[(key, iterator1, iterator2)]
> >
> > I run a mapPartitions on the RDD, and for each partition, I create a
> > RAMDirectory, index all of the elements in iterator1, and then search
> > the index for each element in iterator2. The issue I am having is that
> > all of my searches on the RAMDirectory fail with an "EOF exception".
> > Here is an example of one of the EOF exceptions:
> >
> > java.lang.RuntimeException: java.io.EOFException: seek beyond EOF:
> > pos=69377 vs length=53924:
> > RAMInputStream(name=RAMInputStream(name=_1bjl_Lucene54_0.dvd)
> > [slice=randomaccess]), java.lang.RuntimeException: java.io.EOFException:
> > seek beyond EOF: pos=98833 vs length=48835:
> >
> > To recap: each executor loops through, creates a RAMDirectory, writes to
> > it, and then reads from it.
> >
> > I have been trying for the past few days to address this issue, but I
> > have been unable to find out what's going on. Any hint as to what might
> > be happening here?
> >
> > Best,
> > Tom Hirschfeld