Hey All,
I have a fun issue I'm dealing with at the junction of Lucene and Spark.

I have an RDD[(key, iterator1, iterator2)]

I run mapPartitions on the RDD. For each partition, I create a
RAMDirectory, index all of the elements in iterator1, and then search
the index for each element in iterator2. The issue is that all of my
searches on the RAMDirectory fail with an EOFException. Here is an
example of one of them:

java.lang.RuntimeException: java.io.EOFException: seek beyond EOF:
pos=69377 vs length=53924:
RAMInputStream(name=RAMInputStream(name=_1bjl_Lucene54_0.dvd)
[slice=randomaccess]), java.lang.RuntimeException: java.io.EOFException:
seek beyond EOF: pos=98833 vs length=48835:



To recap: each executor loops through its partitions, creates a
RAMDirectory, writes to it, and then reads from it.
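
For reference, the write-then-read flow described above can be sketched
roughly as follows. This is only a minimal sketch under stated assumptions,
not the code from the actual job: it assumes Lucene 5.4 (suggested by the
Lucene54 file name in the trace), String elements, exact-match lookups, and
the hypothetical names LuceneSketch, indexAndSearch, and the field "id".

```scala
import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field, StringField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig, Term}
import org.apache.lucene.search.{IndexSearcher, TermQuery}
import org.apache.lucene.store.RAMDirectory

object LuceneSketch {
  // Hypothetical helper: index every element of toIndex into a fresh
  // in-memory directory, then count exact matches for each element of toSearch.
  def indexAndSearch(toIndex: Iterator[String],
                     toSearch: Iterator[String]): List[(String, Int)] = {
    val dir = new RAMDirectory()
    val writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))
    try {
      toIndex.foreach { value =>
        val doc = new Document()
        // "id" is an assumed field name; StringField indexes the value as one token.
        doc.add(new StringField("id", value, Field.Store.YES))
        writer.addDocument(doc)
      }
    } finally {
      // With the default IndexWriterConfig, close() commits pending changes.
      writer.close()
    }

    val reader = DirectoryReader.open(dir)
    try {
      val searcher = new IndexSearcher(reader)
      toSearch.map { value =>
        val top = searcher.search(new TermQuery(new Term("id", value)), 1)
        (value, top.totalHits) // totalHits is an Int in Lucene 5.x
      }.toList // Scala iterators are lazy: force the searches before closing the reader
    } finally {
      reader.close()
      dir.close()
    }
  }
}
```

In the Spark job, a helper like this would be called from inside the
function passed to mapPartitions, with iterator1 and iterator2 as its two
arguments.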


I have been trying for the past few days to address this issue, but I have
been unable to find out what's going on. Any hints as to what might be
happening here?

Best,
Tom Hirschfeld
