Hmm, the checksum is there to ensure all bits were persisted properly.

One tricky part, though, is that we first write four zero bytes, then
seek back and write the checksum over those four bytes.  Could it be
that the HBase IndexOutput impl can't handle seeking back and
overwriting?
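
To make the pattern concrete, here is a minimal sketch of "write a
placeholder, then seek back and overwrite it".  This is not Lucene's
actual code: the class and method names are made up, it uses a plain
RandomAccessFile instead of an IndexOutput, and putting the
placeholder after the payload is an assumption made for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration only; not taken from Lucene.
public class SeekBackChecksumSketch {

    /** Writes the payload, a 4-byte zero placeholder, then overwrites the placeholder with the CRC32. */
    public static void write(File file, byte[] payload) throws IOException {
        try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
            out.setLength(0);                     // start from an empty file
            out.write(payload);
            long checksumPos = out.getFilePointer();
            out.writeInt(0);                      // four zero bytes as a placeholder

            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);

            out.seek(checksumPos);                // the step a store must support
            out.writeInt((int) crc.getValue());   // overwrite the placeholder in place
        }
    }

    /** Recomputes the CRC32 of the payload and compares it with the stored 4 bytes. */
    public static boolean verify(File file) throws IOException {
        try (RandomAccessFile in = new RandomAccessFile(file, "r")) {
            byte[] payload = new byte[(int) (in.length() - 4)];
            in.readFully(payload);
            int stored = in.readInt();

            CRC32 crc = new CRC32();
            crc.update(payload, 0, payload.length);
            return stored == (int) crc.getValue();
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("segments", ".bin");
        f.deleteOnExit();
        write(f, "segment metadata".getBytes(StandardCharsets.UTF_8));
        System.out.println("checksum matches: " + verify(f));
    }
}
```

If an output implementation ignored the seek and appended the second
write instead, the reader would find the zero placeholder where it
expects the checksum, which would look like the mismatch reported
below.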

If so, you should have a look at AppendingCodec, which fixes the
places in Lucene's default codec that seek backwards on write ...
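
As a rough idea of what "not seeking backwards" means, the sketch
below computes the checksum while the bytes stream through and then
appends it at the end, so no seek is needed.  It is only an
illustration of the append-only idea, not AppendingCodec's code; the
class name, the 8-byte trailer and the in-memory sink are all
assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedOutputStream;

// Hypothetical illustration only; not taken from Lucene's AppendingCodec.
public class AppendOnlyChecksumSketch {

    /** Streams the payload through a CRC32 and appends the checksum as an 8-byte trailer. */
    public static byte[] write(byte[] payload) throws IOException {
        // The byte array sink stands in for an append-only store.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CheckedOutputStream checked = new CheckedOutputStream(sink, new CRC32());
        checked.write(payload);               // checksum is updated as bytes pass through
        checked.flush();
        long crc = checked.getChecksum().getValue();

        DataOutputStream trailer = new DataOutputStream(sink);
        trailer.writeLong(crc);               // appended after the payload, no seek
        trailer.flush();
        return sink.toByteArray();
    }

    /** Recomputes the CRC32 over everything except the trailer and compares. */
    public static boolean verify(byte[] stored) throws IOException {
        int payloadLength = stored.length - 8;
        CRC32 crc = new CRC32();
        crc.update(stored, 0, payloadLength);
        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(stored, payloadLength, 8));
        return in.readLong() == crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] stored = write("segment metadata".getBytes(StandardCharsets.UTF_8));
        System.out.println("checksum matches: " + verify(stored));
    }
}
```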

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 25, 2012 at 11:55 AM, Mihai Soloi <[email protected]> wrote:
> Hello everybody,
>
> I'm Mihai, a GSoC student, and I'm implementing an HBaseDirectory for Lucene
> [1] in order to use it on James mailbox indexing. I've implemented
> HIndexOutput/Input, they're persisting the segments file just fine in an
> HBase table, but when I try to get an IndexWriter from my directory, it
> reads the segments_N file but due to the check in SegmentInfos the current
> checksum is different from the persisted one. I've tried finding a solution
> but I can't reach one. Do you guys have any idea why this happens? This is
> the stack trace:
>
> org.apache.lucene.index.CorruptIndexException: checksum mismatch in segments file (resource: ChecksumIndexInput(anonymous IndexInput))
>    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:335)
>    at org.apache.lucene.index.IndexFileDeleter.<init>(IndexFileDeleter.java:182)
>    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1168)
>    at org.apache.james.mailbox.lucene.hbase.IndexingTest.getWriter(IndexingTest.java:82)
>    at org.apache.james.mailbox.lucene.hbase.IndexingTest.testIndexWriter(IndexingTest.java:123)
>
> [1] http://code.google.com/a/apache-extras.org/p/mailbox-lucene-index-hbase/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
