I'm attempting to create a Directory implementation (lucene-core 2.4.1), written in Scala, to sit on top of Google's App Engine Datastore. In the process I found something odd, for which I'm hoping there is a relatively simple solution. When I instantiate a new IndexWriter with my Directory implementation (which uses Datastore-based IndexInput and IndexOutput classes), the checksum of the segments file (segments_1, because there is nothing in the index yet) differs depending on whether it is calculated by ChecksumIndexOutput or by ChecksumIndexInput. This causes a CorruptIndexException to be thrown at line 248 in SegmentInfos. The interesting thing is that the array of bytes written by my DatastoreIndexOutput is the same array of bytes read by DatastoreIndexInput. I've also noticed that the difference between the checksums is consistent: checksumThen (line 246 in SegmentInfos) is always one less than checksumNow (line 245 in SegmentInfos).
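To make the comparison concrete, here is a minimal sketch in plain Java using only java.util.zip.CRC32 and no Lucene classes. It assumes a simplified file layout (body bytes followed by an 8-byte big-endian checksum), and the names ChecksumSketch, writeWithChecksum and readChecksums are hypothetical, made up for illustration. It is not Lucene's actual code, only the shape of the check as I understand it: checksumNow is recomputed from the body as it is read, and checksumThen is the value stored at the end of the file.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical stand-in for the write/read checksum round trip.
// Assumed layout: [body bytes][8-byte big-endian CRC32 of the body].
final class ChecksumSketch {

    // Write side: feed every body byte to a CRC32, then append its value as a long.
    static byte[] writeWithChecksum(byte[] body) {
        CRC32 crc = new CRC32();
        for (byte b : body) {
            crc.update(b); // update(int) only uses the low eight bits
        }
        return ByteBuffer.allocate(body.length + 8)
                .put(body)
                .putLong(crc.getValue())
                .array();
    }

    // Read side: returns { checksumNow, checksumThen }.
    // checksumNow is recomputed from the body, checksumThen is the stored trailing long.
    static long[] readChecksums(byte[] file) {
        int bodyLength = file.length - 8;
        CRC32 crc = new CRC32();
        for (int i = 0; i < bodyLength; i++) {
            crc.update(file[i]);
        }
        long checksumNow = crc.getValue();
        long checksumThen = ByteBuffer.wrap(file, bodyLength, 8).getLong();
        return new long[] { checksumNow, checksumThen };
    }

    public static void main(String[] args) {
        byte[] body = { 1, 2, 3, 4, 5, 6, 7, 8 }; // arbitrary sample bytes
        long[] sums = readChecksums(writeWithChecksum(body));
        System.out.println("checksumNow  = " + sums[0]);
        System.out.println("checksumThen = " + sums[1]);
        System.out.println("match        = " + (sums[0] == sums[1]));
    }
}
```

In this simplified model, if the body bytes and the checksum computed over them agree on both sides but the comparison still fails, the trailing eight bytes holding the stored value are the remaining place where the two sides can differ.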
In an attempt to gain further information about this problem, I added CRC32 objects to my IndexInput and IndexOutput definitions in order to peek at their values while debugging. It seems that the IndexInput and IndexOutput classes I defined have the same checksum after reading and writing all the bytes. The source for my implementation to this point can be found at http://github.com/bryanjswift/quotidian/tree/search-checksums under src/main/scala/quotidian/search.
Any insight or assistance would be very much appreciated at this point.
Cheers,
Bryan