Right, so after looking at what was happening in SegmentInfos again I noticed I was saving to the Datastore on IndexOutput.flush but not on close. Persisting the file on close solved this particular problem. Sorry about that.
On Sat, Aug 15, 2009 at 1:03 AM, Bryan Swift <bryan.j.sw...@gmail.com>wrote: > I'm attempting to create a Directory implementation (lucene-core 2.4.1) to > sit on top of Google's App Engine Datastore (written in Scala). In the > process of doing this I found something odd for which I'm hoping there is a > relatively simple solution. > When instantiating a new IndexWriter with my Directory implementation > (which uses Datastore based IndexInput and IndexOutput classes) the checksum > of the segments file (segments_1 because there is nothing in the index yet) > varies when calculated by ChecksumIndexOutput vs ChecksumIndexInput. Of > course, this causes a CorruptIndexException to be thrown at line 248. The > interesting thing is the array of bytes being written by my > DatastoreIndexOutput is the same array of bytes being ready by > DatastoreIndexInput. I've also noticed the difference between the checksums > is consistently that checksumThen (line 246 in SegmentInfos) is one less > than checksumNow (line 245 in SegmentInfos). > > In an attempt to gain further information about this problem I added CRC32 > objects to my IndexInput and IndexOutput definitions to in order to peak in > on their values while debugging and it seems the IndexInput and IndexOutput > classes I defined have the same checksum after reading and writing all the > bits. > > The source for my implementation to this point can be found at > http://github.com/bryanjswift/quotidian/tree/search-checksums under > src/main/scala/quotidian/search > > Any insight or assistance would be very much appreciated at this point. > > Cheers. > Bryan >