Christoph Goller wrote:
Problem: In the TermInfosReader index (every 128th term), skipOffsets are not
stored! According to the documentation, getIndexOffset returns the offset of the
greatest index entry which is less than term. I believe this is not true: it may
deliver the term itself! If we seek for a term that is in the index, this
term and its termInfo will not be read from the enumerator by scanEnum and
consequently no skipOffset will be found, even if present. This could lead
to serious problems when skipTo is used, couldn't it?

Yes, this does look like a problem.
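
To make the case concrete, here is a minimal sketch of a binary search over the index terms. It is not the actual TermInfosReader code; the class name, the INDEX_TERMS array and the sample terms are made up for illustration. It only shows that a search of this shape returns the position of the term itself on an exact match, rather than an entry strictly less than it.

```java
public class IndexOffsetSketch {
    // Hypothetical stand-in for the in-memory term index (every 128th term).
    static final String[] INDEX_TERMS = {"apple", "mango", "zebra"};

    // Returns the position of the greatest index entry <= term, or -1 if
    // every index entry is greater than term.
    static int indexOffset(String term) {
        int lo = 0;
        int hi = INDEX_TERMS.length - 1;
        while (hi >= lo) {
            int mid = (lo + hi) >>> 1;
            int delta = term.compareTo(INDEX_TERMS[mid]);
            if (delta < 0) {
                hi = mid - 1;
            } else if (delta > 0) {
                lo = mid + 1;
            } else {
                return mid; // exact hit: the index entry is the term itself
            }
        }
        return hi;
    }

    public static void main(String[] args) {
        System.out.println(indexOffset("mango")); // term is in the index
        System.out.println(indexOffset("melon")); // term falls after "mango"
    }
}
```

Both lookups land on the same index entry, but only in the second case does a later scan have to read past that entry to reach the term.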


Possible Solution: Store skipOffset in *.tii too.

I think that's a good solution. We should change TermInfosWriter.FORMAT from -1 to -2 and then use that to keep SegmentTermEnum.next() back-compatible, since folks may have created indexes with 1.4RC2. The simplest way to do this would be to disable skipTo() when TermInfosWriter.FORMAT is -1, by setting skipInterval to Integer.MAX_VALUE, as is done for 1.3 indexes.
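
Roughly, the version check could look like the sketch below. This is only an illustration, not the real reader code: the class and method names are hypothetical, and it assumes that a 1.3 file starts with a non-negative int (no format marker) while later files start with a negative format number.

```java
public class SkipIntervalSketch {
    // Proposed new format number; -1 is the format written by 1.4RC2.
    static final int FORMAT = -2;

    // Hypothetical helper: decides which skipInterval a reader should use,
    // given the first int of the file and the skipInterval stored in it.
    static int effectiveSkipInterval(int firstInt, int storedSkipInterval) {
        if (firstInt >= 0) {
            // Assumed 1.3 layout: no format marker, so no skip data at all.
            return Integer.MAX_VALUE;
        }
        if (firstInt < FORMAT) {
            throw new IllegalArgumentException("Unknown format version: " + firstInt);
        }
        if (firstInt == -1) {
            // 1.4RC2 index: skipOffsets are missing from the term index,
            // so effectively disable skipTo().
            return Integer.MAX_VALUE;
        }
        return storedSkipInterval; // format -2: skip data can be trusted
    }

    public static void main(String[] args) {
        System.out.println(effectiveSkipInterval(-1, 16));
        System.out.println(effectiveSkipInterval(-2, 16));
    }
}
```

With skipInterval at Integer.MAX_VALUE no posting list is ever long enough to trigger a skip, so old indexes stay readable without special cases elsewhere.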


Shall I do this, or would you like to?

Thanks so much for finding things like this!

Doug
