Re: TermDocs.skipTo()

Christoph Goller Wed, 07 Apr 2004 03:05:09 -0700

Doug Cutting wrote:

Christoph Goller wrote:

Problem: In the TermInfosReader index (every 128th term), skipOffsets are not stored! According to the documentation, getIndexOffset returns the offset of the greatest index entry which is less than term. I believe this is not true: it may deliver the term itself! If we seek a term that is in the index, this term and its termInfo will not be read from the enumerator by scanEnum, and consequently no skipOffset will be found, even if present. This could lead to serious problems when skipTo is used, couldn't it?
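
To illustrate the concern, here is a minimal, simplified model of the lookup, not the actual Lucene source. The class name, the plain String terms, and the entriesReadByScan helper are assumptions made for the sketch. It shows that a binary search of this kind returns the index entry equal to the term on an exact hit, so the forward scan reads nothing from the enumerator and the TermInfo can only come from the index entry itself.

```java
// Simplified model of the index lookup; all names here are hypothetical.
final class IndexOffsetSketch {

    /** Returns the greatest i with indexTerms[i] <= term, or -1 if there is none. */
    static int getIndexOffset(String[] indexTerms, String term) {
        int lo = 0;
        int hi = indexTerms.length - 1;
        while (hi >= lo) {
            int mid = (lo + hi) >>> 1;
            int delta = term.compareTo(indexTerms[mid]);
            if (delta < 0) {
                hi = mid - 1;
            } else if (delta > 0) {
                lo = mid + 1;
            } else {
                return mid; // exact hit: the index entry IS the term
            }
        }
        return hi;
    }

    /**
     * Counts how many entries a forward scan reads from the full term list
     * after positioning at the index entry (every indexInterval-th term).
     */
    static int entriesReadByScan(String[] indexTerms, String[] allTerms,
                                 int indexInterval, String term) {
        int offset = Math.max(getIndexOffset(indexTerms, term), 0);
        int pos = offset * indexInterval;
        int reads = 0;
        while (pos < allTerms.length && allTerms[pos].compareTo(term) < 0) {
            pos++;
            reads++;
        }
        return reads;
    }
}
```

With an index interval of 4 over the terms a..h, the index holds "a" and "e". Seeking "e" lands exactly on an index entry and the scan reads zero entries, whereas seeking "f" reads one.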

Yes, this does look like a problem.

Possible Solution: Store skipOffset in *.tii too.

I think that's a good solution. We should change TermInfosWriter.FORMAT from -1 to -2 and then use that to keep SegmentTermEnum.next() back-compatible, since folks may have created indexes with 1.4RC2. The simplest way to do this would be to disable skipTo() when TermInfosWriter.FORMAT is -1, by setting skipInterval to Integer.MAX_VALUE, as is done for 1.3 indexes.
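
The back-compatibility idea above could be sketched roughly as follows. This is a hedged illustration and not the actual SegmentTermEnum code: the class and method names are hypothetical, and the assumptions that a non-negative leading int marks a 1.3-style file and that a term carries a skipOffset only when docFreq >= skipInterval are stated as such.

```java
// Hypothetical sketch of format-gated skipping; not the real Lucene classes.
final class SkipIntervalCompat {

    // Format written by 1.4RC2: skipOffsets are missing from the *.tii entries.
    static final int FORMAT_RC2 = -1;
    // Proposed new format: skipOffsets are stored in *.tii too.
    static final int FORMAT_TII_SKIP = -2;

    /** Chooses the skip interval for a segment from the format found in its file header. */
    static int effectiveSkipInterval(int formatInFile, int storedSkipInterval) {
        if (formatInFile >= 0) {
            // Assumption: no negative format marker means a 1.3-style file.
            return Integer.MAX_VALUE;
        }
        if (formatInFile == FORMAT_RC2) {
            // 1.4RC2 index: skip data in the index entries is unreliable, so disable skipping.
            return Integer.MAX_VALUE;
        }
        return storedSkipInterval;
    }

    /** Assumption: a term has a skipOffset only if it occurs in at least skipInterval docs. */
    static boolean hasSkipData(int docFreq, int skipInterval) {
        return docFreq >= skipInterval;
    }
}
```

With skipInterval set to Integer.MAX_VALUE, no realistic docFreq reaches the threshold, so a reader built this way would never try to read or use a skipOffset for old segments.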

Shall I do this, or would you like to?

I would prefer to leave this task to you :-)

However, I am currently debugging and stepping through a problem Daniel found with 1.4rc2. Maybe it's caused by a skipTo() bug; I am not sure yet. Maybe it's a bug in ConjunctionScorer. If I cannot solve the problem, I will post it to the mailing list tonight.

What about the following agreement:
I try to restructure the IndexReader stuff as we already agreed,
you try to solve the skipTo() problem,
and then we review each other's work.

Christoph
