Re: TermDocs.skipTo()

Doug Cutting Thu, 08 Apr 2004 11:17:37 -0700

Christoph Goller wrote:

Daniel found a bug today and therefore I reviewed skipTo once again.

Thanks!

Here are some further things to consider:

*) MultiTermDocs.skipTo could easily be optimized too, couldn't it?

Yes, I think so. I think I forgot to look at that.

*) SegmentTermDocs: skipStream never closed

You're right, it should be.

*) SegmentTermPositions: seek(TermInfo): probably should always make proxCount = 0;

Right again.

I can't think why I ever did it that way... It was done as a fix for:

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6292

http://cvs.apache.org/viewcvs.cgi/jakarta-lucene/src/java/org/apache/lucene/index/SegmentTermPositions.java?r1=1.2&r2=1.3

*) I think due to your last changes SegmentTermDocs makes one skip less than is required? However, I haven't tested this.

while (target > skipDoc && skipCount < numSkips) {
  lastSkipDoc = skipDoc;
  lastFreqPointer = freqPointer;
  lastProxPointer = proxPointer;

  if (skipDoc != 0 && skipDoc >= doc)
    numSkipped += skipInterval;

  skipDoc += skipStream.readVInt();
  freqPointer += skipStream.readVInt();
  proxPointer += skipStream.readVInt();

  skipCount++;
}

// if we found something to skip, then skip it
if (lastFreqPointer > freqStream.getFilePointer()) {
  freqStream.seek(lastFreqPointer);
  skipProx(lastProxPointer);

  doc = lastSkipDoc;
  count += numSkipped;
}

Consider the case where the while loop exits because skipCount == numSkips. Then doc becomes lastSkipDoc, not skipDoc!

That sounds reasonable. I'm sure having trouble getting this method right! So do you think this loop should be changed to something like:

  while (target > skipDoc) {
    lastSkipDoc = skipDoc;
    ...

    if (skipCount > numSkips)
      break;

    skipDoc += skipStream.readVInt();
    ...
  }

That looks better to me... What do you think?
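
To make the off-by-one concrete, here is a rough sketch that models the skip list as a plain array of doc-number deltas and compares the two loop shapes. It is only an illustration under stated assumptions: the class and method names (SkipLoopSketch, originalLoop, revisedLoop) are hypothetical, the freq/prox pointers and numSkipped bookkeeping are left out, and the exhaustion check uses >= (rather than the > in the snippet above) so that the array is never read past its end. It is not the code that was or should be committed.

```java
// Simplified model of the skip loop; names are hypothetical, not Lucene's.
// skipDeltas[i] is the doc-number delta stored in the i-th skip entry.
final class SkipLoopSketch {

    // Loop shape quoted in the thread: exits as soon as the skip list is
    // exhausted, even if the last entry read is still below the target.
    static int originalLoop(int[] skipDeltas, int target) {
        int numSkips = skipDeltas.length;
        int skipDoc = 0, lastSkipDoc = 0, skipCount = 0;
        while (target > skipDoc && skipCount < numSkips) {
            lastSkipDoc = skipDoc;
            skipDoc += skipDeltas[skipCount]; // stands in for skipStream.readVInt()
            skipCount++;
        }
        return lastSkipDoc; // the value that "doc" would be set to
    }

    // Restructured loop: remember skipDoc first, then check for exhaustion.
    // Assumption: ">=" is used here so the array is not read out of bounds.
    static int revisedLoop(int[] skipDeltas, int target) {
        int numSkips = skipDeltas.length;
        int skipDoc = 0, lastSkipDoc = 0, skipCount = 0;
        while (target > skipDoc) {
            lastSkipDoc = skipDoc;
            if (skipCount >= numSkips)
                break;
            skipDoc += skipDeltas[skipCount];
            skipCount++;
        }
        return lastSkipDoc;
    }

    public static void main(String[] args) {
        int[] deltas = {10, 10, 10}; // skip entries at docs 10, 20 and 30
        System.out.println(originalLoop(deltas, 35)); // prints 20
        System.out.println(revisedLoop(deltas, 35));  // prints 30
    }
}
```

With a target beyond the last skip entry, the first loop stops at the second-to-last entry (20), while the restructured loop uses the last one (30). For targets that fall between skip entries, for example 25, both variants return 20.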

*) PhraseScorer.skipTo jumps one doc too far because of the call to sort(), which calls next() for each PhrasePosition. Here is Daniel's test that demonstrates this: [ ... ]

Instead of 1 hit, 0 hits are found with 1.4rc2, while 1.3 finds the hit. I committed the necessary change to PhraseScorer already and it fixes the problem.

Thanks! If you have a chance, please add this as a unit test too.

Unfortunately, I haven't found the time to restructure the IndexReaders so far.

Thanks again for all your work. You're helping to make Lucene much more reliable!

Cheers,

Doug
