Re: RE : TR : Possible Bug with MultiSearcher?

Doug Cutting Thu, 19 Sep 2002 14:38:24 -0700

Rasik Pandey wrote:
> Developers,
> Attached is the diff for MultiSearcher which seems to correct these
> bugs. I have not yet found any problems caused by these changes in
> testing........but we will keep you informed!

[ ... ]

diff -w -r1.4 MultiSearcher.java
96,98c100,105
<   public final Document doc(int n) throws IOException {
<     int i = searcherIndex(n);                   // find searcher index
<     return searchers[i].doc(n - starts[i]);
---
 >   public Document doc(int n) throws IOException {
 >     int i = searcherIndex(n);
 >     if (i < 0 || i > searchables.length)
 >         return null;
 >     else
 >       return searchables[i].doc(n - starts[i]);

I don't agree with this change.  If you pass in a bogus document number 
then an exception should be thrown here.  ArrayIndexOutOfBounds is not 
the friendliest of exceptions, but it's better than returning null.

> 113c120
> <       else if (n > midValue)
> ---
> 
>>      else if (n >= midValue) //made this a ">=", to ensure that a valid index is 
>found

This is the standard binary search algorithm, and should not be changed 
lightly (according to Knuth, no less).  The problem is when there are 
multiple entries with the same value then this algorithm doesn't always 
return the index of the largest one, which is required in this case.  So 
I think the fix is to scan for the largest value:

--- MultiSearcher.java.~1.5.~   2002-07-17 10:28:56.000000000 -0700
+++ MultiSearcher.java  2002-09-19 15:02:02.000000000 -0700
@@ -116,9 +116,13 @@
        hi = mid - 1;
        else if (n > midValue)
        lo = mid + 1;
-      else
+      else {
+        while (mid+1 < searchables.length && starts[mid+1] == midValue) {
+          mid++;                                  // return last match
+        }
        return mid;
      }
+    }
      return hi;
    }

That passes the test suite just checked in too!

Any objections to this?

Doug


--
To unsubscribe, e-mail:   <mailto:[EMAIL PROTECTED]>
For additional commands, e-mail: <mailto:[EMAIL PROTECTED]>
Re: RE : TR : Possible Bug with MultiSearcher?

Reply via email to