Martin Sevigny wrote:
Looking at (and testing) range queries (in particular date interval
queries), I think there is a problem with how Lucene handles filters
when a MultiSearcher is used.

I think this works correctly. If you are convinced there is a bug, please submit a test case.


Basically, a Filter is created from a Reader, so from only one searchable
index. It creates a BitSet for the documents in this index that match the
filter criteria. Fine.

Then, the Searchable interface has a search(Query query, Filter filter,
HitCollector results) method. So if a call is made to this method
through a MultiSearcher object, then we have code (in the search method)
such as:

for (int i = 0; i < searchables.length; i++) {

  final int start = starts[i];

  searchables[i].search(query, filter, new HitCollector() {
      public void collect(int doc, float score) {
        results.collect(doc + start, score);
      }
    });

}

Here, clearly, the search is performed in each Searchable with the same
Filter. This filter has been created with (one must assume) one of the
Searchables, so the BitSet is probably wrong for the others, resulting
in wrong results.

Is that right?

No. The Filter.bits(IndexReader) method has not yet been called at this point. It is called by the Searchable.search() implementation that is invoked here, once for each contained searcher. So, with a MultiSearcher, a different bit vector is created for each of the searchers contained.
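
To make the order of calls concrete, here is a minimal sketch in plain Java. It does not use Lucene's classes: ToyIndex, ToyFilter, ToyDateFilter and ToyMultiSearch are hypothetical stand-ins invented for this illustration, and the "dates" are just ints. The only point it tries to show is that one filter object is passed around, while its bits are computed separately against each sub-index, after which the document numbers are shifted by that index's start offset.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for an IndexReader: one int "date" per document.
final class ToyIndex {
    final int[] dates;
    ToyIndex(int... dates) { this.dates = dates; }
    int maxDoc() { return dates.length; }
}

// Hypothetical stand-in for a Filter: bits are computed on demand, per reader.
interface ToyFilter {
    BitSet bits(ToyIndex reader);
}

// Accepts documents whose date lies in the inclusive range [from, to].
final class ToyDateFilter implements ToyFilter {
    private final int from;
    private final int to;
    ToyDateFilter(int from, int to) { this.from = from; this.to = to; }

    public BitSet bits(ToyIndex reader) {
        BitSet bits = new BitSet(reader.maxDoc());
        for (int doc = 0; doc < reader.maxDoc(); doc++) {
            if (reader.dates[doc] >= from && reader.dates[doc] <= to) {
                bits.set(doc);
            }
        }
        return bits;
    }
}

// Assumed shape of a multi-index search: same filter object, fresh bits per index.
final class ToyMultiSearch {
    static List<Integer> search(ToyIndex[] indexes, ToyFilter filter) {
        List<Integer> hits = new ArrayList<>();
        int start = 0; // offset of the current sub-index in the combined numbering
        for (ToyIndex index : indexes) {
            BitSet bits = filter.bits(index); // evaluated here, against this index only
            for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
                hits.add(doc + start);
            }
            start += index.maxDoc();
        }
        return hits;
    }
}
```

With two toy indexes holding dates (20020101, 20030615) and (20030101, 20010505, 20031231), a ToyDateFilter for 20030101 to 20031231 selects local document 1 in the first index and local documents 0 and 2 in the second, which become 1, 2 and 4 in the combined numbering.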


If you are concerned about the cost of re-computing the bit vector for each query and index, please look at the QueryFilter class. It caches filters for indexes. Even if it does not provide exactly what you need, you could model your own filter implementation after this.
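
If you do write your own, one possible shape is a wrapper that remembers the bit vector it computed for each index. The sketch below is not QueryFilter's code and does not use Lucene's classes: SketchReader, SketchFilter and CachingSketchFilter are hypothetical names, and the use of a WeakHashMap keyed on the reader is an assumption made for this illustration, so that an entry can go away once its reader is no longer referenced.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for an IndexReader (identity-based equals/hashCode).
final class SketchReader {
    final int[] dates;
    SketchReader(int... dates) { this.dates = dates; }
}

// Hypothetical stand-in for a Filter.
interface SketchFilter {
    BitSet bits(SketchReader reader);
}

// Wraps another filter and computes its bits at most once per cached reader.
final class CachingSketchFilter implements SketchFilter {
    private final SketchFilter delegate;
    // Weak keys: a cache entry does not keep a discarded reader alive.
    private final Map<SketchReader, BitSet> cache = new WeakHashMap<>();
    int computations = 0; // exposed only so the sketch can show the cache working

    CachingSketchFilter(SketchFilter delegate) { this.delegate = delegate; }

    public synchronized BitSet bits(SketchReader reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            cached = delegate.bits(reader); // the expensive step, done once per reader
            cache.put(reader, cached);
            computations++;
        }
        return cached; // shared instance: callers must treat it as read-only
    }
}
```

Because the cache is keyed on the reader, each index searched through a multi-index search still gets its own bit vector; only repeated queries against the same reader reuse it.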

Doug

