On Thursday 26 January 2006 20:08, Chris Hostetter wrote:
> 
> The subject of revamping the Filter API to support more compact filter
> representations has come up in the past ... At least one patch comes to
> mind that helps with the issue...
> 
>    https://issues.apache.org/jira/browse/LUCENE-328
> 
> ...i'm not intimately familiar with that code, but if i recall correctly
> from the last time i read it, it doesn't propose any actual API changes
> just some utilities to reduce memory usage.
> 
> Reading your post has me thinking about this whole issue again,
> particularly the subject of Filters that are straightforward enough they
> could be implemented as simple iterators with very little state and what
> API changes could be made to support the interface you describe and still
> be backwards compatible.
> 
> One thing that comes to mind (that i don't remember suggesting before, but
> perhaps someone else has suggested it before) is that since Filter is an
> abstract class which people are currently required to subclass, we could
> follow a migration path something like this...
> 
>   1) add a SearchFilter interface like the one you describe to the core
>      code base
>   2) add the following method declaration to the Filter class...
>         public SearchFilter getSearchFilter(IndexReader) throws IOException
>      ...implement this method by calling bits, and returning an instance
>      of a thin inner class that wraps the BitSet

This is done in the FilteredQuery in the issue referred to above.
The wrapper might take a small performance hit.
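
A minimal sketch of such a wrapper, using only java.util.BitSet. The
SearchFilter name and getNextFilteredDoc() follow the proposal quoted
further down; BitSetSearchFilter is a hypothetical name and this is not the
code from the issue. I assume -1 as the end marker here, because 0 is a
valid document number.

```java
import java.util.BitSet;

/** Hypothetical interface, following the proposal in this thread. */
interface SearchFilter {
    /** Next passing doc id in increasing order, or -1 when exhausted (assumption). */
    int getNextFilteredDoc();
}

/** Thin wrapper that iterates over the set bits of an existing BitSet. */
final class BitSetSearchFilter implements SearchFilter {
    private final BitSet bits;
    private int next = 0; // index where the next search starts; -1 once exhausted

    BitSetSearchFilter(BitSet bits) {
        this.bits = bits;
    }

    public int getNextFilteredDoc() {
        if (next < 0) {
            return -1; // already exhausted
        }
        int doc = bits.nextSetBit(next);
        next = (doc < 0) ? -1 : doc + 1;
        return doc;
    }
}
```

The only per-call cost on top of the BitSet itself is one extra method call
and a field update, which is where any performance hit would come from.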

>   3) indicate that Filter.bits() is deprecated.
>   4) change all existing calls to Filter.bits() in the core lucene code
>      base to call Filter.getSearchFilter and do whatever iterating is
>      necessary.
>   5) gradually reimplement all of the concrete instances of Filter in
>      the core lucene code base so they override the getSearchFilter method
>      with something that returns a more "iterator" style SearchFilter,
>      and implement their bits() method to use the SearchFilter to build up
>      the bit set if clients call it directly.
>   6) wait a suitable amount of time.
>   7) remove Filter.bits() and all of the concrete implementations from the
>      lucene core.

Sounds feasible to me, provided the performance hit is small enough.
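
For step 5, the backwards-compatible bits() could be built by draining the
iterator-style filter. A rough sketch under the same assumptions as the
proposal: SearchFilter is the hypothetical interface from this thread
(repeated here so the sketch stands alone), -1 is an assumed end marker,
and FilterBits and toBitSet are names made up for illustration.

```java
import java.util.BitSet;

/** Hypothetical interface from this thread, repeated so the sketch stands alone. */
interface SearchFilter {
    /** Next passing doc id in increasing order, or -1 when exhausted (assumption). */
    int getNextFilteredDoc();
}

final class FilterBits {
    /** Drains an iterator-style filter into a BitSet, as a legacy bits() could. */
    static BitSet toBitSet(SearchFilter filter, int maxDoc) {
        BitSet bits = new BitSet(maxDoc); // maxDoc is only an initial size hint
        for (int doc = filter.getNextFilteredDoc();
             doc >= 0;
             doc = filter.getNextFilteredDoc()) {
            bits.set(doc);
        }
        return bits;
    }
}
```

This visits every passing document once, so callers that still use bits()
pay for a full pass plus the BitSet allocation.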

> ...i think that would be a fairly straightforward and practical way to
> execute such a change.  The big question in my mind is what the
> "SearchFilter" interface should look like.  what you propose is along the
> usage lines of "iterate over your ScoreDocs, and for each one test it
> against the filter" ... but i'm not convinced that it wouldn't make more
> sense to say "ask the filter what the next viable doc is, now score it",
> ala...
> 
>       public interface SearchFilter {
>           /** returns doc ids that pass the filter, in increasing order.
>            * returns 0 once there are no more docs.
>            */
>           int getNextFilteredDoc();
>       }
> 
> 
> thoughts?

For search speed one needs to know the next filtered document, much
like BitSet.nextSetBit().  See DocNrSkipper in the issue referred to above.
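
To illustrate why skipping matters, here is a small sketch that leapfrogs
between a query and a filter, each asked for its next document at or after
a target. DocSkipper and nextDocAtOrAfter are hypothetical names chosen for
this example and are not the DocNrSkipper signature from the issue; -1 as
the "no more docs" value is borrowed from BitSet.nextSetBit().

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical skipping interface; not the actual DocNrSkipper signature. */
interface DocSkipper {
    /** Smallest passing doc >= target, or -1 if there is none (assumption). */
    int nextDocAtOrAfter(int target);
}

final class FilteredIteration {
    /** Adapts a BitSet by delegating to BitSet.nextSetBit(int). */
    static DocSkipper fromBitSet(final BitSet bits) {
        return new DocSkipper() {
            public int nextDocAtOrAfter(int target) {
                return bits.nextSetBit(target);
            }
        };
    }

    /** Leapfrogs between two skippers and collects the docs both accept. */
    static List<Integer> intersect(DocSkipper query, DocSkipper filter) {
        List<Integer> hits = new ArrayList<Integer>();
        int doc = query.nextDocAtOrAfter(0);
        while (doc >= 0) {
            int allowed = filter.nextDocAtOrAfter(doc);
            if (allowed < 0) {
                break; // the filter has no more docs, so nothing else can match
            }
            if (allowed == doc) {
                hits.add(doc);
                doc = query.nextDocAtOrAfter(doc + 1);
            } else {
                doc = query.nextDocAtOrAfter(allowed); // skip the query ahead
            }
        }
        return hits;
    }
}
```

With a sparse filter, most query documents are never visited at all, which
a plain "test each doc against the filter" loop cannot avoid.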

Regards,
Paul Elschot
