Exactly. I have been watching to see how the new filter interface
works out for 2.0. I am still not certain why it is so involved.
I still think
    interface Filter {
        boolean include(int doc);
        int nextInclude(int doc);
    }
should suffice.
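
For illustration, a minimal sketch of how an existing BitSet could sit behind that two-method interface. The message does not spell out the contract of nextInclude, so the sketch assumes it returns the smallest included document number at or after doc, or -1 when there is none. BitSetFilter is a hypothetical name, not a Lucene class.

```java
import java.util.BitSet;

// Assumed contract (not stated in the thread): nextInclude(doc) returns the
// smallest included document number >= doc, or -1 if there is none.
interface Filter {
    boolean include(int doc);
    int nextInclude(int doc);
}

// Hypothetical adapter exposing a java.util.BitSet through the interface.
class BitSetFilter implements Filter {
    private final BitSet bits;

    BitSetFilter(BitSet bits) {
        this.bits = bits;
    }

    public boolean include(int doc) {
        return bits.get(doc);
    }

    public int nextInclude(int doc) {
        // BitSet.nextSetBit returns -1 when no set bit exists at or after doc.
        return bits.nextSetBit(doc);
    }
}
```

A caller can then walk the matching documents with
for (int d = f.nextInclude(0); d != -1; d = f.nextInclude(d + 1)) without knowing how the filter stores them.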
On Jul 7, 2006, at 9:53 PM, Yonik Seeley wrote:
This might be even better in conjunction with moving away from BitSet
to some sort of interface like DocNrSkipper... that way you would
never have to combine the filters into a single BitSet.
-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene
search server
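
One way to read the "never have to combine the filters" point: with a skipping interface, two filters can be intersected on the fly instead of being merged into one BitSet. The sketch below is only an illustration. The thread does not give the DocNrSkipper signature, so it reuses the two-method Filter interface from the top of this message and assumes nextInclude(doc) returns the smallest included document number at or after doc, or -1. StepFilter and AndFilter are hypothetical names.

```java
// Assumed contract: nextInclude(doc) returns the smallest included
// document number >= doc (for doc >= 0), or -1 if there is none.
interface Filter {
    boolean include(int doc);
    int nextInclude(int doc);
}

// Hypothetical example filter: includes every multiple of step below maxDoc.
class StepFilter implements Filter {
    private final int step;
    private final int maxDoc;

    StepFilter(int step, int maxDoc) {
        this.step = step;
        this.maxDoc = maxDoc;
    }

    public boolean include(int doc) {
        return doc >= 0 && doc < maxDoc && doc % step == 0;
    }

    public int nextInclude(int doc) {
        int next = ((doc + step - 1) / step) * step; // round up to a multiple
        return next < maxDoc ? next : -1;
    }
}

// Intersection of two filters without materializing a combined bit set.
class AndFilter implements Filter {
    private final Filter a;
    private final Filter b;

    AndFilter(Filter a, Filter b) {
        this.a = a;
        this.b = b;
    }

    public boolean include(int doc) {
        return a.include(doc) && b.include(doc);
    }

    public int nextInclude(int doc) {
        int candidate = a.nextInclude(doc);
        while (candidate != -1) {
            int other = b.nextInclude(candidate);
            if (other == candidate) {
                return candidate;            // both filters agree
            }
            if (other == -1) {
                return -1;                   // b is exhausted
            }
            candidate = a.nextInclude(other); // skip a forward to b's position
        }
        return -1;
    }
}
```

Each filter only ever advances, so the cost is bounded by the number of skips rather than by the size of the index.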
On 7/7/06, robert engels <[EMAIL PROTECTED]> wrote:
I implemented it and it works great. I didn't worry about the
deletions since by the time a filter is used the deleted documents
are already removed by the query. The only problem that arose out of
this was for things like the ConstantScoreQuery (which uses a filter)
- I needed to modify this query to ignore deleted documents.
Now I have incremental cached filters - the query performance is
going through the roof.
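
A rough sketch of the per-segment caching idea, for readers following along. It is not the code described above and deliberately avoids the Lucene API: segments are represented by an arbitrary key type S, and the function that computes the bits for one segment is passed in. PerSegmentFilterCache and its members are hypothetical names, and deletions are ignored here.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical cache keeping one bit set per segment, keyed by a segment
// identifier of type S (a stand-in for a sub-reader).
class PerSegmentFilterCache<S> {
    private Map<S, BitSet> cache = new HashMap<>();
    private final Function<S, BitSet> compute;
    int computations = 0; // exposed only to show how often bits are rebuilt

    PerSegmentFilterCache(Function<S, BitSet> compute) {
        this.compute = compute;
    }

    // Returns one bit set per current segment, in order. Bits are computed
    // only for segments that were not present on the previous call.
    List<BitSet> bitsFor(List<S> currentSegments) {
        Map<S, BitSet> next = new HashMap<>();
        List<BitSet> result = new ArrayList<>();
        for (S segment : currentSegments) {
            BitSet bits = cache.get(segment);
            if (bits == null) {
                bits = compute.apply(segment);
                computations++;
            }
            next.put(segment, bits);
            result.add(bits);
        }
        cache = next; // entries for segments that went away are dropped
        return result;
    }
}
```

With segments "seg1" and "seg2" cached, a later call that adds "seg3" triggers one computation instead of three, which is where the saving for small incremental changes would come from.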
On Jul 7, 2006, at 2:47 PM, Chris Hostetter wrote:
>
> I'm no segments/MultiReader expert, but your idea sounds good to me ... it
> seems like it would certainly work in the "new segments" situation.
>
> One thing i don't see you mention is dealing with deletions ... i'm not
> sure if deleting documents causes the version number of an IndexReader to
> change or not (if it does your job is easy) but even if it doesn't I'm
> guessing you could say that if hasDeletions() returns true, you have to
> assume you need to invalidate your cached bits (worst case scenario you
> are invalidating the cache as often as it is now)
>
>
> : Date: Fri, 7 Jul 2006 00:32:54 -0500
> : From: robert engels <[EMAIL PROTECTED]>
> : Reply-To: java-dev@lucene.apache.org
> : To: Lucene-Dev <java-dev@lucene.apache.org>
> : Subject: MultiSegmentQueryFilter enhancement for interactive indexes?
> :
> : I thought of a possible enhancement - before I go down the road, I am
> : looking for some input from the community.
> :
> : Currently, the QueryFilter caches the bits based upon the IndexReader.
> :
> : The problem with this is small incremental changes to the index
> : invalidate the cache.
> :
> : What if instead the filter determined that the underlying IndexReader
> : was a MultiReader and then maintained a bitset for each reader,
> : combining them in bits() when requested. The filter could check if
> : any of the underlying readers were different (removed or added)
> : and then just create a new bitset for that reader. With the new
> : non-bitset filter implementations this could be even more memory
> : efficient since the bitsets would not need to be combined into a
> : single bitset.
> :
> : With the previous work on "reopen" so that segments are reused, this
> : would allow filters to be far more useful in a highly interactive
> : environment.
> :
> : What do you think?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]