Re: Lucene Queries Over User-Editable Dynamic Categories of Documents

mark harwood Thu, 25 Oct 2007 03:00:38 -0700

There are 2 considerations when caching filter results:

1) What was the criteria used to produce the results?
2) What version of the index were these results taken from?


CachingWrapperFilter takes care of 2) by using WeakHashMap keyed on IndexReader.

The filter you pass to CachingWrapperFilter must play it's part in taking care 
of 1) by implementing hashcode/equals.

You can then maintain an LRU hashmap of CachingWrapperFilters to keep only the 
most popular filters.

Incidentally, contrib's XMLQueryParser handles all this for you with a simple 
"CachedFilter" tag:

<FilteredQuery>
    <Query>
        <UserQuery>"Brittany Spears"</UserQuery>
    </Query>    
    <Filter>
        <CachedFilter>
            <RangeFilter fieldName="date" lowerTerm="19970409" 
upperTerm="19970412"/>
        </CachedFilter>
    </Filter>    
</FilteredQuery>


Cheers
Mark

----- Original Message ----
From: lucene user <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Thursday, 25 October, 2007 10:36:14 AM
Subject: Re: Lucene Queries Over User-Editable Dynamic Categories of Documents

What do you means by 'Most caches are held in WeakHashMap...' is this
caching provided by CachingWrappingFilter or do we have to implement it
ourselves? I assume the former.

We will share results of our testing as soon as we have any - not sure
 how
generalizable they will be.

You have been super helpful! Very grateful! Thanks!

On 10/24/07, markharw00d <[EMAIL PROTECTED]> wrote:
>
> lucene user wrote:
> > Thanks for all your help!
> >
> > We are using Lucene 2.1.0 and TermsFilter seems to be new in Lucene
> 2.2.0.
> > I have not been able to find SortedVIntList in the javadocs at all.
> >
>
> No, SortedVIntList is in the patch I provided a link to earlier.
>
>
> > Because both SortedVIntList and a regular BitSet are based on
 Lucene
> > Document Numbers, which are not permanent, It seems we will need to
> > generate these objects fresh at least once per session. Any
 comments,
> > about that? Do I have that correct?
> >
> Yes. Most caches tend to be held in WeakHashMap keyed on IndexReader
 so
> that when a new reader takes over old caches are automatically
 garbage
> collected.
>
>
> > Our application includes the following filter implementation that
 we use
> for
> > a
> > slightly different end user category problem. We could easily use
 it for
> our
> > current problem as well.
> >
> > Is TermsFilter sufficiently better (faster, more compact, more
 correct,
> > etc.) to make upgrading
> > very important?
> >
> TermsFilter is in "contrib" and is stand-alone so should work with
 most
> Lucene versions.
> Your implementation looks to scan the whole termEnum whereas
 TermsFilter
> looks up only the selected terms using reader.termDocs(term).
> Benchmarking will tell you which is faster. I'd be interested to know
> the results.
>
> Cheers
> Mark
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>





      ___________________________________________________________ 
Want ideas for reducing your carbon footprint? Visit Yahoo! For Good  
http://uk.promotions.yahoo.com/forgood/environment.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Lucene Queries Over User-Editable Dynamic Categories of Documents

Reply via email to