Re: Does anyone have tips on managing cached filters?

Arjen van der Meijden Thu, 29 Nov 2012 23:01:10 -0800

We have something similar with documents that can be tagged (and have
many other relations). But for the matter of search we have two
distinctions from your approach:

- We do actually index the relation's id (i.e. the tag's id) as part of
  the Lucene document and update the document if that relation between
  the item and a tag is changed. So a filter on some 'tag' becomes a
  trivial termsFilter.addTerm('tagId', '12345').
- We use Lucene only as a base of the results we're going to send back
  to the user. I.e. we get results from Lucene and then do some more
  processing on them.
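To illustrate the first point, here is a rough sketch in plain Java, not Lucene code. TagTermIndex and its methods are hypothetical names made up for this example; it only mimics the idea that tag ids are indexed per document, so a tag filter is a plain term lookup and a tag change means re-indexing that one document.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for an index that stores tag ids as terms of each document. */
class TagTermIndex {
    // tag id -> documents carrying it, comparable to a postings list for a "tagId" field
    private final Map<String, BitSet> postings = new HashMap<>();
    // document -> tag ids currently indexed for it, so an update can drop stale postings
    private final Map<Integer, Set<String>> tagsByDoc = new HashMap<>();

    /** Re-index one document after its tag relations changed. */
    void updateDocument(int docId, Set<String> tagIds) {
        Set<String> old = tagsByDoc.getOrDefault(docId, new HashSet<>());
        for (String tagId : old) {
            BitSet docs = postings.get(tagId);
            if (docs != null) {
                docs.clear(docId);
            }
        }
        for (String tagId : tagIds) {
            postings.computeIfAbsent(tagId, k -> new BitSet()).set(docId);
        }
        tagsByDoc.put(docId, new HashSet<>(tagIds));
    }

    /** The tag filter is simply the set of documents that contain the term. */
    BitSet filterByTag(String tagId) {
        BitSet docs = postings.get(tagId);
        return docs == null ? new BitSet() : (BitSet) docs.clone();
    }
}
```

Because the relation lives in the index itself, nothing outside the index has to be consulted or invalidated when the filter is evaluated.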

But that last distinction is actually because we started with an
in-memory "database" application that did basically what Lucene already
does, but just with more complicated objects and more complicated
facet extraction, more complicated filters, etc. So Lucene is only used
when we need keyword filtering, and we help Lucene do that quickly by
offering some Filters derived from the rest of the application's work.
And yes, if we were to redesign the application, it could become
different :P
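As a rough sketch of what "Filters derived from the rest of the application's work" could look like, again in plain Java with made-up names (ApplicationFilters, fromApplicationIds, restrict): the application's own selection is turned into a bit set and the keyword hits are intersected with it. It assumes the application's ids can be used directly as document ids, which is a simplification and will not hold for Lucene's internal doc ids without a mapping step.

```java
import java.util.BitSet;
import java.util.Collection;

/** Hypothetical helpers; not part of Lucene. */
class ApplicationFilters {
    /** Turn the ids the application already selected into a bit set (assumes id == doc id). */
    static BitSet fromApplicationIds(Collection<Integer> docIds) {
        BitSet allowed = new BitSet();
        for (int docId : docIds) {
            allowed.set(docId);
        }
        return allowed;
    }

    /** Keep only the keyword hits that the application-side filter allows. */
    static BitSet restrict(BitSet keywordHits, BitSet allowed) {
        BitSet result = (BitSet) keywordHits.clone(); // leave the caller's bits untouched
        result.and(allowed);
        return result;
    }
}
```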


Best regards,

Arjen

On 29-11-2012 6:57 Trejkaz wrote:

On Wed, Nov 28, 2012 at 6:28 PM, Robert Muir <rcm...@gmail.com> wrote:

My point is really that lucene (especially clear in 4.0) assumes
indexreaders are immutable points in time. I don't think it makes sense for
us to provide any e.g. filtercaching or similar otherwise, because this is
a key simplification to the design. If you depart from this, by scoring or
filtering from mutable stuff outside the inverted index, things are likely
going to get complicated.


Whereas it would be lovely to live in a land of rainbows and unicorns
where all the data you ever want to use is in the text index and all
filters can be written as a query, that simply isn't the case for us
and I very much doubt we're the only ones in this situation.

Sure, things are complicated. Anything except the most trivial forum
search application is complicated.

Well, the situation as it stands now is that when a filter is
invalidated, it happens across all stores which are currently open.
That means that results are at least correct, but after invalidating a
filter, a little more work than necessary is required to populate the
cache again. For certain filters (like word lists) this is necessary
anyway, since adding a word might invalidate any store. For others
like tags, I was hoping there would be some way to selectively
invalidate only certain readers. But it seems like that isn't the
case, so I will probably have to add a third level of caching to cache
these sorts of filter per-store instead of globally.
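A minimal sketch of what such a per-store level could look like, in plain Java rather than against the Lucene API. PerStoreFilterCache, the String filter keys and the type parameter S (whatever identifies an open store or reader in the application) are all assumptions made for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

/** Hypothetical two-level cache: store -> (filter key -> cached doc-id bits). */
class PerStoreFilterCache<S> {
    // Weak keys, so a store's entries can be collected once nothing else references it.
    private final Map<S, Map<String, BitSet>> cache = new WeakHashMap<>();

    /** Return the cached bits for this store and filter, computing them on a miss. */
    synchronized BitSet get(S store, String filterKey, Function<S, BitSet> compute) {
        Map<String, BitSet> perStore = cache.computeIfAbsent(store, s -> new HashMap<>());
        return perStore.computeIfAbsent(filterKey, k -> compute.apply(store));
    }

    /** Global invalidation, e.g. a word list changed and any store may be affected. */
    synchronized void invalidateEverywhere(String filterKey) {
        for (Map<String, BitSet> perStore : cache.values()) {
            perStore.remove(filterKey);
        }
    }

    /** Selective invalidation, e.g. a tag changed on an item that lives in one store. */
    synchronized void invalidate(S store, String filterKey) {
        Map<String, BitSet> perStore = cache.get(store);
        if (perStore != null) {
            perStore.remove(filterKey);
        }
    }
}
```

With this shape, a tag change repopulates the cache for the affected store only, while word-list style filters can still be dropped across all open stores.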

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


