We have something similar, with documents that can be tagged (and that have many other relations). But as far as search is concerned, our approach differs from yours in two ways:

- We actually index the relation's id (i.e. the tag's id) as part of the Lucene document, and update the document whenever the relation between the item and a tag changes. So a filter on some tag becomes a trivial termsFilter.addTerm('tagId', '12345').
- We use Lucene only as the basis for the results we send back to the user. I.e. we get results from Lucene and then do some more processing on them.

But that last distinction is actually because we started with an in-memory "database" application that did basically what Lucene already does, but just with more complicated objects and more complicated facet-extraction, more complicated filters, etc. So Lucene is only used when we need keyword-filtering and we help Lucene do that quickly by offering some Filters derived from the rest of the application's work. And yes, if we were to redesign the application, it could become different :P
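
To make the first point concrete, here is a minimal sketch in plain Java. It deliberately does not use Lucene itself, so it makes no claim about the exact Lucene API; `TagIndex`, `tag`, `untag` and `filterByTag` are hypothetical names invented for this illustration. The idea is only that the tag id lives in the index next to the document, so a change in the relation means an index update and filtering on a tag is a single term lookup.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for an indexed 'tagId' field: tag id -> ids of the documents carrying it. */
class TagIndex {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** A tag was attached to an item: update that item's entry in the index. */
    void tag(int docId, String tagId) {
        postings.computeIfAbsent(tagId, k -> new TreeSet<>()).add(docId);
    }

    /** A tag was removed from an item: update the index again. */
    void untag(int docId, String tagId) {
        Set<Integer> docs = postings.get(tagId);
        if (docs != null) {
            docs.remove(docId);
            if (docs.isEmpty()) {
                postings.remove(tagId);
            }
        }
    }

    /** Filtering on a tag is a single lookup of the term (tagId, value). */
    Set<Integer> filterByTag(String tagId) {
        Set<Integer> docs = postings.get(tagId);
        if (docs == null) {
            return Collections.emptySet();
        }
        // Hand out a read-only copy so callers cannot modify the index.
        return Collections.unmodifiableSet(new TreeSet<>(docs));
    }
}
```

The trade-off is that every tag change costs an index update, in exchange for the filter needing no state outside the index.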

Best regards,

Arjen

On 29-11-2012 6:57 Trejkaz wrote:
On Wed, Nov 28, 2012 at 6:28 PM, Robert Muir <rcm...@gmail.com> wrote:
My point is really that lucene (especially clear in 4.0) assumes
indexreaders are immutable points in time. I don't think it makes sense for
us to provide any e.g. filtercaching or similar otherwise, because this is
a key simplification to the design. If you depart from this, by scoring or
filtering from mutable stuff outside the inverted index, things are likely
going to get complicated.

Whereas it would be lovely to live in a land of rainbows and unicorns
where all the data you ever want to use is in the text index and all
filters can be written as a query, that simply isn't the case for us
and I very much doubt we're the only ones in this situation.

Sure, things are complicated. Anything except the most trivial forum
search application is complicated.

Well, the situation as it stands now is that when a filter is
invalidated, it happens across all stores which are currently open.
That means that results are at least correct, but after invalidating a
filter, a little more work than necessary is required to populate the
cache again. For certain filters (like word lists) this is necessary
anyway, since adding a word might invalidate any store. For others
like tags, I was hoping there would be some way to selectively
invalidate only certain readers. But it seems like that isn't the
case, so I will probably have to add a third level of caching to cache
these sorts of filter per-store instead of globally.
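
A rough sketch of what such a per-store cache level could look like, in plain Java. `PerStoreFilterCache`, the string store ids and filter keys, and the loader callback are all assumptions made for illustration, not part of Lucene or of the application described here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Caches filters per store, so one store can be invalidated without touching the others. */
class PerStoreFilterCache<F> {
    // storeId -> (filterKey -> cached filter)
    private final Map<String, Map<String, F>> byStore = new HashMap<>();

    /** Returns the cached filter for this store, building it with the loader on a miss. */
    F get(String storeId, String filterKey, Supplier<F> loader) {
        return byStore.computeIfAbsent(storeId, s -> new HashMap<>())
                      .computeIfAbsent(filterKey, k -> loader.get());
    }

    /** Selective invalidation, e.g. when a tag changes in one store only. */
    void invalidate(String storeId, String filterKey) {
        Map<String, F> filters = byStore.get(storeId);
        if (filters != null) {
            filters.remove(filterKey);
        }
    }

    /** Global invalidation, e.g. for word lists where any store may be affected. */
    void invalidateEverywhere(String filterKey) {
        for (Map<String, F> filters : byStore.values()) {
            filters.remove(filterKey);
        }
    }
}
```

This sketch is not thread-safe; a real version would need synchronization and would also have to drop a store's entries when that store is closed.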

TX

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org


