Re: OR-FilterQuery

Em Tue, 14 Feb 2012 11:13:59 -0800

Hi Mikhail,

thanks for chipping in with some brainstorming code!
The linked thread is almost a year old. Back then I was working with
Solr in my free time to see where it fails to behave or perform as I
expect.

I found that if you have many different access patterns for a filter
query, you end up with either a big cache to keep things fast or with
lower performance (the impact depends on use case and circumstances).

Scenario:
You have a permission field (named foo or permission in the examples
below) and the client is able to filter by one to three permission
values.
That is:
fq=foo:user
fq=foo:moderator
fq=foo:manager

If you cannot control or guarantee the order of the values in the fq,
you can end up with many distinct filter queries that all return the
same result.

Example:
fq=permission:user OR permission:moderator OR permission:manager
fq=permission:user OR permission:manager OR permission:moderator
fq=permission:moderator OR permission:user OR permission:manager
...
They all return the same documents but are cached separately, which
wastes a lot of memory.
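
One way to avoid the duplicates on the client side is to bring the
clauses into a fixed order before sending the fq, so that every ordering
maps to the same cache key. The following is only a minimal sketch: it
assumes a flat OR of simple field:value clauses, and the class and
method names (FqNormalizer, canonicalOr) are made up for illustration,
not part of Solr.

```java
import java.util.TreeSet;

// Hypothetical helper, not part of Solr: builds a canonical fq string
// from a flat OR of simple clauses such as "permission:user".
final class FqNormalizer {

    // Splits on the literal " OR " separator, trims, de-duplicates and
    // sorts the clauses, then joins them again. Assumes no nesting, no
    // quoted phrases containing " OR ", and no other operators.
    static String canonicalOr(String fq) {
        TreeSet<String> clauses = new TreeSet<>();
        for (String clause : fq.split(" OR ")) {
            String trimmed = clause.trim();
            if (!trimmed.isEmpty()) {
                clauses.add(trimmed);
            }
        }
        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        String a = canonicalOr(
            "permission:user OR permission:moderator OR permission:manager");
        String b = canonicalOr(
            "permission:moderator OR permission:user OR permission:manager");
        System.out.println(a);
        System.out.println(a.equals(b)); // both orderings give the same key
    }
}
```

For three values there are 3 + 6 + 6 = 15 possible ordered fq strings
(one, two or three values), but only 7 distinct result sets, so
normalizing the order caps the number of cache entries at 7.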

Furthermore, if your access pattern leads to many different fq's over a
small set of distinct values, it may make more sense, in terms of
memory consumption, to cache each single-value filter query on its own
(this may cost a little performance).

That being said, if you cache one filter each for foo:user,
foo:moderator and foo:manager, you can combine those filters with AND,
OR, NOT or whatever without recomputing every filter over and over
again, which is what happens if your filter cache is not large enough.
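
A rough sketch of that idea, using plain java.util.BitSet as a stand-in
for whatever Solr stores in its filter cache. Everything here
(PerClauseFilterCache, anyOf, docs, the toy index) is hypothetical and
only illustrates caching one bitset per clause and OR-ing them on
demand.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration, not Solr code: one cached bitset per clause.
final class PerClauseFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();
    private final Function<String, BitSet> compute; // stands in for running the query
    int computations = 0; // counts cache misses, for demonstration only

    PerClauseFilterCache(Function<String, BitSet> compute) {
        this.compute = compute;
    }

    // Returns the cached bitset for one clause, computing it on first use.
    BitSet filterFor(String clause) {
        return cache.computeIfAbsent(clause, c -> {
            computations++;
            return compute.apply(c);
        });
    }

    // ORs the per-clause bitsets into a fresh bitset; cached entries stay untouched.
    BitSet anyOf(String... clauses) {
        BitSet result = new BitSet();
        for (String clause : clauses) {
            result.or(filterFor(clause));
        }
        return result;
    }

    // Builds a bitset with the given document ids set.
    static BitSet docs(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, BitSet> index = new HashMap<>(); // toy "index": clause -> doc ids
        index.put("foo:user", docs(0, 1));
        index.put("foo:moderator", docs(1, 2));
        index.put("foo:manager", docs(5));
        PerClauseFilterCache cache = new PerClauseFilterCache(index::get);
        System.out.println(cache.anyOf("foo:user", "foo:moderator"));
        System.out.println(cache.anyOf("foo:moderator", "foo:manager", "foo:user"));
        System.out.println("clause computations: " + cache.computations);
    }
}
```

Whatever the order or combination of clauses, each clause is computed
at most once; the price is one extra bitwise OR per clause and request.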

However, I never compared the speed of a single cached filter query
like
foo:bar OR foo:baz
with a combination of two cached filter queries like
foo:bar
foo:baz
combined by a logical OR.

That's the background.
Unfortunately, I didn't have the time to implement this in the past.

Back to your post:
Looks like a cool idea and is almost what I had in mind!

I would come up with a simpler syntax so that each fq clause can be
"parsed" on its own and its CachingWrapperFilter cached for reuse.

> it will use per-segment bitsets, in contrast to Solr's fq, which caches
> for the top-level reader.
Could you explain why this bitset would be per-segment, please?
I don't see a reason why this *has* to be so.
What is the benefit you are seeing?

Kind regards,
Em

On 14.02.2012 19:33, Mikhail Khludnev wrote:
> Hi Em,
> 
> I briefly read the thread. Are you talking about combining cached clauses
> of a BooleanQuery, instead of evaluating the whole BQ as a filter?
> 
> I found something like that in the API (but only in the API):
> http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)
> 
> Am I getting you right? Why do you need it, btw? If I'm ..
> I have an idea how to do it in two minutes:
> 
> q=+f:text
> +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)...
> 
> The right leg will be a BooleanQuery with SHOULD clauses backed by cached
> queries (see below).
> 
> If you are not scared off by the syntax yet, you can implement a trivial
> "fq" QParserPlugin, which would be just:
> 
> // lazily through User/Generic Cache
> q = new FilteredQuery(new MatchAllDocsQuery(),
>     new CachingWrapperFilter(
>         new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
> return q;
> 
> It will use per-segment bitsets, in contrast to Solr's fq, which caches
> for the top-level reader.
> 
> WDYT?
> 
> On Mon, Feb 13, 2012 at 11:34 PM, Em <mailformailingli...@yahoo.de> wrote:
> 
>> Hi,
>>
>> have a look at:
>> http://search-lucene.com/m/Z8lWGEiKoI
>>
>> I think not much has changed since then.
>>
>> Regards,
>> Em
>>
>> On 13.02.2012 20:17, spr...@gmx.eu wrote:
>>> Hi,
>>>
>>> how efficient is such a query:
>>>
>>> q=some text
>>> fq=id:(1 OR 2 OR 3...)
>>>
>>> Would it be better to use q:some text AND id:(1 OR 2 OR 3...)?
>>>
>>> Is the Filter Cache used for the OR'ed fq?
>>>
>>> Thank you
>>>
>>>
>>
> 
> 
>
