Whoa!

fq=id:(1 OR 2)
is not the same thing at all as
fq=id:1&fq=id:2

Assuming that every document has one and only one ID, the second form
would return exactly 0 documents, each and every time.

Multiple fq clauses are essentially set intersections. So the first query is
the set of all documents where id is 1 or 2. The second is the intersection
of two sets of documents, one set with an id of 1 and one with an id of 2.
Not the same thing at all.
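
As a rough illustration of the set semantics only (plain Python sets standing in for the filter results, not anything Solr actually runs; the toy `docs` index and the `matching` helper are made up for this sketch):

```python
# Toy index: document id -> document (each document has exactly one id).
docs = {1: "doc one", 2: "doc two", 3: "doc three"}

def matching(doc_id):
    """Set of document ids matched by a single clause id:<doc_id>."""
    return {d for d in docs if d == doc_id}

# fq=id:(1 OR 2): one filter, the union of the two clauses.
single_fq = matching(1) | matching(2)

# fq=id:1&fq=id:2: two filters, intersected with each other.
two_fqs = matching(1) & matching(2)

print(single_fq)  # {1, 2}
print(two_fqs)    # set()
```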

There's no support for the concept of
(fq=id:1 OR fq=id:2)
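
Since the OR has to live inside a single fq, one option is to build that one parameter on the client. A minimal sketch, assuming the values are simple tokens that need no query-syntax escaping; `or_filter` is a hypothetical helper, not part of any Solr client. It sorts the values so that the same set always produces the same filter string, which sidesteps the clause-ordering problem Em describes in the quoted message:

```python
from urllib.parse import urlencode

def or_filter(field, values):
    """Build one fq value such as 'id:(1 OR 2)' from several values.

    Values are de-duplicated and sorted (as strings) so the same set of
    values always yields the same string, whatever order they arrive in.
    Assumption: values contain no characters that need escaping.
    """
    unique = sorted({str(v) for v in values})
    return f"{field}:({' OR '.join(unique)})"

# One fq parameter carrying the whole OR, URL-encoded for the request.
params = urlencode({"q": "some text", "fq": or_filter("id", [2, 1])})
print(params)
```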

Best
Erick

On Tue, Feb 14, 2012 at 2:13 PM, Em <mailformailingli...@yahoo.de> wrote:
> Hi Mikhail,
>
> thanks for kicking in some brainstorming-code!
> The given thread is almost a year old and I was working with Solr in my
> free time to see where it fails to behave/perform as I expect/wish.
>
> I found out that if you have a lot of different access patterns for a
> filter query, you might end up with either a big cache to make things
> fast or with lower performance (the impact depends on use case and
> circumstances).
>
> Scenario:
> You have a permission field and the client is able to filter by one to
> three permission values.
> That is:
> fq=foo:user
> fq=foo:moderator
> fq=foo:manager
>
> If you cannot control/guarantee the order of the fq's values, you could
> end up with a lot of differently ordered filters which all return the same.
>
> Example:
> fq=permission:user OR permission:moderator OR permission:manager
> fq=permission:user OR permission:manager OR permission:moderator
> fq=permission:moderator OR permission:user OR permission:manager
> ...
> They all return the same result but were cached separately, which means
> that you are wasting a lot of memory.
>
> Furthermore, if your access pattern leads to a lot of different fq's
> on a small set of distinct values, it may make more sense to cache each
> filter query on its own from a memory-consumption point of view (may cost
> a little bit of performance).
>
> That being said, if you cache a filter for foo:user, foo:moderator and
> foo:manager you can combine those filters with AND, OR, NOT or whatever
> without recomputing every filter over and over again, which would be the
> case if your filter cache is not large enough.
>
> However, I never compared the performance differences (in terms of
> speed) of a cached filter-query like
> foo:bar OR foo:baz
> With a combination of two cached filter-queries like
> foo:bar
> foo:baz
> combined by a logical OR.
>
> That's what the background looks like.
> Unfortunately I didn't have the time to implement this in the past.
>
> Back to your post:
> Looks like a cool idea and is almost what I had in mind!
>
> I would formulate an easier syntax so that one is able to "parse" each
> fq clause on its own and cache the CachingWrapperFilter to reuse it again.
>
>> it will use a per-segment bitset, in contrast to Solr's fq which caches for
>> the top-level reader.
> Could you explain why this bitset would be per-segment based, please?
> I don't see a reason why this *has* to be so.
> What is the benefit you are seeing?
>
> Kind regards,
> Em
>
> Am 14.02.2012 19:33, schrieb Mikhail Khludnev:
>> Hi Em,
>>
>> I briefly read the thread. Are you talking about combining cached clauses
>> of a BooleanQuery, instead of evaluating the whole BQ as a filter?
>>
>> I found something like that in the API (but only in the API)
>> http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)
>>
>> Am I getting you right? Why do you need it, btw? If I'm ..
>> I have an idea how to do it in two mins:
>>
>> q=+f:text
>> +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3 _query_:{!fq}id:4)...
>>
>> The right leg will be a BooleanQuery with SHOULD clauses backed by cached
>> queries (see below).
>>
>> If you are not scared by the syntax yet, you can implement a trivial
>> "fq" QParserPlugin, which will be just
>>
>> // lazily through User/Generic Cache
>> q = new FilteredQuery(new MatchAllDocsQuery(),
>>     new CachingWrapperFilter(
>>         new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
>> return q;
>>
>> It will use a per-segment bitset, in contrast to Solr's fq which caches for
>> the top-level reader.
>>
>> WDYT?
>>
>> On Mon, Feb 13, 2012 at 11:34 PM, Em <mailformailingli...@yahoo.de> wrote:
>>
>>> Hi,
>>>
>>> have a look at:
>>> http://search-lucene.com/m/Z8lWGEiKoI
>>>
>>> I think not much has changed since then.
>>>
>>> Regards,
>>> Em
>>>
>>> Am 13.02.2012 20:17, schrieb spr...@gmx.eu:
>>>> Hi,
>>>>
>>>> how efficient is such a query:
>>>>
>>>> q=some text
>>>> fq=id:(1 OR 2 OR 3...)
>>>>
>>>> Would it be better to use q:some text AND id:(1 OR 2 OR 3...)?
>>>>
>>>> Is the Filter Cache used for the OR'ed fq?
>>>>
>>>> Thank you
>>>>
>>>>
>>>
>>
>>
>>
