BTW, you're not the first person who would like this capability, see:
https://issues.apache.org/jira/browse/SOLR-1223

But the fact that this JIRA was originally opened in June of 2009
and hasn't been implemented yet indicates that it's not a super-high
priority.

Best
Erick

On Tue, Feb 14, 2012 at 4:33 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> Whoa!
>
> fq=id:(1 OR 2)
> is not the same thing at all as
> fq=id:1&fq=id:2
>
> Assuming that any document has one and only one ID, the second form
> would return exactly 0 documents, each and every time.
>
> Multiple fq clauses are essentially set intersections. So the first query is
> the set of all documents where id is 1 or 2; the second is the intersection
> of two sets of documents, one set with an id of 1 and one with an id of 2.
> Not the same thing at all.
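
To make the union-versus-intersection point concrete, here is a minimal sketch using plain java.util.BitSet as a stand-in for the document sets a filter produces. It does not call Solr or Lucene; the class name, the helper names, and the two-document index are assumptions made up for illustration.

```java
import java.util.BitSet;

// Illustration only: BitSets stand in for the doc sets that filters produce.
final class FqSetDemo {

    // Hypothetical helper: a doc set containing the given internal doc numbers.
    static BitSet docs(int... docNumbers) {
        BitSet set = new BitSet();
        for (int n : docNumbers) {
            set.set(n);
        }
        return set;
    }

    // Models fq=id:(1 OR 2): a single filter, the union of both term matches.
    static BitSet singleFqWithOr(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.or(b);
        return result;
    }

    // Models fq=id:1&fq=id:2: two filters whose results are intersected.
    static BitSet twoSeparateFqs(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    public static void main(String[] args) {
        BitSet idIs1 = docs(0); // assume doc 0 is the only doc with id 1
        BitSet idIs2 = docs(1); // assume doc 1 is the only doc with id 2

        System.out.println(singleFqWithOr(idIs1, idIs2)); // prints {0, 1}
        System.out.println(twoSeparateFqs(idIs1, idIs2)); // prints {}
    }
}
```

Because each document carries exactly one id, the two sets never overlap, so the intersection is always empty.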
>
> There's no support for the concept of
> (fq=id:1 OR fq=id:2)
>
> Best
> Erick
>
> On Tue, Feb 14, 2012 at 2:13 PM, Em <mailformailingli...@yahoo.de> wrote:
>> Hi Mikhail,
>>
>> thanks for kicking in some brainstorming-code!
>> The given thread is almost a year old and I was working with Solr in my
>> free time to see where it fails to behave/perform as I expect/wish.
>>
>> I found out that if you have a lot of different access patterns for a
>> filter query, you might end up with either a big cache to make things
>> fast or with lower performance (the impact depends on use case and
>> circumstances).
>>
>> Scenario:
>> You have a permission field and the client is able to filter by one to
>> three permission values.
>> That is:
>> fq=foo:user
>> fq=foo:moderator
>> fq=foo:manager
>>
>> If you cannot control/guarantee the order of the fq's values, you could
>> end up with many distinct filter queries that all return the same result.
>>
>> Example:
>> fq=permission:user OR permission:moderator OR permission:manager
>> fq=permission:user OR permission:manager OR permission:moderator
>> fq=permission:moderator OR permission:user OR permission:manager
>> ...
>> They all return the same result but were cached separately, which means
>> that you are wasting a lot of memory.
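
One client-side way to sidestep the ordering problem is to normalize the clause order before sending the request, so that every permutation produces the same fq string and therefore the same cache entry. The following is a minimal sketch under that assumption; the class and method names are hypothetical, and it assumes the values need no query-syntax escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustration only: builds a canonical OR filter string on the client side.
final class CanonicalFq {

    // Hypothetical helper: sorts and de-duplicates the values so that any
    // input order yields the same fq string. Assumes values need no escaping.
    static String orFilter(String field, List<String> values) {
        TreeSet<String> sorted = new TreeSet<>(values);
        return field + ":(" + String.join(" OR ", sorted) + ")";
    }

    public static void main(String[] args) {
        String a = orFilter("permission", Arrays.asList("user", "moderator", "manager"));
        String b = orFilter("permission", Arrays.asList("moderator", "user", "manager"));

        System.out.println(a);           // prints permission:(manager OR moderator OR user)
        System.out.println(a.equals(b)); // prints true
    }
}
```

This only helps when the client builds the filter; it does nothing for queries whose clause order is outside your control.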
>>
>> Furthermore, if your access pattern leads to a lot of different fq's
>> on a small set of distinct values, it may make more sense to cache each
>> filter query on its own from a memory-consumption point of view (it may
>> cost a little performance).
>>
>> That being said, if you cache a filter for foo:user, foo:moderator and
>> foo:manager, you can combine those filters with AND, OR, NOT or whatever
>> without recomputing every filter over and over again, which would be the
>> case if your filter cache is not large enough.
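
The idea of caching one filter per value and combining the cached results can be sketched as follows. This is not how Solr's filterCache works; it is a toy model with java.util.BitSet, and the class name, the compute function, and the computations counter are all assumptions introduced for illustration.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustration only: caches one doc set per single clause and ORs them on demand.
final class PerClauseFilterCache {
    private final Map<String, BitSet> cache = new HashMap<>();
    private final Function<String, BitSet> compute; // hypothetical "run the filter" step
    int computations = 0;                           // counts cache misses

    PerClauseFilterCache(Function<String, BitSet> compute) {
        this.compute = compute;
    }

    // Returns the cached doc set for one clause, computing it on first use.
    BitSet get(String clause) {
        BitSet cached = cache.get(clause);
        if (cached == null) {
            computations++;
            cached = compute.apply(clause);
            cache.put(clause, cached);
        }
        return cached;
    }

    // Logical OR over the cached per-clause sets; cached sets are not modified.
    BitSet anyOf(String... clauses) {
        BitSet result = new BitSet();
        for (String clause : clauses) {
            result.or(get(clause));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, BitSet> fakeIndex = new HashMap<>();
        BitSet users = new BitSet();
        users.set(0);
        users.set(3);
        BitSet managers = new BitSet();
        managers.set(1);
        fakeIndex.put("foo:user", users);
        fakeIndex.put("foo:manager", managers);

        PerClauseFilterCache filters = new PerClauseFilterCache(fakeIndex::get);
        System.out.println(filters.anyOf("foo:user", "foo:manager")); // prints {0, 1, 3}
        System.out.println(filters.anyOf("foo:manager", "foo:user")); // prints {0, 1, 3}
        System.out.println(filters.computations);                     // prints 2
    }
}
```

Both orderings reuse the same two cached entries, at the price of an extra OR over the bitsets on every request.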
>>
>> However, I never compared the performance differences (in terms of
>> speed) of a cached filter-query like
>> foo:bar OR foo:baz
>> With a combination of two cached filter-queries like
>> foo:bar
>> foo:baz
>> combined by a logical OR.
>>
>> That's what the background looks like.
>> Unfortunately I didn't have the time to implement this in the past.
>>
>> Back to your post:
>> Looks like a cool idea and is almost what I had in mind!
>>
>> I would formulate an easier syntax so that one is able to "parse" each
>> fq-clause on its own to cache the CachingWrapperFilter to reuse it again.
>>
>>> it will use a per-segment bitset, in contrast to Solr's fq, which caches
>>> for the top-level reader.
>> Could you explain why this bitset would be per-segment based, please?
>> I don't see a reason why this *has* to be so.
>> What is the benefit you are seeing?
>>
>> Kind regards,
>> Em
>>
>> On 14.02.2012 19:33, Mikhail Khludnev wrote:
>>> Hi Em,
>>>
>>> I briefly read the thread. Are you talking about combining cached clauses
>>> of a BooleanQuery, instead of evaluating the whole BQ as a filter?
>>>
>>> I found something like that in API (but only in API)
>>> http://lucene.apache.org/solr/api/org/apache/solr/search/ExtendedQuery.html#setCacheSep(boolean)
>>>
>>> Do I get you right? Why do you need it, btw? If I'm ..
>>> I have an idea how to do it in two mins:
>>>
>>> q=+f:text
>>> +(_query_:{!fq}id:1 _query_:{!fq}id:2 _query_:{!fq}id:3
>>> _query_:{!fq}id:4)...
>>>
>>> The right leg will be a BooleanQuery with SHOULD clauses backed by cached
>>> queries (see below).
>>>
>>> If you are not scared by the syntax yet, you can implement a trivial
>>> "fq" QParserPlugin, which will be just
>>>
>>> // lazily through User/Generic Cache
>>> q = new FilteredQuery(new MatchAllDocsQuery(),
>>>     new CachingWrapperFilter(
>>>         new QueryWrapperFilter(subQuery(localParams.get(QueryParsing.V)))));
>>> return q;
>>>
>>> It will use a per-segment bitset, in contrast to Solr's fq, which caches
>>> for the top-level reader.
>>>
>>> WDYT?
>>>
>>> On Mon, Feb 13, 2012 at 11:34 PM, Em <mailformailingli...@yahoo.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> have a look at:
>>>> http://search-lucene.com/m/Z8lWGEiKoI
>>>>
>>>> I think not much has changed since then.
>>>>
>>>> Regards,
>>>> Em
>>>>
>>>> On 13.02.2012 20:17, spr...@gmx.eu wrote:
>>>>> Hi,
>>>>>
>>>>> how efficient is such a query:
>>>>>
>>>>> q=some text
>>>>> fq=id:(1 OR 2 OR 3...)
>>>>>
>>>>> Would it be better to use q=some text AND id:(1 OR 2 OR 3...)?
>>>>>
>>>>> Is the Filter Cache used for the OR'ed fq?
>>>>>
>>>>> Thank you
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
