Re: Cached fq decreases performance

Jeff Wartes Thu, 03 Sep 2015 14:18:05 -0700

I’m measuring performance in the aggregate, over several minutes and tens
of thousands of distinct queries that all use this specific fq.
The cache hit count reported is roughly identical to the number of queries
I’ve sent, so no, this isn’t a first-query cache-miss situation.


The fq result is large: about 15% of my documents qualify. If Solr is
intelligent enough to defer that restriction in the main query until it’s
found a much smaller set to check against that criterion, I could see how
simply processing the intersection with the full cached fq value could be
more time consuming by comparison. Is that the kind of thing you’re talking
about with intersection hopping?
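
To make the two mental models concrete, here is a toy Python sketch. It is
not Solr or Lucene code; the function names and the list/set data layout are
assumptions made purely for illustration. The first function checks each
main-query match against a precomputed filter set (the "cached fq" picture).
The second intersects two sorted doc id lists by letting each side skip ahead
to the other's current candidate, which is one plausible reading of
"intersection hopping".

```python
from bisect import bisect_left


def cached_filter(q_docs, fq_set):
    """Keep only the main-query matches that appear in a precomputed filter set.

    q_docs: sorted list of doc ids matched by the main query (hypothetical).
    fq_set: set of doc ids matched by the filter, computed once up front.
    """
    return [doc for doc in q_docs if doc in fq_set]


def leapfrog_intersect(a, b):
    """Intersect two sorted doc id lists, skipping ahead instead of scanning.

    Whichever list is behind jumps (via binary search) to the first id that
    is >= the other list's current candidate.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            # a is behind: hop to the first id in a that is >= b[j]
            i = bisect_left(a, b[j], i + 1)
        else:
            # b is behind: hop to the first id in b that is >= a[i]
            j = bisect_left(b, a[i], j + 1)
    return out
```

Both functions return the same doc ids; they differ only in how much of each
input they have to touch, which is the cost question being asked here.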


On 9/3/15, 2:00 PM, "Alexandre Rafalovitch" <arafa...@gmail.com> wrote:

>FQ has to calculate the result bit set for every document to be able
>to cache it. Q will only calculate it for the documents it matches on
>and there is some intersection hopping going on.
>
>Are you seeing this performance hit on first query only or on every
>one? I would expect on first query only unless your filter cache size
>assumptions are somehow wrong.
>
>Regards,
>   Alex.
>
>----
>Solr Analyzers, Tokenizers, Filters, URPs and even a newsletter:
>http://www.solr-start.com/
>
>
>On 3 September 2015 at 16:45, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> I have a query like:
>>
>> q=<some complicated stuff>&fq=enabled:true
>>
>> For purposes of this conversation, "fq=enabled:true" is set for every
>>query, I never open a new searcher, and this is the only fq I ever use,
>>so the filter cache size is 1, and the hit ratio is 1.
>> The fq=enabled:true clause matches about 15% of my documents. I have
>>some 20M documents per shard, in a 5.3 solrcloud cluster.
>>
>> Under these circumstances, this alternate version of the query averages
>>about 1/3 faster, consumes less CPU, and generates less garbage:
>>
>> q=<some complicated stuff> +enabled:true
>>
>> So it appears I have a case where using the cached fq result is more
>>expensive than just putting the same restriction in the query.
>> Does someone have a clear mental model of how “q” and “fq” interact?
>> Naively, I’d expect that either the “q” operates within the set matched
>>by the fq (in which case it’s doing "complicated stuff" on only a subset
>>and should be faster) or that Solr takes the intersection of the q & fq
>>sets (in which case putting the restriction in the “q” means that set
>>needs to be generated instead of retrieved from cache, and should be
>>slower).
>> This has me wondering, if you want fq cache speed boosts, but also want
>>ranking involved, can you do that? Would something like q=<some other
>>stuff> AND <particular query>&fq=<particular query> help, or just be
>>more work?
>>
>> Thanks.
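
For anyone wanting to reproduce the comparison, the two request shapes
quoted above could be built roughly as follows. This is a minimal sketch: the
helper names are hypothetical, and "title:foo" in the usage note merely stands
in for the elided main query. The one detail worth noting is that a literal
"+" in the q variant has to be percent-encoded, since an unencoded "+" in a
query string is decoded as a space.

```python
from urllib.parse import urlencode


def fq_variant(main_query, restriction):
    """Query string with the restriction passed as a separate fq parameter."""
    return urlencode({"q": main_query, "fq": restriction})


def q_variant(main_query, restriction):
    """Query string with the restriction appended to q as a required clause."""
    return urlencode({"q": f"{main_query} +{restriction}"})
```

Calling fq_variant("title:foo", "enabled:true") and
q_variant("title:foo", "enabled:true") yields the two parameter strings to
append to the select handler URL of whichever collection is being tested.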
