Re: Reaching max Filter cache limit increases the request latencies.

2020-08-13 Thread Akshay Murarka
Hey Erick,

I am looking into where we can limit what gets cached by using
{!cache=false} (we already use it in some of our cases).
In general there are 0 evictions on the filter cache side, but whenever we hit
the max limit there is a spike in evictions as well (which is expected).
As far as I remember we are not forcing enum on our side, but I will
definitely verify that.
My filter cache hit ratio stays constant at around 97.5% and does not go down
even during these evictions.
Regarding other operations, there are a few cases where indexing (80 to 150
docs) also happened during that time, but there are also cases where indexing
happened 5-10 min after that and the latencies remained high.
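
One way to double-check that the hit ratio really holds during the eviction spike is to compare two snapshots of the cache counters, instead of reading the cumulative ratio. This is only a rough sketch: the dictionary keys `lookups`, `hits` and `evictions` are assumed names for whatever our monitoring reports, and the numbers in the usage line are made up for illustration.

```python
def window_stats(before, after):
    """Return filterCache activity between two counter snapshots.

    `before` and `after` are dicts of cumulative counters. The key names
    (lookups, hits, evictions) are assumptions made for this sketch.
    """
    lookups = after["lookups"] - before["lookups"]
    hits = after["hits"] - before["hits"]
    evictions = after["evictions"] - before["evictions"]
    # Avoid dividing by zero when no lookups happened in the window.
    hitratio = hits / lookups if lookups else None
    return {
        "lookups": lookups,
        "hits": hits,
        "evictions": evictions,
        "hitratio": hitratio,
    }


# Illustrative numbers only, not measurements from our cluster.
before = {"lookups": 1000, "hits": 975, "evictions": 0}
after = {"lookups": 2000, "hits": 1950, "evictions": 300}
print(window_stats(before, after))
```

If the windowed ratio stays near 97.5% while evictions climb, the evicted entries are probably not the ones being reused.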

Regards,
Akshay


On Thu, Aug 13, 2020 at 5:08 PM Erick Erickson wrote:

> Well, when you hit the max capacity, cache entries get aged out and are
> eligible for GC, so GC activity increases. But for aging out filterCache
> entries to be noticeable, you have to be flushing a _lot_ of them out.
> Which, offhand, makes me wonder if you’re using the filterCache
> appropriately.
>
> Here’s what I’d investigate first: What kinds of fq clauses are you using,
> and are they making best use of the filterCache? Consider an fq clause like
>
> fq=date_field:[* TO NOW]
>
> That will consume an entry in the filterCache and never be re-used, because
> NOW resolves to the current time at millisecond precision and will be
> different a millisecond later.
>
> Similarly for fq clauses that contain a lot of values that may vary, for
> instance
>
> fq=id:(1 2 4 86 93 …)
>
> where the list of IDs is not likely to be repeated. Or even repeated in a
> different order.
>
> If you do identify patterns that you _know_ will not be repeated, just add
> fq={!cache=false}your_unrepeated_pattern
>
> What I’m guessing here is that if you’ve correctly identified that the
> filterCache filling up is increasing GC activity that much, you must be
> evicting a _lot_ of fq entries very rapidly, which indicates you’re not
> repeating fq’s very often.
>
> I should add that the filterCache is also used for some other operations,
> particularly some kinds of faceting if you specify the enum method. Are you
> forcing that?
>
> All that said, I’m also wondering if this is coincidence and your slowdown
> is something else. Because given all the work a query does, the additional
> bookkeeping due to filterCache churn doesn’t really sound like the culprit.
> Prior to the filterCache filling up, what’s your hit ratio? The scenario I
> can see where the filterCache churn could cause your response times to go up
> is if, up until that point, you’re getting a high hit ratio that goes down
> after the cache starts aging out entries. I find this rather unlikely, but
> possible.
>
> Best,
> Erick
>
> > On Aug 13, 2020, at 3:19 AM, Akshay Murarka  wrote:
> >
> > Hey guys,
> >
> > So for quite some time we have been facing an issue where, whenever the
> > Used Filter Cache value reaches the maximum configured value, we start
> > seeing an increase in the query latencies on the Solr side.
> > During this time we also see an increase in our garbage collection and
> > CPU as well.
> > Only when a commit happens with openSearcher=true do the latencies come
> > back to normal.
> >
> > Is there any setting that can help us with this, or will increasing the
> > max configured value for the filter cache help? Right now we can’t
> > increase the commit frequency.
> >
> > Thanks for the help.
> >
> > Regards,
> > Akshay
> >
> >
> > Below is the graph for request latency
> >
> > Below is the graph for the Filter cache values
> > 
>
>


Re: Reaching max Filter cache limit increases the request latencies.

2020-08-13 Thread Erick Erickson
Well, when you hit the max capacity, cache entries get aged out and are
eligible for GC, so GC activity increases. But for aging out filterCache
entries to be noticeable, you have to be flushing a _lot_ of them out. Which,
offhand, makes me wonder if you’re using the filterCache appropriately.

Here’s what I’d investigate first: What kinds of fq clauses are you using, and
are they making best use of the filterCache? Consider an fq clause like

fq=date_field:[* TO NOW]

That will consume an entry in the filterCache and never be re-used, because
NOW resolves to the current time at millisecond precision and will be
different a millisecond later.
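
If day-level precision is acceptable for your use case (that is an assumption, only you can tell), rounding NOW with date math makes the clause text identical for every request in the same day, so the entry can be re-used. A small sketch of a client-side helper; the function name is made up for illustration:

```python
def rounded_now_filter(field, unit="DAY"):
    """Build an fq whose upper bound changes once per `unit`, not every ms."""
    allowed = {"MINUTE", "HOUR", "DAY", "MONTH", "YEAR"}
    if unit not in allowed:
        raise ValueError(f"unsupported rounding unit: {unit}")
    # NOW/DAY rounds down to the start of the current day; adding 1DAY moves
    # the bound to the start of the next day, so today's docs still match.
    return f"{field}:[* TO NOW/{unit}+1{unit}]"


print(rounded_now_filter("date_field"))
```

Note that this widens the range to the end of the current day, so it is not a drop-in replacement if you need to exclude documents dated later today.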

Similarly for fq clauses that contain a lot of values that may vary, for 
instance 

fq=id:(1 2 4 86 93 …)

where the list of IDs is not likely to be repeated. Or even repeated in a 
different order.

If you do identify patterns that you _know_ will not be repeated, just add
fq={!cache=false}your_unrepeated_pattern
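
On the client side that could look roughly like the sketch below. The helper names are hypothetical, and it assumes the one-off clauses do not already carry local params of their own:

```python
CACHE_OFF = "{!cache=false}"


def uncached(fq):
    """Prefix an fq clause so it is not stored in the filterCache."""
    if fq.startswith("{!"):
        # The clause already has local params; merging is left to the caller.
        raise ValueError("fq already has local params; add cache=false by hand")
    return CACHE_OFF + fq


def build_params(q, cached_fqs, one_off_fqs):
    """Return request parameters as (name, value) pairs.

    Filters expected to repeat are sent as-is; one-off filters get
    cache=false so they do not push useful entries out of the cache.
    """
    params = [("q", q)]
    params += [("fq", fq) for fq in cached_fqs]
    params += [("fq", uncached(fq)) for fq in one_off_fqs]
    return params


for name, value in build_params("*:*", ["type:book"], ["id:(1 2 4 86 93)"]):
    print(name, "=", value)
```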

What I’m guessing here is that if you’ve correctly identified that the
filterCache filling up is increasing GC activity that much, you must be
evicting a _lot_ of fq entries very rapidly, which indicates you’re not
repeating fq’s very often.
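
To make that concrete, here is a toy model. It is a plain LRU map, not Solr's actual cache implementation, and the capacity of 512 and the key shapes are arbitrary values picked for illustration. It compares a stream of repeating filters with a stream of filters that never repeat:

```python
from collections import OrderedDict


def simulate(keys, capacity):
    """Feed `keys` through a toy LRU cache; return (hits, evictions)."""
    cache = OrderedDict()
    hits = 0
    evictions = 0
    for key in keys:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # drop the least recently used
                evictions += 1
    return hits, evictions


# 1000 lookups drawn from only 10 distinct filters.
repeating = [f"type:{i % 10}" for i in range(1000)]
# 1000 lookups where every filter is different.
unique = [f"id:({i})" for i in range(1000)]

print("repeating:", simulate(repeating, 512))
print("unique:   ", simulate(unique, 512))
```

In this model the repeating stream never evicts anything, while the unique stream gets no hits at all and evicts on every insert once the cache is full.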

I should add that the filterCache is also used for some other operations,
particularly some kinds of faceting if you specify the enum method. Are you
forcing that?

All that said, I’m also wondering if this is coincidence and your slowdown is
something else. Because given all the work a query does, the additional
bookkeeping due to filterCache churn doesn’t really sound like the culprit.
Prior to the filterCache filling up, what’s your hit ratio? The scenario I can
see where the filterCache churn could cause your response times to go up is
if, up until that point, you’re getting a high hit ratio that goes down after
the cache starts aging out entries. I find this rather unlikely, but possible.

Best,
Erick

> On Aug 13, 2020, at 3:19 AM, Akshay Murarka  wrote:
> 
> Hey guys,
> 
> So for quite some time we have been facing an issue where, whenever the Used
> Filter Cache value reaches the maximum configured value, we start seeing an
> increase in the query latencies on the Solr side.
> During this time we also see an increase in our garbage collection and CPU as
> well.
> Only when a commit happens with openSearcher=true do the latencies come back
> to normal.
>
> Is there any setting that can help us with this, or will increasing the max
> configured value for the filter cache help? Right now we can’t increase the
> commit frequency.
> 
> Thanks for the help.
> 
> Regards,
> Akshay
> 
> 
> Below is the graph for request latency
>
> Below is the graph for the Filter cache values
>