Re: json.facet floods the filterCache

2020-10-26 Thread Michael Gibney
Damien, I gathered that you're using "nested facet"; but there are a
lot of different ways to do that, with different implications. e.g.,
nesting terms facet within terms facet, query facet within terms,
terms within query, different stats, sorting, overrequest/overrefine
(and for that matter, refine:none|simple, or even distributed vs.
non-distributed), etc. I was wondering if you could share an example
of an actual json facet specification.

Pending more information, I can say that I've been independently
looking into this also. I think high filterCache usage can result if
you're using terms faceting that results in a lot of refinement
requests (either a high setting for overrefine, or
low/unevenly-distributed facet counts, as might happen with
high-cardinality fields). I think nested terms could also magnify the
effect of high-cardinality fields, increasing the number of buckets
needing refinement. You could see if setting refine:none helps (though
of course it could have undesirable effects on the actual results).
But afaict every term specified in a refinement request currently hits
the filterCache:
https://github.com/apache/lucene-solr/blob/40e2122/solr/core/src/java/org/apache/solr/search/facet/FacetProcessor.java#L418
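
For concreteness, here is a minimal sketch of what a terms-within-terms request with refinement switched off might look like, built as a Python dict and serialized into the json.facet parameter. The field names (category_s, brand_s) and the facet labels are placeholders, not taken from Damien's setup; whether refine:none actually reduces filterCache pressure for a given request is exactly what would need testing.

```python
import json


def nested_terms_facet(outer_field, inner_field, refine="none", limit=10):
    """Build a terms-within-terms json.facet spec.

    Field names and the labels by_outer/by_inner are hypothetical.
    """
    return {
        "by_outer": {
            "type": "terms",
            "field": outer_field,
            "limit": limit,
            "refine": refine,  # "none" skips the refinement phase
            "facet": {
                "by_inner": {
                    "type": "terms",
                    "field": inner_field,
                    "limit": limit,
                    "refine": refine,
                }
            },
        }
    }


spec = nested_terms_facet("category_s", "brand_s")

# Request parameters as they might be sent to a /select handler.
params = {"q": "*:*", "rows": 0, "json.facet": json.dumps(spec)}
print(params["json.facet"])
```

Comparing filterCache stats for the same request with refine set to "none" and to "simple" would show how much of the cache traffic comes from refinement.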

A word of caution regarding the JSON facet `cacheDf` param: although
it's currently undocumented in the refGuide, I believe it's only
respected at all in FacetFieldProcessorByEnumTermsStream, which is
only invoked under certain circumstances (and only when sort=index).
So this is unlikely to help (though it's impossible to say without
more specific information about the actual requests you're trying to
run).
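
As a rough illustration of how narrow that case is, the sketch below builds the kind of single-level terms facet where I would expect `cacheDf` to be consulted: method set to stream and sort set to index order. This is based on my reading of the code rather than on documented behavior, and the field name is a placeholder.

```python
import json

# Hypothetical facet spec; "category_s" is a placeholder field name.
stream_facet = {
    "by_field": {
        "type": "terms",
        "field": "category_s",
        "method": "stream",   # request the streaming enum processor
        "sort": "index asc",  # streaming applies only to index-order sort
        "limit": -1,
        "cacheDf": -1,        # assumption: -1 means never use the filterCache
    }
}

print(json.dumps(stream_facet, indent=2))
```

A nested terms facet sorted by count would not take this path, so the same `cacheDf` setting would presumably have no effect there.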

Michael

On Fri, Oct 23, 2020 at 12:52 AM  wrote:
>
> I'm doing a nested facet (
> https://lucene.apache.org/solr/guide/8_6/json-facet-api.html#nested-facets)
> or sub-facets, and am using the 'terms' facet.
>
> Digging around more, it looks like I can set 'cacheDf=-1' to disable the use of
> the cache.
>
> On Fri, 23 Oct 2020 at 00:14, Michael Gibney 
> wrote:
>
> > Damien,
> > Are you able to share the actual json.facet request that you're using
> > (at least just the json.facet part)? I'm having a hard time being
> > confident that I'm correctly interpreting when you say "a json.facet
> > query on nested facets terms".
> > Michael
> >
> > On Thu, Oct 22, 2020 at 3:52 AM Christine Poerschke (BLOOMBERG/
> > LONDON)  wrote:
> > >
> > > Hi Damien,
> > >
> > > You mention JSON term facets. I haven't explored those, but we
> > > have observed what you describe for JSON range facets, and I've started
> > > https://issues.apache.org/jira/browse/SOLR-14939 about it.
> > >
> > > Hope that helps.
> > >
> > > Regards,
> > > Christine
> > >
> > > From: solr-user@lucene.apache.org At: 10/22/20 01:07:59 To:
> > > solr-user@lucene.apache.org
> > > Subject: json.facet floods the filterCache
> > >
> > > Hi,
> > >
> > > I'm using a json.facet query on nested facets terms and am seeing very
> > > high filterCache usage. Is it possible to somehow control this? With a fq
> > > it's possible to specify fq={!cache=false}... but I don't see a similar
> > > thing for json.facet.
> > >
> > > Kind regards,
> > > Damien
> > >
> > >
> >


Re: json.facet floods the filterCache

2020-10-22 Thread damienk
I'm doing a nested facet (
https://lucene.apache.org/solr/guide/8_6/json-facet-api.html#nested-facets)
or sub-facets, and am using the 'terms' facet.

Digging around more, it looks like I can set 'cacheDf=-1' to disable the use of
the cache.

On Fri, 23 Oct 2020 at 00:14, Michael Gibney 
wrote:

> Damien,
> Are you able to share the actual json.facet request that you're using
> (at least just the json.facet part)? I'm having a hard time being
> confident that I'm correctly interpreting when you say "a json.facet
> query on nested facets terms".
> Michael
>
> On Thu, Oct 22, 2020 at 3:52 AM Christine Poerschke (BLOOMBERG/
> LONDON)  wrote:
> >
> > Hi Damien,
> >
> > You mention JSON term facets. I haven't explored those, but we
> > have observed what you describe for JSON range facets, and I've started
> > https://issues.apache.org/jira/browse/SOLR-14939 about it.
> >
> > Hope that helps.
> >
> > Regards,
> > Christine
> >
> > From: solr-user@lucene.apache.org At: 10/22/20 01:07:59 To:
> > solr-user@lucene.apache.org
> > Subject: json.facet floods the filterCache
> >
> > Hi,
> >
> > I'm using a json.facet query on nested facets terms and am seeing very
> > high filterCache usage. Is it possible to somehow control this? With a fq
> > it's possible to specify fq={!cache=false}... but I don't see a similar
> > thing for json.facet.
> >
> > Kind regards,
> > Damien
> >
> >
>


Re: json.facet floods the filterCache

2020-10-22 Thread Michael Gibney
Damien,
Are you able to share the actual json.facet request that you're using
(at least just the json.facet part)? I'm having a hard time being
confident that I'm correctly interpreting when you say "a json.facet
query on nested facets terms".
Michael

On Thu, Oct 22, 2020 at 3:52 AM Christine Poerschke (BLOOMBERG/
LONDON)  wrote:
>
> Hi Damien,
>
> You mention JSON term facets. I haven't explored those, but we
> have observed what you describe for JSON range facets, and I've started
> https://issues.apache.org/jira/browse/SOLR-14939 about it.
>
> Hope that helps.
>
> Regards,
> Christine
>
> From: solr-user@lucene.apache.org At: 10/22/20 01:07:59 To:
> solr-user@lucene.apache.org
> Subject: json.facet floods the filterCache
>
> Hi,
>
> I'm using a json.facet query on nested facets terms and am seeing very high
> filterCache usage. Is it possible to somehow control this? With a fq it's
> possible to specify fq={!cache=false}... but I don't see a similar thing
> for json.facet.
>
> Kind regards,
> Damien
>
>