Re: Synonym expansions w/ phrase slop exhausting memory after upgrading to SOLR 7

Michael Gibney Wed, 18 Dec 2019 08:10:51 -0800

This is related to this issue:
https://issues.apache.org/jira/browse/SOLR-13336


Also tangentially relevant:
https://issues.apache.org/jira/browse/LUCENE-8531
https://issues.apache.org/jira/browse/SOLR-12243

I think your options include:
1. Set slop=0, which restores SpanNearQuery as the graph phrase
query implementation (see LUCENE-8531)
2. Downgrade to 7.5, which would avoid the OOM but would cause graph
phrase queries to be effectively ignored (see SOLR-12243)
3. Upgrade to 8.0, which will restore the failsafe maxBooleanClauses,
avoiding OOM but returning an error code for affected queries (which
in your case sounds like most queries?) (see SOLR-13336)
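
To illustrate why the expansion blows up, here is a rough sketch in
Python. It is not Lucene's actual code, just the arithmetic: each
position that has synonym alternatives multiplies the number of phrase
paths. The alternatives list, the function names, and the max_clauses
default are all hypothetical, and six positions with four alternatives
each is only one combination that happens to give the 4096 you saw.

```python
from itertools import product
from math import prod


def count_paths(positions):
    """Number of distinct phrases: the product of alternatives per position."""
    return prod(len(alternatives) for alternatives in positions)


def expand_paths(positions, max_clauses=1024):
    """Enumerate every phrase, refusing up front if there would be too many.

    max_clauses is a hypothetical stand-in for a maxBooleanClauses-style guard.
    """
    total = count_paths(positions)
    if total > max_clauses:
        raise ValueError(f"{total} phrase clauses exceeds limit of {max_clauses}")
    return [" ".join(path) for path in product(*positions)]


# Hypothetical synonym alternatives; six such positions give 4**6 = 4096 paths.
leave_types = ["leave", "bereavement", "jury duty", "maternity leave"]
print(count_paths([leave_types] * 6))  # 4096
```

Counting before enumerating is the point of the guard: the query fails
with an error instead of allocating every phrase first.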

Michael

On Tue, Dec 17, 2019 at 4:16 PM Nick D <ndrake0...@gmail.com> wrote:
>
> Hello All,
>
> We recently upgraded from Solr 6.6 to Solr 7.7.2 and have since had spikes in
> memory that eventually caused either an OOM or almost 100% utilization of
> the available memory. After trying a few things (increasing the JVM heap,
> making sure docValues were set for all sort and facet fields, in case
> the fieldCache was blowing up), I was able to isolate a single query that
> would cause the used memory to become fully exhausted and effectively
> render the instance dead. After applying a timeAllowed value to the query
> and reducing the query phrase (the system would crash without logging the
> warning on longer queries containing synonyms), I was able to identify the
> following warning in the logs:
>
> o.a.s.s.SolrIndexSearcher Query: <____very long synonym expansion____>
>
> the request took too long to iterate over terms. Timeout: timeoutAt:
> 812182664173653 (System.nanoTime(): 812182715745553),
> TermsEnum=org.apache.lucene.codecs.blocktree.SegmentTermsEnum@7a0db441
>
> I have narrowed the problem down to the following:
> the way synonyms are being expanded along with phrase slop.
>
> With ps=5 I get 4096 possible permutations of the phrase being searched
> because of synonyms, looking similar to:
> ngs_title:"bereavement leave type build bereavement leave type data p"~5
>  ngs_title:"bereavement leave type build bereavement bereavement type data
> p"~5
>  ngs_title:"bereavement leave type build bereavement jury duty type data
> p"~5
>  ngs_title:"bereavement leave type build bereavement maternity leave type
> data p"~5
>  ngs_title:"bereavement leave type build bereavement paternity type data
> p"~5
>  ngs_title:"bereavement leave type build bereavement paternity leave type
> data p"~5
>  ngs_title:"bereavement leave type build bereavement adoption leave type
> data p"~5
>  ngs_title:"bereavement leave type build jury duty maternity leave type
> data p"~5
>  ngs_title:"bereavement leave type build jury duty paternity type data p"~5
>  ngs_title:"bereavement leave type build jury duty paternity leave type
> data p"~5
>  ngs_title:"bereavement leave type build jury duty adoption leave type data
> p"~5
>  ngs_title:"bereavement leave type build jury duty absence type data p"~5
>  ngs_title:"bereavement leave type build maternity leave leave type data
> p"~5
>  ngs_title:"bereavement leave type build maternity leave bereavement type
> data p"~5
>  ngs_title:"bereavement leave type build maternity leave jury duty type
> data p"~5
>
> ....
>
> Previously in Solr 6 that same query, with the same synonyms (and query
> analysis chain), would produce a parsedQuery like this when using &ps=5:
> DisjunctionMaxQuery(((ngs_field_description:\"leave leave type build leave
> leave type data ? p leave leave type type.enabled\"~5)^3.0 |
> (ngs_title:\"leave leave type build leave leave type data ? p leave leave
> type type.enabled\"~5)^10.0)
>
> The expansion wasn't being applied to the added DisjunctionMaxQuery when
> adjusting rankings with phrase slop.
>
> In general the parsed queries between 6 and 7 are different, with some new
> `spanNears` showing up, but they don't create the memory consumption issues
> that I have seen when a large synonym expansion is happening along with using
> a ps parameter.
>
> I didn't see much in the release notes about synonym changes
> (outside of SOW=false being the default as of version 7).
>
> The field being operated on has the following query analysis chain:
>
>  <analyzer type="query">
>         <tokenizer class="solr.StandardTokenizerFactory"/>
>         <filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords.txt"/>
>         <filter class="solr.SynonymGraphFilterFactory"
> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>         <filter class="solr.LowerCaseFilterFactory"/>
>       </analyzer>
>
> Not sure if there is a change in phrase slop that now takes synonyms into
> account and if there is a way to disable that kind of expansion or not. I am
> not sure if it is related to SOLR-10980
> <https://issues.apache.org/jira/plugins/servlet/mobile#issue/SOLR-10980> or
> not; it does seem to be related, but it references Solr 6, which does not do
> the expansion.
>
> Any help would be greatly appreciated.
>
> Nick
