Tanapol Nearunchorn created SOLR-12084:

             Summary: ShingleFilter cause threads consume all available memory
                 Key: SOLR-12084
                 URL: https://issues.apache.org/jira/browse/SOLR-12084
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Schema and Analysis
    Affects Versions: 7.0.1, 6.5.1, 6.5
            Reporter: Tanapol Nearunchorn

When ShingleFilter is part of the query analyzer and certain query patterns 
pass through Solr, every request handler thread ends up holding a large number 
of SpanNearQuery objects, and together they consume all available memory.

My query analyzer looks like this:
<analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
    <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="0" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.ShingleFilterFactory" tokenSeparator="" maxShingleSize="3" />
</analyzer>

In my tests with various queries, the number of terms passed to ShingleFilter 
directly affects Solr's memory usage. When ShingleFilter receives 10-15 terms 
as input, processing the request takes a lot of memory, and concurrent requests 
multiply the problem.
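
To give a rough sense of why 10-15 terms might be enough to cause trouble, here 
is a small back-of-the-envelope sketch in Python. It is not Solr code. It 
assumes the ShingleFilterFactory defaults (unigrams are output, shingles start 
at size 2) and, as an unconfirmed guess about the cause, that the query builder 
creates work for every distinct path through the resulting token graph. The 
function names are made up for illustration.

```python
def count_shingle_tokens(num_terms: int, max_shingle_size: int = 3) -> int:
    """Tokens emitted when unigrams and all shingles up to max_shingle_size are output."""
    return sum(
        max(num_terms - size + 1, 0)
        for size in range(1, max_shingle_size + 1)
    )


def count_graph_paths(num_terms: int, max_shingle_size: int = 3) -> int:
    """Distinct ways to cover num_terms positions with tokens spanning 1..max_shingle_size."""
    # paths[p] = number of ways to cover the first p positions
    paths = [1] + [0] * num_terms
    for pos in range(1, num_terms + 1):
        paths[pos] = sum(
            paths[pos - size]
            for size in range(1, min(max_shingle_size, pos) + 1)
        )
    return paths[num_terms]


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, count_shingle_tokens(n), count_graph_paths(n))
```

Under these assumptions the token count grows linearly (42 tokens for 15 terms), 
while the number of paths grows exponentially (5768 paths for 15 terms). 
WordDelimiterGraphFilter may split terms further, so the real numbers could be 
higher.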

I'm not sure how to handle this problem. Could we put an upper limit on the 
number of terms produced by ShingleFilter, or is there something we should 
optimize?

Thank you.

This message was sent by Atlassian JIRA
