Tanapol Nearunchorn created SOLR-12084:
------------------------------------------
             Summary: ShingleFilter cause threads consume all available memory
                 Key: SOLR-12084
                 URL: https://issues.apache.org/jira/browse/SOLR-12084
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: Schema and Analysis
    Affects Versions: 7.0.1, 6.5.1, 6.5
            Reporter: Tanapol Nearunchorn

When ShingleFilter is placed in the query analyzer and certain query patterns go through Solr, all of the handler threads end up holding a large number of SpanNearQuery objects and consume all available memory.

My query analyzer looks like this:

{code:xml}
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory" />
  <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
  <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="0" />
  <filter class="solr.LowerCaseFilterFactory" />
  <filter class="solr.ShingleFilterFactory" tokenSeparator="" maxShingleSize="3" />
</analyzer>
{code}

From my tests with different queries, the number of terms passed to ShingleFilter directly affects Solr's memory usage. When ShingleFilter receives 10-15 terms as input, processing a single request takes a lot of memory, and concurrent requests multiply the problem.

I am not sure how to handle this. Could we put an upper limit on the number of terms produced by ShingleFilter, or is there something we should optimize?

Thank you.
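
As an illustration of the "upper limit" idea, here is a minimal, untested sketch that caps how many terms reach ShingleFilter by placing solr.LimitTokenCountFilterFactory in front of it. The maxTokenCount value of 8 is an arbitrary assumption chosen only to stay below the 10-15 term range where the problem was observed. It has not been verified that this avoids the memory growth described above. It limits the input to ShingleFilter rather than the shingles it produces, and it silently drops any query terms beyond the limit.

{code:xml}
<analyzer type="query">
  <tokenizer class="solr.StandardTokenizerFactory" />
  <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
  <filter class="solr.WordDelimiterGraphFilterFactory" preserveOriginal="0" />
  <filter class="solr.LowerCaseFilterFactory" />
  <!-- Assumption: keep at most 8 tokens; later tokens are discarded before shingling -->
  <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="8" consumeAllTokens="false" />
  <filter class="solr.ShingleFilterFactory" tokenSeparator="" maxShingleSize="3" />
</analyzer>
{code}

Because long queries are truncated, this trades recall for bounded memory, so it would be a workaround rather than a fix for the underlying behavior.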