[ https://issues.apache.org/jira/browse/SOLR-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814727#comment-16814727 ]
Michael Gibney commented on SOLR-13336:
---------------------------------------

+1, the patch looks good to me; thanks!

> maxBooleanClauses ignored; can result in exponential expansion of naive queries
> -------------------------------------------------------------------------------
>
> Key: SOLR-13336
> URL: https://issues.apache.org/jira/browse/SOLR-13336
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: query parsers
> Affects Versions: 7.0, 7.6, master (9.0)
> Reporter: Michael Gibney
> Assignee: Hoss Man
> Priority: Major
> Attachments: SOLR-13336.patch, SOLR-13336.patch
>
>
> Since SOLR-10921 it appears that Solr always sets
> {{BooleanQuery.maxClauseCount}} (at the Lucene level) to
> {{Integer.MAX_VALUE-1}}. I assume this is because Solr parses
> {{maxBooleanClauses}} out of the config and applies it externally.
> In any case, when used as part of
> {{lucene.util.QueryBuilder.analyzeGraphPhrase}} (and possibly other places?),
> the Lucene code checks internally against only the static {{maxClauseCount}}
> variable (permanently set to {{Integer.MAX_VALUE-1}} in the context of Solr).
> Thus in at least one case ({{analyzeGraphPhrase()}}, but possibly others?),
> {{maxBooleanClauses}} is having no effect. I'm pretty sure this is what's
> underlying the [issue reported here as being related to Solr
> 7.6|https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201902.mbox/%3CCAF%3DheHE6-MOtn2XRbEg7%3D1tpNEGtE8GaChnOhFLPeJzpF18SGA%40mail.gmail.com%3E].
> To summarize, users are definitely susceptible (to varying degrees of likely
> severity, assuming no actual _malicious_ attack) if:
> # Running Solr >= 7.6.0
> # Using edismax with "ps" param set to >0
> # Query-time analysis chain is _at all_ capable of producing graphs (e.g.,
> WordDelimiterGraphFilter, SynonymGraphFilter that has corresponding synonyms
> with varying token lengths).
> Users are _particularly_ vulnerable in practice if they have query-time
> {{WordDelimiterGraphFilter}} configured with {{preserveOriginal=true}}.
> To clarify, Lucene/Solr 7.6 didn't exactly _introduce_ the issue; it only
> increased the likelihood of problems manifesting (as a result of
> LUCENE-8531). Notably, the "enumerated strings" approach to graph phrase
> query (reintroduced by LUCENE-8531) was previously in place pre-6.5 – at
> which point it could rely on default Lucene-level {{maxClauseCount}} failsafe
> (removed as of 7.0). This explains the odd "Affects versions" =>
> maxBooleanClauses was disabled at the Lucene level (in Solr contexts)
> starting with version 7.0, but the change became more likely to manifest
> problems for users as of 7.6.
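
For anyone reading along, the growth described in the issue can be illustrated with a toy model. This is only a minimal sketch in Python, not Lucene or Solr code: {{enumerate_phrases}}, {{count_phrases}}, {{TooManyClauses}} and {{MAX_CLAUSES}} are hypothetical names, and the assumption that every query term analyzes to exactly two alternative forms is made up purely to show how fast an "enumerated strings" expansion grows when no clause limit is enforced.

```python
from itertools import product

# Hypothetical limit, standing in for a configured maxBooleanClauses value.
MAX_CLAUSES = 1024


class TooManyClauses(Exception):
    """Raised by this sketch when the enumeration exceeds the limit."""


def count_phrases(positions):
    """Return the number of distinct paths through the toy token graph."""
    total = 1
    for alternatives in positions:
        total *= len(alternatives)
    return total


def enumerate_phrases(positions, max_clauses=MAX_CLAUSES):
    """Enumerate every path, failing fast once the limit would be exceeded."""
    phrases = []
    for path in product(*positions):
        if len(phrases) >= max_clauses:
            raise TooManyClauses(f"more than {max_clauses} phrases")
        phrases.append(" ".join(path))
    return phrases


# Assumed toy input: each term has two analyzed forms (an original token and
# a split form). Real analysis chains will produce different graphs.
term = ["wi-fi", "wi fi"]
small = [term] * 3
large = [term] * 20

print(count_phrases(small))           # 2**3 paths
print(count_phrases(large))           # 2**20 paths
print(len(enumerate_phrases(small)))  # small input stays under the limit

try:
    enumerate_phrases(large)
except TooManyClauses as exc:
    print("rejected:", exc)
```

With the limit in place, the 20-term input is rejected after 1024 phrases instead of materializing all 2**20. Setting {{max_clauses}} to a very large number in this sketch mimics the situation the issue describes, where the check effectively never fires.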