[ 
https://issues.apache.org/jira/browse/SOLR-13336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16818267#comment-16818267
 ] 

ASF subversion and git services commented on SOLR-13336:
--------------------------------------------------------

Commit 59a3c45d9cc1a338c3dffbe5e7bd996a8e0dd37a in lucene-solr's branch 
refs/heads/branch_8x from Chris M. Hostetter
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=59a3c45 ]

SOLR-13336: add maxBooleanClauses (default to 1024) setting to solr.xml, 
reverting previous effective value of Integer.MAX_VALUE-1, to restrict risk of 
pathological query expansion.

(cherry picked from commit d90034f0d61cd1525e10d07cf064a8647dc08cc9)
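
A minimal sketch of how the two limits relate, based on the relationship stated in the issue description quoted below (the solr.xml value is a hard upper bound; the solrconfig.xml value still limits user-specified queries). The class and method names are hypothetical and exist only for illustration; this is not Solr's actual implementation.

```java
// Hypothetical model of the two limits; not Solr's actual code.
final class ClauseLimitSketch {
    private final int globalMax; // solr.xml maxBooleanClauses (hard upper bound)
    private final int coreMax;   // solrconfig.xml maxBooleanClauses

    ClauseLimitSketch(int globalMax, int coreMax) {
        this.globalMax = globalMax;
        this.coreMax = coreMax;
    }

    // The solr.xml value caps whatever solrconfig.xml asks for.
    int effectiveCoreMax() {
        return Math.min(globalMax, coreMax);
    }

    // Clauses a user explicitly writes in a query string.
    boolean allowsUserQuery(int clauses) {
        return clauses <= effectiveCoreMax();
    }

    // Clauses produced by query rewriting or programmatic construction.
    boolean allowsRewrittenQuery(int clauses) {
        return clauses <= globalMax;
    }

    public static void main(String[] args) {
        // Assumed values: solr.xml at its default of 1024, solrconfig.xml asking for 4096.
        ClauseLimitSketch limits = new ClauseLimitSketch(1024, 4096);
        System.out.println(limits.effectiveCoreMax());
        System.out.println(limits.allowsUserQuery(2000));
        System.out.println(limits.allowsRewrittenQuery(1024));
    }
}
```

In this model a solrconfig.xml value larger than the solr.xml value has no extra effect, which is one way to read "hard upper bound" in the description.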


> solrconfig.xml maxBooleanClauses ignored by programmatic/rewritten queries; can 
> result in exponential expansion of naive queries
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13336
>                 URL: https://issues.apache.org/jira/browse/SOLR-13336
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 7.0, 8.0
>            Reporter: Michael Gibney
>            Assignee: Hoss Man
>            Priority: Major
>             Fix For: 8.1
>
>         Attachments: SOLR-13336.patch, SOLR-13336.patch, SOLR-13336.patch
>
>
> Changes made in Solr 7.0 set the effective value of 
> {{BooleanQuery.getMaxClauseCount}} to {{Integer.MAX_VALUE-1}} and only 
> imposed a restriction based on the (existing) solrconfig.xml setting at the 
> Solr query parser level via a new utility helper method.
> But this means programmatically generated queries (either by low level Lucene 
> methods, or by query re-writing) no longer had any safety valve to prevent 
> (effectively) infinite expansion.  This issue fixes this problem by:
> * restoring a default upper bound on {{BooleanQuery.getMaxClauseCount}} of 1024
> * introducing a new solr.xml level setting for configuring this upper 
> bound:{noformat}
> <int name="maxBooleanClauses">${solr.max.booleanClauses:1024}</int>
> {noformat}
> *NOTE* that this solr.xml limit is a hard upper bound that supersedes the 
> existing solrconfig.xml setting, which has been left in place and still 
> limits the size of user specified boolean queries.  i.e.: solr.xml 
> maxBooleanClauses >= solrconfig.xml maxBooleanClauses >= number of clauses a 
> user explicitly specifies in a query string; solr.xml maxBooleanClauses >= 
> number of clauses in an expanded/rewritten query
> {panel:title=original bug report}
> Since SOLR-10921 it appears that Solr always sets 
> {{BooleanQuery.maxClauseCount}} (at the Lucene level) to 
> {{Integer.MAX_VALUE-1}}. I assume this is because Solr parses 
> {{maxBooleanClauses}} out of the config and applies it externally.
> In any case, when used as part of 
> {{lucene.util.QueryBuilder.analyzeGraphPhrase}} (and possibly other places?), 
> the Lucene code checks internally against only the static {{maxClauseCount}} 
> variable (permanently set to {{Integer.MAX_VALUE-1}} in the context of Solr).
> Thus in at least one case ({{analyzeGraphPhrase()}}, but possibly others?), 
> {{maxBooleanClauses}} is having no effect. I'm pretty sure this is what's 
> underlying the [issue reported here as being related to Solr 
> 7.6|https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201902.mbox/%3CCAF%3DheHE6-MOtn2XRbEg7%3D1tpNEGtE8GaChnOhFLPeJzpF18SGA%40mail.gmail.com%3E].
> To summarize, users are definitely susceptible (to varying degrees of likely 
> severity, assuming no actual _malicious_ attack) if:
>  # Running Solr >= 7.6.0
>  # Using edismax with "ps" param set to >0
>  # Query-time analysis chain is _at all_ capable of producing graphs (e.g., 
> WordDelimiterGraphFilter, SynonymGraphFilter that has corresponding synonyms 
> with varying token lengths).
> Users are _particularly_ vulnerable in practice if they have query-time 
> {{WordDelimiterGraphFilter}} configured with {{preserveOriginal=true}}.
> To clarify, Lucene/Solr 7.6 didn't exactly _introduce_ the issue; it only 
> increased the likelihood of problems manifesting (as a result of 
> LUCENE-8531). Notably, the "enumerated strings" approach to graph phrase 
> query (reintroduced by LUCENE-8531) was previously in place pre-6.5 – at 
> which point it could rely on default Lucene-level {{maxClauseCount}} failsafe 
> (removed as of 7.0). This explains the odd "Affects versions" => 
> maxBooleanClauses was disabled at the Lucene level (in Solr contexts) 
> starting with version 7.0, but the change became more likely to manifest 
> problems for users as of 7.6.
> {panel}
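
To illustrate why the "enumerated strings" approach mentioned in the original bug report can blow up, here is a rough sketch under a simplifying assumption: each token position in the analyzed phrase independently offers a fixed number of alternatives, so the number of enumerated phrases is the product of those counts. Real token graphs are not necessarily this regular, and the class and method names are hypothetical; nothing here is Lucene's actual code.

```java
import java.util.Arrays;

// Hypothetical illustration of path enumeration growth; not Lucene's implementation.
final class GraphPhraseExpansionSketch {

    // Number of distinct paths when position i offers alternativesPerPosition[i] choices.
    // Saturates at Long.MAX_VALUE instead of overflowing.
    static long countPaths(int[] alternativesPerPosition) {
        long paths = 1;
        for (int alternatives : alternativesPerPosition) {
            try {
                paths = Math.multiplyExact(paths, (long) alternatives);
            } catch (ArithmeticException overflow) {
                return Long.MAX_VALUE;
            }
        }
        return paths;
    }

    // True when enumerating every path would need more clauses than the limit allows.
    static boolean exceedsLimit(int[] alternativesPerPosition, int maxClauseCount) {
        return countPaths(alternativesPerPosition) > maxClauseCount;
    }

    public static void main(String[] args) {
        // Assumed input: 11 positions, each with 2 alternatives
        // (for example an original token plus one generated variant).
        int[] query = new int[11];
        Arrays.fill(query, 2);

        System.out.println(countPaths(query));
        System.out.println(exceedsLimit(query, 1024));
        System.out.println(exceedsLimit(query, Integer.MAX_VALUE - 1));
    }
}
```

Under this model an 11-position phrase with two alternatives per position already yields 2048 paths: above the restored default of 1024, but far below {{Integer.MAX_VALUE-1}}, so the pre-fix limit would never have stopped it.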



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
