: I'm happy to provide some details as I still do not really understand the
: difference to the situation before.
The main difference is coming from the changes introduced in LUCENE-8811
(Lucene 9.0) which sought to ensure that the "global" maxClauseCount would
be honored no matter what kind of nested structure the query might
involve.
Your situation is an interesting case that I had never considered, more
details below...
: * I upgraded from 8.11.1 to 9.1. I observed the behavior for a completely
: rebuild index (solr version 9.1 / lucene version 9.3)
Thank you for clarifying. This confirms that changes introduced
in LUCENE-8811 (and related Solr issues) are relevant to the change in
behavior you are seeing (if you had said you upgraded from Solr 9 we'd be
having a different conversation).
: * maxBooleanClauses is only configured in solrconfig.xml (1024) but not in
: solr.xml.
FYI: If you don't configure it in solr.xml, then the (Lucene) default
IndexSearcher.getMaxClauseCount() is left as is (and that is also 1024).
: * Sorry for the confusion about the field definition. As you already
: assumed correctly: 'categoryId' is also a 'p_long_dv'
Meaning that it has both points and docvalues configured, which it turns
out is significant to why it behaves differently from a string field.
: * Stacktrace for String field ("id"). For better readability I replaced the
: original query by "1 2 ... 1025":
Snipping down to the key lines of code from the root cause...
: Caused by: org.apache.lucene.search.IndexSearcher$TooManyClauses: maxClauseCount is set to 1024
:   at org.apache.lucene.search.BooleanQuery$Builder.add(BooleanQuery.java:116)
:   at org.apache.lucene.search.BooleanQuery$Builder.add(BooleanQuery.java:130)
:   at org.apache.solr.parser.SolrQueryParserBase.rawToNormal(SolrQueryParserBase.java:1065)
...so in this case, as the query parser is building up a boolean query (of
many strings), it is hitting the limit because the (top level) boolean
query is being asked to add one more item than
IndexSearcher.getMaxClauseCount() == 1024 allows.
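To make the parse-time failure concrete, here is a minimal sketch in plain
Java. It is a toy model, not Lucene code: ToyBooleanBuilder, canBuild and
the use of IllegalStateException are all hypothetical stand-ins, and the
only things taken from the discussion above are the limit of 1024 and the
fact that the builder rejects the clause that would exceed it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of a builder that refuses to exceed a clause limit. */
public class BuilderLimitSketch {
    static final int MAX_CLAUSE_COUNT = 1024; // same value as the default discussed above

    /** Hypothetical stand-in for BooleanQuery.Builder. */
    static class ToyBooleanBuilder {
        private final List<String> clauses = new ArrayList<>();

        ToyBooleanBuilder add(String clause) {
            // Reject the clause that would push the builder past the limit.
            if (clauses.size() >= MAX_CLAUSE_COUNT) {
                throw new IllegalStateException("maxClauseCount is set to " + MAX_CLAUSE_COUNT);
            }
            clauses.add(clause);
            return this;
        }
    }

    /** Returns true if a query with n term clauses on a string field can be built. */
    static boolean canBuild(int n) {
        ToyBooleanBuilder builder = new ToyBooleanBuilder();
        try {
            for (int i = 1; i <= n; i++) {
                builder.add("id:" + i);
            }
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("1024 terms: " + canBuild(1024));
        System.out.println("1025 terms: " + canBuild(1025));
    }
}
```

In this model the failure happens while the query is still being
assembled, which mirrors the rawToNormal -> Builder.add frames in the
stack trace above.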
: * Stacktrace for Point field ("categoryId") with 1 2 ... 513:
Again, snipping down to just the key lines of code. (Note also the
difference in the exception message: "too many nested clauses") ...
: org.apache.lucene.search.IndexSearcher$TooManyNestedClauses: Query contains too many nested clauses; maxClauseCount is set to 1024
:   at org.apache.lucene.search.IndexSearcher$3.visitLeaf(IndexSearcher.java:801)
:   at org.apache.lucene.document.SortedNumericDocValuesRangeQuery.visit(SortedNumericDocValuesRangeQuery.java:73)
:   at org.apache.lucene.search.IndexOrDocValuesQuery.visit(IndexOrDocValuesQuery.java:121)
:   at org.apache.lucene.search.BooleanQuery.visit(BooleanQuery.java:575)
:   at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:769)
...here the exception is happening during the actual search -- meaning the
query parser had no problem building up the BooleanQuery of 513 clauses.
But what matters is that each of those 513 clauses is no longer a simple
exact term query (or a simple exact point query, or a simple exact
docvalue query) ... because this fieldType is configured to support both
points and docvalues, those 513 clauses are IndexOrDocValuesQuery queries
-- which each contain 2 sub-clauses, for 1026 clauses in total.
(The purpose of this class is to provide the most efficient impl based on
where/how this clause is used, which can depend on term stats, other
clauses in the parent query, etc...)
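The counting argument can be sketched in plain Java. This is a toy model
under stated assumptions, not the Lucene implementation: ToyQuery, Leaf,
Composite, leavesFor and exceedsLimit are hypothetical names, the check is
simplified to "total leaves > 1024", and the exact boundary behavior inside
Lucene may differ. The only facts taken from above are the 1024 limit and
that each points+docvalues clause wraps 2 sub-clauses.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of counting leaf clauses across a nested query. */
public class NestedClauseSketch {
    static final int MAX_CLAUSE_COUNT = 1024; // limit from the exception message

    /** Hypothetical stand-in for a Lucene Query. */
    interface ToyQuery {
        int countLeaves();
    }

    /** A single term / point / docvalues query: counts as one clause. */
    static class Leaf implements ToyQuery {
        @Override
        public int countLeaves() {
            return 1;
        }
    }

    /** A query wrapping other queries: counts as the sum of its children. */
    static class Composite implements ToyQuery {
        private final List<ToyQuery> children = new ArrayList<>();

        Composite add(ToyQuery child) {
            children.add(child);
            return this;
        }

        @Override
        public int countLeaves() {
            int total = 0;
            for (ToyQuery child : children) {
                total += child.countLeaves();
            }
            return total;
        }
    }

    /** Builds the top-level query for the given number of user values and returns its leaf count. */
    static int leavesFor(int userValues, boolean pointsAndDocValues) {
        Composite top = new Composite();
        for (int i = 0; i < userValues; i++) {
            if (pointsAndDocValues) {
                // Models IndexOrDocValuesQuery: one points leaf plus one docvalues leaf.
                top.add(new Composite().add(new Leaf()).add(new Leaf()));
            } else {
                top.add(new Leaf());
            }
        }
        return top.countLeaves();
    }

    /** Simplified safety-valve check over the whole query tree. */
    static boolean exceedsLimit(int leaves) {
        return leaves > MAX_CLAUSE_COUNT;
    }

    public static void main(String[] args) {
        int stringField = leavesFor(513, false);
        int pointField = leavesFor(513, true);
        System.out.println("string field, 513 values: " + stringField
                + " leaves, over limit: " + exceedsLimit(stringField));
        System.out.println("points+docvalues field, 513 values: " + pointField
                + " leaves, over limit: " + exceedsLimit(pointField));
    }
}
```

The same 513 user values stay under the limit as plain leaves but double
to 1026 once each one is wrapped in a two-leaf composite, which is the
difference between the "id" and "categoryId" cases.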
So to summarize:
1) the reason you're seeing this behavior in 9x but didn't in 8x is
because 9x added more checks of the safety valve
2) the reason you're seeing the 1024 limit hit for some (but not all)
fields, even with fewer than 1024 "original user query clauses", is
because for some (but not all) field types, 1 original query clause can
become N internal clauses.
-Hoss
http://www.lucidworks.com/