benwtrent commented on code in PR #15124:
URL: https://github.com/apache/lucene/pull/15124#discussion_r2369738662
##########
lucene/core/src/java/org/apache/lucene/search/BooleanQuery.java:
##########
@@ -137,9 +141,36 @@ private BooleanQuery(int minimumNumberShouldMatch, BooleanClause[] clauses) {
// but not for FILTER and MUST_NOT
clauseSets.put(Occur.FILTER, new HashSet<>());
clauseSets.put(Occur.MUST_NOT, new HashSet<>());
+ // We store the queries per clauses in a list beforehand. As otherwise during repeated visit()
+ // calls(like in QueryCache), we need to iterate over clauseSets.keySet() which allocates
+ // HashMap$HashSetIterator.<init>
+ // for every call causing some performance impact
+
+ List<Query> mustClauseQueries = new ArrayList<>();
+ List<Query> mustNotClauseQueries = new ArrayList<>();
+ List<Query> filterClauseQueries = new ArrayList<>();
+ List<Query> shouldClauseQueries = new ArrayList<>();
Review Comment:
I really don't think we should add this much complexity to `BooleanQuery`. My
concern is that it isn't worth it: it adds significant overhead to boolean
queries, and we shouldn't do it.
With the query size cached and all your other improvements, is the worst case
now within 2-3x slower?
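For readers following the thread, the trade-off can be sketched outside Lucene. The class below is a hypothetical stand-in, not the PR's code: `PerOccurListsSketch`, `Clause`, and the `String` queries are all assumptions made for illustration. It groups clauses by occurrence type once, in the constructor, so that repeated traversals reuse the same lists. The cost is that every constructed query pays for the extra lists, even if it is never visited.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class PerOccurListsSketch {
  // Hypothetical stand-in for Lucene's BooleanClause.Occur.
  public enum Occur { MUST, FILTER, SHOULD, MUST_NOT }

  // Hypothetical stand-in for a clause; a real clause wraps a Query, not a String.
  public record Clause(Occur occur, String query) {}

  private final Map<Occur, List<String>> queriesByOccur = new EnumMap<>(Occur.class);

  public PerOccurListsSketch(List<Clause> clauses) {
    // Paid once per constructed query, whether or not it is ever traversed.
    for (Occur occur : Occur.values()) {
      queriesByOccur.put(occur, new ArrayList<>());
    }
    for (Clause clause : clauses) {
      queriesByOccur.get(clause.occur()).add(clause.query());
    }
  }

  // Repeated traversals reuse the prebuilt list instead of regrouping the clauses.
  public List<String> queries(Occur occur) {
    return queriesByOccur.get(occur);
  }

  public static void main(String[] args) {
    PerOccurListsSketch q = new PerOccurListsSketch(List.of(
        new Clause(Occur.MUST, "title:lucene"),
        new Clause(Occur.FILTER, "year:2025")));
    System.out.println(q.queries(Occur.MUST));
  }
}
```

Whether the saved work on repeated traversal outweighs the per-query construction cost is exactly what the review comment asks the author to measure.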
##########
lucene/core/src/java/org/apache/lucene/search/LRUQueryCache.java:
##########
@@ -705,6 +710,17 @@ public long ramBytesUsed() {
}
}
+ // pkg-private for testing
+ static class Record {
Review Comment:
No, I mean this should be a Java `record`, not a class named `Record`.
https://docs.oracle.com/en/java/javase/17/language/records.html
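To illustrate the distinction, here is a minimal sketch contrasting a hand-written holder class with the equivalent `record` declaration. The components `key` and `bytes` are hypothetical; the fields of the PR's class are not shown in this thread. Records require Java 16 or later.

```java
import java.util.Objects;

public class RecordSketch {
  // Hand-written holder: constructor, accessors, equals and hashCode are all explicit.
  public static final class EntryClass {
    private final String key;
    private final long bytes;

    public EntryClass(String key, long bytes) {
      this.key = key;
      this.bytes = bytes;
    }

    public String key() { return key; }

    public long bytes() { return bytes; }

    @Override
    public boolean equals(Object o) {
      return o instanceof EntryClass other && key.equals(other.key) && bytes == other.bytes;
    }

    @Override
    public int hashCode() {
      return Objects.hash(key, bytes);
    }
  }

  // Record form: the compiler generates the constructor, accessors,
  // equals, hashCode and toString from the component list.
  public record Entry(String key, long bytes) {}

  public static void main(String[] args) {
    Entry a = new Entry("q", 42L);
    Entry b = new Entry("q", 42L);
    System.out.println(a.equals(b)); // value-based equality
    System.out.println(a.key() + " " + a.bytes());
  }
}
```

A nested record is implicitly static and final, so it can stay package-private for testing just as the class in the diff does.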
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]