gortiz commented on code in PR #8766:
URL: https://github.com/apache/pinot/pull/8766#discussion_r882409290
##########
pinot-core/src/main/java/org/apache/pinot/core/query/pruner/ColumnValueSegmentPruner.java:
##########
@@ -91,23 +95,28 @@ public List<IndexSegment> prune(List<IndexSegment> segments, QueryContext query)
// Extract EQ/IN/RANGE predicate columns
Set<String> eqInColumns = new HashSet<>();
Set<String> rangeColumns = new HashSet<>();
- extractPredicateColumns(filter, eqInColumns, rangeColumns);
+ // As Predicates are recursive structures, their hashCode is quite expensive.
+ // By using an IdentityHashMap here we don't need to iterate over the recursive
+ // structure. This is specially useful in the IN expression.
+ Map<Predicate, Object> cachedValues = new IdentityHashMap<>();
+ extractPredicateColumns(filter, eqInColumns, rangeColumns, cachedValues);
if (eqInColumns.isEmpty() && rangeColumns.isEmpty()) {
return segments;
}
int numSegments = segments.size();
List<IndexSegment> selectedSegments = new ArrayList<>(numSegments);
+
if (!eqInColumns.isEmpty() && query.isEnablePrefetch()) {
Map[] dataSourceCaches = new Map[numSegments];
FetchContext[] fetchContexts = new FetchContext[numSegments];
try {
// Prefetch bloom filter for columns within the EQ/IN predicate if exists
for (int i = 0; i < numSegments; i++) {
IndexSegment segment = segments.get(i);
- Map<String, DataSource> dataSourceCache = new HashMap<>();
- Map<String, List<ColumnIndexType>> columnToIndexList = new HashMap<>();
+ Map<String, DataSource> dataSourceCache = new HashMap<>(eqInColumns.size());
Review Comment:
The reason I added the size is that I saw some resizes in the flamegraph.
Anyway, the new implementation doesn't use `dataSourceCache` when
immutable segments are used, so the effect isn't going to be as noticeable as
before.
IMHO, if the map is very small, it doesn't really matter if there are many
collisions, as a linear probe will be fast. In fact, if the map is very small,
an ArrayList that filters by hash code before calling equals would probably be
faster than a map.
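
To make the last idea concrete, here is a minimal sketch of a list-backed
lookup that compares a cached hash code before calling equals. The
`SmallListMap` class and its methods are hypothetical names for illustration;
nothing like this exists in the PR or in Pinot, and I haven't benchmarked it
against `HashMap`.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical tiny key/value store for a handful of entries.
 * Lookups scan the list linearly and compare the cached hash code first,
 * so equals() is only called on entries whose hash matches.
 * Null keys are not supported in this sketch.
 */
final class SmallListMap<K, V> {
  private static final class Entry<K, V> {
    final int hash;
    final K key;
    V value;

    Entry(int hash, K key, V value) {
      this.hash = hash;
      this.key = key;
      this.value = value;
    }
  }

  private final List<Entry<K, V>> entries;

  SmallListMap(int expectedSize) {
    // ArrayList capacity is an exact element count, so no resize
    // happens until more than expectedSize entries are added.
    entries = new ArrayList<>(expectedSize);
  }

  /** Returns the value mapped to key, or null if there is none. */
  V get(K key) {
    int hash = key.hashCode();
    for (int i = 0; i < entries.size(); i++) {
      Entry<K, V> e = entries.get(i);
      // Cheap int comparison first, equals() only on a hash match.
      if (e.hash == hash && e.key.equals(key)) {
        return e.value;
      }
    }
    return null;
  }

  /** Inserts or overwrites a mapping; returns the previous value or null. */
  V put(K key, V value) {
    int hash = key.hashCode();
    for (int i = 0; i < entries.size(); i++) {
      Entry<K, V> e = entries.get(i);
      if (e.hash == hash && e.key.equals(key)) {
        V old = e.value;
        e.value = value;
        return old;
      }
    }
    entries.add(new Entry<>(hash, key, value));
    return null;
  }

  int size() {
    return entries.size();
  }
}
```

This only makes sense for a handful of keys, since every lookup is O(n); for
column names the key's hash code is also cheap to compute, so the gain over a
presized `HashMap` may well be negligible.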
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]