chenboat commented on a change in pull request #5013: For RANGE predicate
queries touching offline segments, use sorted inverted index if the column is
sorted
URL: https://github.com/apache/incubator-pinot/pull/5013#discussion_r370878722
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/FilterOperatorUtils.java
##########
@@ -56,16 +57,23 @@ public static BaseFilterOperator
getLeafFilterOperator(PredicateEvaluator predic
// Use inverted index if the predicate type is not RANGE or REGEXP_LIKE
for efficiency
DataSourceMetadata dataSourceMetadata = dataSource.getDataSourceMetadata();
Predicate.Type predicateType = predicateEvaluator.getPredicateType();
- if (dataSourceMetadata.hasInvertedIndex() && (predicateType !=
Predicate.Type.RANGE) && (predicateType
- != Predicate.Type.REGEXP_LIKE)) {
- if (dataSourceMetadata.isSorted()) {
+ if (dataSourceMetadata.hasInvertedIndex() && (predicateType !=
Predicate.Type.REGEXP_LIKE)) {
+ if (shouldUseSortedInvertedIndexOperator(dataSource, predicateType)) {
return new SortedInvertedIndexBasedFilterOperator(predicateEvaluator,
dataSource, startDocId, endDocId);
- } else {
+ } else if (predicateType != Predicate.Type.RANGE) {
+ // TODO: add support so the bitmap inverted index operator can be used
for RANGE predicates
return new BitmapBasedFilterOperator(predicateEvaluator, dataSource,
startDocId, endDocId);
}
- } else {
- return new ScanBasedFilterOperator(predicateEvaluator, dataSource,
startDocId, endDocId);
}
+ return new ScanBasedFilterOperator(predicateEvaluator, dataSource,
startDocId, endDocId);
+ }
+
+ private static boolean shouldUseSortedInvertedIndexOperator(DataSource
dataSource, Predicate.Type predicateType) {
+ boolean isSorted = dataSource.getDataSourceMetadata().isSorted();
+ // we can use the sorted inverted index if:
+ // 1. column is sorted AND
+ // 2. predicate is not RANGE OR predicate is RANGE but this physical plan
is being built for offline segment
Review comment:
The condition here seems inconsistent with the PR summary/title. For
example, take the first part of the OR in (2). It says the sorted inverted index
can be used if (1) the column is sorted AND (2) the predicate is not RANGE. That
is not what the PR title says. Am I missing anything here?
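
To make the condition under discussion concrete, here is a minimal sketch of the rule as the code comment states it. It is not the PR's actual implementation: the method body is cut off in the diff above, and the class name, the nested enum, and the `isOfflineSegment` flag are all hypothetical stand-ins rather than Pinot APIs.

```java
// Sketch only: models the rule from the code comment, not Pinot's real code.
final class SortedIndexConditionSketch {

  // Hypothetical stand-in for Predicate.Type. REGEXP_LIKE is left out because
  // the caller in the diff excludes it before this check is reached.
  enum PredicateType { EQ, RANGE }

  // Assumed rule: the column is sorted AND
  // (the predicate is not RANGE OR the predicate is RANGE on an offline segment).
  static boolean shouldUseSortedIndex(boolean isSorted, PredicateType type, boolean isOfflineSegment) {
    boolean notRange = type != PredicateType.RANGE;
    boolean rangeOnOffline = type == PredicateType.RANGE && isOfflineSegment;
    return isSorted && (notRange || rangeOnOffline);
  }

  public static void main(String[] args) {
    // Print the full truth table so each branch of the OR is visible.
    boolean[] flags = {true, false};
    for (boolean sorted : flags) {
      for (PredicateType type : PredicateType.values()) {
        for (boolean offline : flags) {
          System.out.printf("sorted=%b type=%s offline=%b -> %b%n",
              sorted, type, offline, shouldUseSortedIndex(sorted, type, offline));
        }
      }
    }
  }
}
```

One possible reading: the first branch of the OR matches the removed lines of the diff, where a sorted column with a non-RANGE predicate already selected `SortedInvertedIndexBasedFilterOperator`, and the second branch is the new RANGE case that the PR title describes.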