Jackie-Jiang commented on a change in pull request #7435:
URL: https://github.com/apache/pinot/pull/7435#discussion_r709590849
##########
File path:
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/RangeIndexReaderImpl.java
##########
@@ -217,4 +272,26 @@ public void close() {
// NOTE: DO NOT close the PinotDataBuffer here because it is tracked by
the caller and might be reused later. The
// caller is responsible of closing the PinotDataBuffer.
}
+
+ private ImmutableRoaringBitmap getMatchesInRange(int firstRangeId, int
lastRangeId) {
+ if (firstRangeId == lastRangeId) {
+ return null;
+ }
+ MutableRoaringBitmap matching = new MutableRoaringBitmap();
+ for (int rangeId = firstRangeId + 1; rangeId < lastRangeId; ++rangeId) {
Review comment:
Won't work if `lastRangeId` is -1?
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/RangeIndexReader.java
##########
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.index.reader;
+
+import java.io.Closeable;
+
+/**
+ * Interface for indexed range queries
+ * @param <T>
+ */
+public interface RangeIndexReader<T> extends Closeable {
+ /**
+ * Returns doc ids with a value between min and max, both inclusive.
+ * @param min the inclusive lower bound.
+ * @param max the inclusive upper bound.
+ * @return the matching doc ids.
+ */
+ T getMatchingDocIds(long min, long max);
Review comment:
(nit) Re-order the methods following `int, long, float, double` to be
consistent with other places for readability
##########
File path:
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/RangeIndexReaderImpl.java
##########
@@ -217,4 +272,26 @@ public void close() {
// NOTE: DO NOT close the PinotDataBuffer here because it is tracked by
the caller and might be reused later. The
// caller is responsible of closing the PinotDataBuffer.
}
+
+ private ImmutableRoaringBitmap getMatchesInRange(int firstRangeId, int
lastRangeId) {
+ if (firstRangeId == lastRangeId) {
+ return null;
Review comment:
If both first and last is -1, it might be no match or full match. We
need to handle them separately
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/RangeIndexReader.java
##########
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.index.reader;
+
+import java.io.Closeable;
+
+/**
+ * Interface for indexed range queries
+ * @param <T>
+ */
+public interface RangeIndexReader<T> extends Closeable {
Review comment:
Does it make sense to fix the type to `ImmutableBitmapDataProvider`?
##########
File path:
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/RangeIndexReaderImpl.java
##########
@@ -217,4 +272,26 @@ public void close() {
// NOTE: DO NOT close the PinotDataBuffer here because it is tracked by
the caller and might be reused later. The
// caller is responsible of closing the PinotDataBuffer.
}
+
+ private ImmutableRoaringBitmap getMatchesInRange(int firstRangeId, int
lastRangeId) {
Review comment:
Good to add some comments on how the -1 is handled (list the different
scenarios)
##########
File path:
pinot-segment-spi/src/main/java/org/apache/pinot/segment/spi/index/reader/RangeIndexReader.java
##########
@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.segment.spi.index.reader;
+
+import java.io.Closeable;
+
+/**
+ * Interface for indexed range queries
+ * @param <T>
+ */
+public interface RangeIndexReader<T> extends Closeable {
+ /**
+ * Returns doc ids with a value between min and max, both inclusive.
+ * @param min the inclusive lower bound.
+ * @param max the inclusive upper bound.
+ * @return the matching doc ids.
+ */
+ T getMatchingDocIds(long min, long max);
Review comment:
Annotate the methods return value as `Nullable` and add javadoc on when
to return `null`
##########
File path:
pinot-segment-local/src/main/java/org/apache/pinot/segment/local/segment/index/readers/RangeIndexReaderImpl.java
##########
@@ -217,4 +272,26 @@ public void close() {
// NOTE: DO NOT close the PinotDataBuffer here because it is tracked by
the caller and might be reused later. The
// caller is responsible of closing the PinotDataBuffer.
}
+
+ private ImmutableRoaringBitmap getMatchesInRange(int firstRangeId, int
lastRangeId) {
+ if (firstRangeId == lastRangeId) {
+ return null;
+ }
+ MutableRoaringBitmap matching = new MutableRoaringBitmap();
+ for (int rangeId = firstRangeId + 1; rangeId < lastRangeId; ++rangeId) {
Review comment:
(nit) We usually put `rangeId++`
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/RangeIndexBasedFilterOperator.java
##########
@@ -51,74 +51,84 @@ public RangeIndexBasedFilterOperator(PredicateEvaluator
rangePredicateEvaluator,
@Override
protected FilterBlock getNextBlock() {
- RangeIndexReader rangeIndexReader = (RangeIndexReader)
_dataSource.getRangeIndex();
+ @SuppressWarnings("unchecked")
+ RangeIndexReader<ImmutableRoaringBitmap> rangeIndexReader =
+ (RangeIndexReader<ImmutableRoaringBitmap>) _dataSource.getRangeIndex();
assert rangeIndexReader != null;
+ ImmutableRoaringBitmap matches;
+ // if the implementation cannot match the entire query exactly, it will
+ // yield partial matches, which need to be verified by scanning. If it
+ // can answer the query exactly, this will be null.
+ ImmutableRoaringBitmap partialMatches;
int firstRangeId;
int lastRangeId;
if (_rangePredicateEvaluator instanceof
SortedDictionaryBasedRangePredicateEvaluator) {
- firstRangeId = rangeIndexReader
- .findRangeId(((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getStartDictId());
// NOTE: End dictionary id is exclusive in
OfflineDictionaryBasedRangePredicateEvaluator.
- lastRangeId = rangeIndexReader
- .findRangeId(((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getEndDictId() - 1);
+ matches = rangeIndexReader.getMatchingDocIds(
+ ((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getStartDictId(),
+ ((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getEndDictId() - 1);
+ partialMatches = rangeIndexReader.getPartiallyMatchingDocIds(
+ ((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getStartDictId(),
+ ((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getEndDictId() - 1);
} else {
switch (_rangePredicateEvaluator.getDataType()) {
case INT:
- firstRangeId = rangeIndexReader
- .findRangeId(((IntRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound());
- lastRangeId = rangeIndexReader
- .findRangeId(((IntRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ matches = rangeIndexReader.getMatchingDocIds(
+ ((IntRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((IntRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ partialMatches = rangeIndexReader.getPartiallyMatchingDocIds(
+ ((IntRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((IntRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
break;
case LONG:
- firstRangeId = rangeIndexReader
- .findRangeId(((LongRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound());
- lastRangeId = rangeIndexReader
- .findRangeId(((LongRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ matches = rangeIndexReader.getMatchingDocIds(
+ ((LongRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((LongRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ partialMatches = rangeIndexReader.getPartiallyMatchingDocIds(
+ ((LongRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((LongRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
break;
case FLOAT:
- firstRangeId = rangeIndexReader
- .findRangeId(((FloatRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound());
- lastRangeId = rangeIndexReader
- .findRangeId(((FloatRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ matches = rangeIndexReader.getMatchingDocIds(
+ ((FloatRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((FloatRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ partialMatches = rangeIndexReader.getPartiallyMatchingDocIds(
+ ((FloatRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((FloatRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
break;
case DOUBLE:
- firstRangeId = rangeIndexReader
- .findRangeId(((DoubleRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound());
- lastRangeId = rangeIndexReader
- .findRangeId(((DoubleRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ matches = rangeIndexReader.getMatchingDocIds(
+ ((DoubleRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((DoubleRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
+ partialMatches = rangeIndexReader.getPartiallyMatchingDocIds(
+ ((DoubleRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).geLowerBound(),
+ ((DoubleRawValueBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getUpperBound());
break;
default:
throw new IllegalStateException("String and Bytes data type not
supported for Range Indexing");
}
}
-
- // Need to scan the first and last range as they might be partially matched
- // TODO: Detect fully matched first and last range
- ImmutableRoaringBitmap docIdsToScan;
- if (firstRangeId == lastRangeId) {
- docIdsToScan = rangeIndexReader.getDocIds(firstRangeId);
+ // this branch is likely until RangeIndexReader reimplemented and enabled
by default
+ if (null != partialMatches) {
Review comment:
(nit) we usually put `(partialMatches != null)`, same for other places
##########
File path:
pinot-core/src/main/java/org/apache/pinot/core/operator/filter/RangeIndexBasedFilterOperator.java
##########
@@ -51,74 +51,84 @@ public RangeIndexBasedFilterOperator(PredicateEvaluator
rangePredicateEvaluator,
@Override
protected FilterBlock getNextBlock() {
- RangeIndexReader rangeIndexReader = (RangeIndexReader)
_dataSource.getRangeIndex();
+ @SuppressWarnings("unchecked")
+ RangeIndexReader<ImmutableRoaringBitmap> rangeIndexReader =
+ (RangeIndexReader<ImmutableRoaringBitmap>) _dataSource.getRangeIndex();
assert rangeIndexReader != null;
+ ImmutableRoaringBitmap matches;
+ // if the implementation cannot match the entire query exactly, it will
+ // yield partial matches, which need to be verified by scanning. If it
+ // can answer the query exactly, this will be null.
+ ImmutableRoaringBitmap partialMatches;
int firstRangeId;
int lastRangeId;
if (_rangePredicateEvaluator instanceof
SortedDictionaryBasedRangePredicateEvaluator) {
- firstRangeId = rangeIndexReader
- .findRangeId(((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getStartDictId());
// NOTE: End dictionary id is exclusive in
OfflineDictionaryBasedRangePredicateEvaluator.
- lastRangeId = rangeIndexReader
- .findRangeId(((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getEndDictId() - 1);
+ matches = rangeIndexReader.getMatchingDocIds(
+ ((SortedDictionaryBasedRangePredicateEvaluator)
_rangePredicateEvaluator).getStartDictId(),
Review comment:
(nit) Cache the `min` and `max` value
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]