clintropolis commented on a change in pull request #12315:
URL: https://github.com/apache/druid/pull/12315#discussion_r822440348
##########
File path:
processing/src/main/java/org/apache/druid/segment/column/BitmapIndex.java
##########
@@ -23,28 +23,85 @@
import org.apache.druid.collections.bitmap.ImmutableBitmap;
import javax.annotation.Nullable;
+import java.util.Set;
+import java.util.function.Predicate;
/**
+ * Provides a mechanism to get {@link ImmutableBitmap} for a value, or {@link
Iterable} of {@link ImmutableBitmap}
+ * for a range or exact set of values, the set bits of which correspond to
which rows contain the matching values.
+ *
+ * Used to power {@link org.apache.druid.segment.BitmapOffset} and
+ * {@link org.apache.druid.segment.vector.BitmapVectorOffset} which are used
with column cursors for fast filtering
+ * of indexed values.
+ *
+ * The column must be ingested with a bitmap index for filters to use them and
to participate in this "pre-filtering"
+ * step when scanning segments
+ *
+ * @see org.apache.druid.segment.QueryableIndexStorageAdapter#analyzeFilter
*/
public interface BitmapIndex
{
+ /**
+ * Get the cardinality of the underlying value dictionary
+ */
int getCardinality();
+ /**
+ * Get the value in the underlying value dictionary of the specified
dictionary id
+ */
@Nullable
String getValue(int index);
+ /**
+ * Returns true if the underlying value dictionary has nulls
+ */
boolean hasNulls();
BitmapFactory getBitmapFactory();
/**
- * Returns the index of "value" in this BitmapIndex, or (-(insertion point)
- 1) if the value is not
- * present, in the manner of Arrays.binarySearch.
- *
- * @param value value to search for
- * @return index of value, or negative number equal to (-(insertion point) -
1).
+ * Returns the index of "value" in this BitmapIndex, or a negative value, if
not present in the underlying dictionary
*/
int getIndex(@Nullable String value);
+ /**
+ * Get the {@link ImmutableBitmap} for dictionary id of the underlying
dictionary
+ */
ImmutableBitmap getBitmap(int idx);
+
+ /**
+ * Get the {@link ImmutableBitmap} corresponding to the supplied value
+ */
+ ImmutableBitmap getBitmapForValue(@Nullable String value);
+
+ /**
+ * Get an {@link Iterable} of {@link ImmutableBitmap} corresponding to the
values supplied in the specified range
+ */
+ default Iterable<ImmutableBitmap> getBitmapsInRange(
Review comment:
I considered this, but it was a bigger pattern change than I was hoping
to deal with right now, so I'd like to save this for a follow-up.
The other thing that I'm unsure of is that currently those bitmap union
operations aren't really subject to query timeout or any other sort of
limiting, so it might be worth thinking on how we might add some limits to
prevent a very large number of bitmap operations from running until completion.
The answer may still be to push that enforcement into `BitmapIndex`, just need
to think a bit more about it first.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]