clintropolis commented on code in PR #13977:
URL: https://github.com/apache/druid/pull/13977#discussion_r1148825560
##########
processing/src/main/java/org/apache/druid/segment/column/ColumnConfig.java:
##########
@@ -22,4 +22,70 @@
public interface ColumnConfig
{
int columnCacheSizeBytes();
+
+ /**
+ * If the total number of rows in a column multiplied by this value is smaller than the total number of bitmap
+ * index operations required to perform to use a {@link LexicographicalRangeIndex} or {@link NumericRangeIndex},
+ * then for any {@link ColumnIndexSupplier} which chooses to participate in this config it will skip computing the
+ * index, in favor of doing a full scan and using a {@link org.apache.druid.query.filter.ValueMatcher} instead.
+ * This is indicated returning null from {@link ColumnIndexSupplier#as(Class)} even though it would have otherwise
+ * been able to create a {@link BitmapColumnIndex}. For range indexes on columns where every value has an index, the
Review Comment:
This was already true of `as` itself: returning `null` has always indicated
that the index we asked for is not available. I just made that a bit clearer
in the docs, and also leaned into allowing the things `as` returns to return
`null` themselves to indicate that an index is not available.
For the types of short circuits done in this PR, I feel like the index
supplier has better information to make the decision than anything above it,
since it has direct access to the size of the dictionary, the number of rows
to be scanned, etc., but I will think on it.
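To make the contract concrete, here is a rough, self-contained sketch of the
idea, not the code in this PR. `SketchIndexSupplier`, `SketchBitmapIndex`,
`forRange`, `skipScale`, and `plan` are all hypothetical stand-ins, and the
"number of bitmap operations" is assumed to be simply the count of dictionary
entries in the range. The supplier returns `null` when the row count times the
scale is smaller than the bitmap operations needed, and the caller treats
`null` as "no index, fall back to a full scan".

```java
// Hypothetical sketch only: none of these types are Druid classes.
class RangeIndexSkipSketch
{
  /** Stand-in for a bitmap-backed index that a supplier might hand out. */
  interface SketchBitmapIndex
  {
    int estimatedBitmapOperations();
  }

  /** Stand-in for an index supplier that knows the row count of its column. */
  static final class SketchIndexSupplier
  {
    private final long numRows;
    private final double skipScale; // assumed name for the config value

    SketchIndexSupplier(long numRows, double skipScale)
    {
      this.numRows = numRows;
      this.skipScale = skipScale;
    }

    /**
     * Returns null when numRows * skipScale is smaller than the number of
     * bitmap operations, signalling that no index is available for this range.
     */
    SketchBitmapIndex forRange(int startDictionaryId, int endDictionaryId)
    {
      // Assumption: one bitmap operation per dictionary entry in the range.
      final int bitmapOps = endDictionaryId - startDictionaryId;
      if (numRows * skipScale < bitmapOps) {
        return null;
      }
      return () -> bitmapOps;
    }
  }

  /** The caller only checks for null; it needs no knowledge of the heuristic. */
  static String plan(SketchIndexSupplier supplier, int startId, int endId)
  {
    final SketchBitmapIndex index = supplier.forRange(startId, endId);
    return index == null ? "full scan with value matcher" : "bitmap index";
  }

  public static void main(String[] args)
  {
    final SketchIndexSupplier supplier = new SketchIndexSupplier(1000, 0.5);
    System.out.println(plan(supplier, 0, 100)); // narrow range
    System.out.println(plan(supplier, 0, 900)); // wide range
  }
}
```

The point of the sketch is where the decision lives: the supplier holds the
row count and dictionary range, so the layers above it only need the existing
"`null` means not available" convention.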
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]