gianm commented on a change in pull request #10613:
URL: https://github.com/apache/druid/pull/10613#discussion_r582985342
##########
File path:
processing/src/main/java/org/apache/druid/segment/vector/VectorColumnSelectorFactory.java
##########
@@ -62,16 +66,29 @@ default int getMaxVectorSize()
SingleValueDimensionVectorSelector
makeSingleValueDimensionSelector(DimensionSpec dimensionSpec);
/**
- * Returns a string-typed, multi-value-per-row column selector. Should only
be called on columns where
- * {@link #getColumnCapabilities} indicates they return STRING. Unlike
{@link #makeSingleValueDimensionSelector},
- * this should not be called on nonexistent columns.
+ * Returns a dictionary encoded, string-typed, multi-value-per-row column
selector. Should only be called on columns
+ * where {@link #getColumnCapabilities} indicates they return STRING. Unlike
+ * {@link #makeSingleValueDimensionSelector}, this should not be called on
nonexistent columns.
*
* If you need to write code that adapts to different input types, you
should write a
* {@link org.apache.druid.segment.VectorColumnProcessorFactory} and use one
of the
* {@link org.apache.druid.segment.ColumnProcessors#makeVectorProcessor}
functions instead of using this method.
*/
MultiValueDimensionVectorSelector
makeMultiValueDimensionSelector(DimensionSpec dimensionSpec);
+ /**
+ * Returns a dimension object selector. Useful for non-dictionary encoded
strings, or when using the dictionary
+ * isn't useful. Right now, this should probably only be called on single
valued STRING inputs (multi-value STRING
+ * vector object selector is not yet implemented), but perhaps a world can
be imagined where this could work with
+ * COMPLEX too perhaps if it were useful for them to interact with {@link
DimensionSpec} in some manner? May be called
+ * for non-existent columns.
Review comment:
These javadocs should explicitly constrain the type of Object that may
show up in the returned VectorObjectSelector. Is it always going to be a String
or a null? Might it be `String[]`, if the underlying column is multi-value?
##########
File path:
processing/src/main/java/org/apache/druid/query/filter/DruidPredicateFactory.java
##########
@@ -32,4 +32,15 @@
DruidFloatPredicate makeFloatPredicate();
DruidDoublePredicate makeDoublePredicate();
+
+ /**
+ * Object predicate is currently only used by vectorized matchers for
non-string object selectors. To preserve
+ * behavior with non-vectorized matchers which use a string predicate with
null inputs for these 'nil' matchers,
+ * do the same thing here...
+ */
+ default Predicate<Object> makeObjectPredicate()
Review comment:
Based on
https://github.com/apache/druid/pull/10613#discussion_r578985027 it seems like
the intent of this method is that it will only be called for complex types (or
maybe nonexistent columns too). If that's the case, please add some javadoc
explaining that this method will only be called for complex types. It's
analogous to "makeObjectProcessor" in "VectorColumnProcessorFactory". (You
could even `@see` to that method in the javadocs if you want.)
##########
File path:
processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilitiesImpl.java
##########
@@ -177,6 +177,21 @@ public static ColumnCapabilitiesImpl
createSimpleNumericColumnCapabilities(Value
return builder;
}
+ /**
+ * Simple, non dictionary encoded string without bitmap index or anything
fancy
+ */
+ public static ColumnCapabilitiesImpl createSimpleStringColumnCapabilities()
+ {
+ return new ColumnCapabilitiesImpl().setType(ValueType.STRING)
+ .setHasMultipleValues(false)
Review comment:
Is it ok that hasMultipleValues is `false` here? That seems like the
potentially-sketchiest of the behaviors, since someone might use this method
for a multi-value string accidentally. How about naming it
`createSimpleSingleValueStringColumnCapabilities`?
Method names: the longer the better. Go, Java.
##########
File path:
processing/src/main/java/org/apache/druid/segment/vector/VectorColumnSelectorFactory.java
##########
@@ -62,16 +66,29 @@ default int getMaxVectorSize()
SingleValueDimensionVectorSelector
makeSingleValueDimensionSelector(DimensionSpec dimensionSpec);
/**
- * Returns a string-typed, multi-value-per-row column selector. Should only
be called on columns where
- * {@link #getColumnCapabilities} indicates they return STRING. Unlike
{@link #makeSingleValueDimensionSelector},
- * this should not be called on nonexistent columns.
+ * Returns a dictionary encoded, string-typed, multi-value-per-row column
selector. Should only be called on columns
+ * where {@link #getColumnCapabilities} indicates they return STRING. Unlike
+ * {@link #makeSingleValueDimensionSelector}, this should not be called on
nonexistent columns.
*
* If you need to write code that adapts to different input types, you
should write a
* {@link org.apache.druid.segment.VectorColumnProcessorFactory} and use one
of the
* {@link org.apache.druid.segment.ColumnProcessors#makeVectorProcessor}
functions instead of using this method.
*/
MultiValueDimensionVectorSelector
makeMultiValueDimensionSelector(DimensionSpec dimensionSpec);
+ /**
+ * Returns a dimension object selector. Useful for non-dictionary encoded
strings, or when using the dictionary
+ * isn't useful. Right now, this should probably only be called on single
valued STRING inputs (multi-value STRING
Review comment:
Get rid of the "probably" 🙂
Either it will work or it won't. If it doesn't work, say firmly not to do
it, and ideally describe what will happen if you do it accidentally (the method
will throw an error? the method will return a selector with undefined behavior?
etc). If it does work, but might be a bad idea for some reason, spell that out.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]