gianm commented on a change in pull request #10613: URL: https://github.com/apache/druid/pull/10613#discussion_r582985342
########## File path: processing/src/main/java/org/apache/druid/segment/vector/VectorColumnSelectorFactory.java ########## @@ -62,16 +66,29 @@ default int getMaxVectorSize() SingleValueDimensionVectorSelector makeSingleValueDimensionSelector(DimensionSpec dimensionSpec); /** - * Returns a string-typed, multi-value-per-row column selector. Should only be called on columns where - * {@link #getColumnCapabilities} indicates they return STRING. Unlike {@link #makeSingleValueDimensionSelector}, - * this should not be called on nonexistent columns. + * Returns a dictionary encoded, string-typed, multi-value-per-row column selector. Should only be called on columns + * where {@link #getColumnCapabilities} indicates they return STRING. Unlike + * {@link #makeSingleValueDimensionSelector}, this should not be called on nonexistent columns. * * If you need to write code that adapts to different input types, you should write a * {@link org.apache.druid.segment.VectorColumnProcessorFactory} and use one of the * {@link org.apache.druid.segment.ColumnProcessors#makeVectorProcessor} functions instead of using this method. */ MultiValueDimensionVectorSelector makeMultiValueDimensionSelector(DimensionSpec dimensionSpec); + /** + * Returns a dimension object selector. Useful for non-dictionary encoded strings, or when using the dictionary + * isn't useful. Right now, this should probably only be called on single valued STRING inputs (multi-value STRING + * vector object selector is not yet implemented), but perhaps a world can be imagined where this could work with + * COMPLEX too perhaps if it were useful for them to interact with {@link DimensionSpec} in some manner? May be called + * for non-existent columns. Review comment: These javadocs should explicitly constrain the type of Object that may show up in the returned VectorObjectSelector. Is it always going to be a String or a null? Might it be `String[]`, if the underlying column is multi-value? ########## File path: processing/src/main/java/org/apache/druid/query/filter/DruidPredicateFactory.java ########## @@ -32,4 +32,15 @@ DruidFloatPredicate makeFloatPredicate(); DruidDoublePredicate makeDoublePredicate(); + + /** + * Object predicate is currently only used by vectorized matchers for non-string object selectors. To preserve + * behavior with non-vectorized matchers which use a string predicate with null inputs for these 'nil' matchers, + * do the same thing here... + */ + default Predicate<Object> makeObjectPredicate() Review comment: Based on https://github.com/apache/druid/pull/10613#discussion_r578985027 it seems like the intent of this method is that it will only be called for complex types (or maybe nonexistent columns too). If that's the case, please add some javadoc explaining that this method will only be called for complex types. It's analogous to "makeObjectProcessor" in "VectorColumnProcessorFactory". (You could even `@see` to that method in the javadocs if you want.) ########## File path: processing/src/main/java/org/apache/druid/segment/column/ColumnCapabilitiesImpl.java ########## @@ -177,6 +177,21 @@ public static ColumnCapabilitiesImpl createSimpleNumericColumnCapabilities(Value return builder; } + /** + * Simple, non dictionary encoded string without bitmap index or anything fancy + */ + public static ColumnCapabilitiesImpl createSimpleStringColumnCapabilities() + { + return new ColumnCapabilitiesImpl().setType(ValueType.STRING) + .setHasMultipleValues(false) Review comment: Is it ok that hasMultipleValues is `false` here? That seems like the potentially-sketchiest of the behaviors, since someone might use this method for a multi-value string accidentally. How about naming it `createSimpleSingleValueStringColumnCapabilities`? Method names: the longer the better. Go, Java. ########## File path: processing/src/main/java/org/apache/druid/segment/vector/VectorColumnSelectorFactory.java ########## @@ -62,16 +66,29 @@ default int getMaxVectorSize() SingleValueDimensionVectorSelector makeSingleValueDimensionSelector(DimensionSpec dimensionSpec); /** - * Returns a string-typed, multi-value-per-row column selector. Should only be called on columns where - * {@link #getColumnCapabilities} indicates they return STRING. Unlike {@link #makeSingleValueDimensionSelector}, - * this should not be called on nonexistent columns. + * Returns a dictionary encoded, string-typed, multi-value-per-row column selector. Should only be called on columns + * where {@link #getColumnCapabilities} indicates they return STRING. Unlike + * {@link #makeSingleValueDimensionSelector}, this should not be called on nonexistent columns. * * If you need to write code that adapts to different input types, you should write a * {@link org.apache.druid.segment.VectorColumnProcessorFactory} and use one of the * {@link org.apache.druid.segment.ColumnProcessors#makeVectorProcessor} functions instead of using this method. */ MultiValueDimensionVectorSelector makeMultiValueDimensionSelector(DimensionSpec dimensionSpec); + /** + * Returns a dimension object selector. Useful for non-dictionary encoded strings, or when using the dictionary + * isn't useful. Right now, this should probably only be called on single valued STRING inputs (multi-value STRING Review comment: Get rid of the "probably" 🙂 Either it will work or it won't. If it doesn't work, say firmly not to do it, and ideally describe what will happen if you do it accidentally (the method will throw an error? the method will return a selector with undefined behavior? etc). If it does work, but might be a bad idea for some reason, spell that out. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org For additional commands, e-mail: commits-h...@druid.apache.org