[GitHub] [incubator-druid] gianm commented on a change in pull request #8744: support for array expressions in TransformSpec with ExpressionTransform
URL: https://github.com/apache/incubator-druid/pull/8744#discussion_r345459910

## File path: processing/src/main/java/org/apache/druid/segment/virtual/ExpressionSelectors.java

@@ -514,15 +509,45 @@ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
 /**
  * Selectors are not consistent in treatment of null, [], and [null], so coerce [] to [null]
  */
- private static Object coerceListDimToStringArray(List val)
+ // suppressed because calling toArray creates Object[] instead of Long[] which makes ExprEval.bestEffortOf sad
+ @SuppressWarnings("SimplifyStreamApiCallChains")
+ public static Object coerceListToArray(List val)
 {
-Object[] arrayVal = val.stream().map(x -> x != null ? x.toString() : x).toArray(String[]::new);
-if (arrayVal.length > 0) {
- return arrayVal;
+if (val != null && val.size() > 0) {
+ Object firstElement = val.get(0);

Review comment:
Expressions support lists that contain nulls, right? What happens if the first element is null? Also, what happens if the first element is an Integer but later ones are Doubles? i.e. JSON `[1, 2.0]`. It may be better to examine all elements using some kind of binary type conversion rules (like long + int = long, etc).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

-
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
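The suggestion above (examine every element and combine types pairwise instead of trusting the first element) could look roughly like the following. This is only a sketch under assumptions, not the code from the pull request: the class `ListCoercionSketch`, the `ElementType` enum, and the helpers `typeOf` and `widen` are hypothetical names, the widening order (long, then double, then string) is one plausible rule set, and the empty-list result mirrors the `[]` to `[null]` coercion described in the Javadoc of the diff.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and rules are assumptions, not Druid's implementation.
final class ListCoercionSketch
{
  // Declared from narrowest to widest so compareTo can pick the wider type.
  enum ElementType { LONG, DOUBLE, STRING }

  // Classify a single element; null carries no type information.
  static ElementType typeOf(Object o)
  {
    if (o == null) {
      return null;
    }
    if (o instanceof Long || o instanceof Integer || o instanceof Short || o instanceof Byte) {
      return ElementType.LONG;
    }
    if (o instanceof Number) {
      return ElementType.DOUBLE;
    }
    return ElementType.STRING;
  }

  // Binary conversion rule: an unknown side yields to the other, otherwise the wider type wins.
  static ElementType widen(ElementType a, ElementType b)
  {
    if (a == null) {
      return b;
    }
    if (b == null) {
      return a;
    }
    return a.compareTo(b) >= 0 ? a : b;
  }

  static Object coerceListToArray(List<?> val)
  {
    if (val == null || val.isEmpty()) {
      return new String[]{null}; // treat null and [] like [null]
    }
    ElementType type = null;
    for (Object o : val) {
      type = widen(type, typeOf(o));
    }
    final int n = val.size();
    if (type == ElementType.LONG) {
      Long[] out = new Long[n];
      for (int i = 0; i < n; i++) {
        Object o = val.get(i);
        out[i] = o == null ? null : Long.valueOf(((Number) o).longValue());
      }
      return out;
    }
    if (type == ElementType.DOUBLE) {
      Double[] out = new Double[n];
      for (int i = 0; i < n; i++) {
        Object o = val.get(i);
        out[i] = o == null ? null : Double.valueOf(((Number) o).doubleValue());
      }
      return out;
    }
    // STRING, or a list that contains only nulls.
    String[] out = new String[n];
    for (int i = 0; i < n; i++) {
      Object o = val.get(i);
      out[i] = o == null ? null : o.toString();
    }
    return out;
  }

  public static void main(String[] args)
  {
    // JSON [1, 2.0] arrives as an Integer followed by a Double.
    Object[] mixed = (Object[]) coerceListToArray(Arrays.asList(1, 2.0));
    System.out.println(mixed.getClass().getSimpleName() + " " + Arrays.toString(mixed));
  }
}
```

With this rule a leading null no longer decides the array type, and a list mixing integers and doubles becomes a `Double[]` rather than failing a cast.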
[GitHub] [incubator-druid] gianm commented on a change in pull request #8744: support for array expressions in TransformSpec with ExpressionTransform
URL: https://github.com/apache/incubator-druid/pull/8744#discussion_r344004645

## File path: processing/src/main/java/org/apache/druid/segment/virtual/ExpressionSelectors.java

@@ -518,6 +513,20 @@ private static Object coerceListDimToStringArray(List val)
 return new String[]{null};
 }
+ /**
+ * Coerces {@link ExprEval} value back to selector friendly {@link List} if the evaluated expression result is an
+ * array type
+ */
+ public static Object coerceEvalToSelectorObject(ExprEval eval)
+ {
+if (eval.isArray()) {

Review comment:
If converting to a list of the base type works, let's do that, since it's nicer. My goal here (and w/ the other comment) is to avoid making too many places that encode a lists-should-always-be-strings assumption. We might want to expand that later.
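To illustrate what "a list of the base type" means here, the sketch below converts an array result to a `List` without stringifying its elements. It is an assumption-laden stand-in: the class `EvalCoercionSketch` and method `coerceToSelectorObject` are hypothetical, and a plain `Object` takes the place of the evaluated expression result because the `ExprEval` accessors are not shown in the quoted diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; 'value' stands in for the evaluated expression result.
final class EvalCoercionSketch
{
  static Object coerceToSelectorObject(Object value)
  {
    if (value instanceof Object[]) {
      // Keep each element as it is (Long stays Long, Double stays Double, null stays null).
      return new ArrayList<>(Arrays.asList((Object[]) value));
    }
    // Non-array results pass through untouched.
    return value;
  }

  public static void main(String[] args)
  {
    List<?> longs = (List<?>) coerceToSelectorObject(new Long[]{1L, null, 3L});
    System.out.println(longs);
  }
}
```

Nothing in this version encodes a lists-are-always-strings assumption, so it would not need to change if numeric arrays were later supported elsewhere.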
[GitHub] [incubator-druid] gianm commented on a change in pull request #8744: support for array expressions in TransformSpec with ExpressionTransform
URL: https://github.com/apache/incubator-druid/pull/8744#discussion_r344004453

## File path: processing/src/main/java/org/apache/druid/segment/transform/ExpressionTransform.java

@@ -90,7 +92,11 @@ private static Object getValueFromRow(final Row row, final String column)
 if (column.equals(ColumnHolder.TIME_COLUMN_NAME)) {
 return row.getTimestampFromEpoch();
 } else {
- return row.getRaw(column);
+ Object raw = row.getRaw(column);
+ if (raw instanceof List) {
+return ExpressionSelectors.coerceListDimToStringArray((List) raw);

Review comment:
This isn't a multi-value column, though, is it? It looks like it's something from an input row and it could be an array of anything. I was thinking that since expressions support numeric arrays, it'd be good to keep them that way if found in the input rows.
[GitHub] [incubator-druid] gianm commented on a change in pull request #8744: support for array expressions in TransformSpec with ExpressionTransform
URL: https://github.com/apache/incubator-druid/pull/8744#discussion_r343354749

## File path: processing/src/main/java/org/apache/druid/segment/transform/ExpressionTransform.java

@@ -90,7 +92,11 @@ private static Object getValueFromRow(final Row row, final String column)
 if (column.equals(ColumnHolder.TIME_COLUMN_NAME)) {
 return row.getTimestampFromEpoch();
 } else {
- return row.getRaw(column);
+ Object raw = row.getRaw(column);
+ if (raw instanceof List) {
+return ExpressionSelectors.coerceListDimToStringArray((List) raw);

Review comment:
Is it fair to assume that all Lists are Lists of Strings? What if the expression selector returns a long array?
[GitHub] [incubator-druid] gianm commented on a change in pull request #8744: support for array expressions in TransformSpec with ExpressionTransform
URL: https://github.com/apache/incubator-druid/pull/8744#discussion_r343346060

## File path: core/src/main/java/org/apache/druid/data/input/impl/ParseSpec.java

@@ -64,12 +62,6 @@ public DimensionsSpec getDimensionsSpec()
 return dimensionsSpec;
 }
- @PublicApi
- public void verify(List usedCols)

Review comment:
This `verify` method is useful for making sure that people's transforms, dimensions, metrics, etc are derived from fields they specified (to help detect errors & typos). But I see why you removed it -- info about transforms isn't available at this point in the code. It probably makes sense to add this functionality back in somehow, in a smarter way, once the dust settles on #8823. /cc @jihoonson
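As a rough picture of the kind of check being discussed, the sketch below reports columns that are used by dimensions or metrics but come from neither a declared input field nor a transform output. Everything in it is hypothetical (the class `ColumnVerificationSketch`, both method names, and the idea of passing transform outputs in explicitly); it does not reproduce the removed `ParseSpec.verify` body, which is not shown in the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a typo-detecting column check; not the removed Druid method.
final class ColumnVerificationSketch
{
  // Returns the used columns that no parsed field or transform output provides.
  static List<String> findUnknownColumns(
      Set<String> parsedFields,
      Set<String> transformOutputs,
      List<String> usedCols
  )
  {
    List<String> unknown = new ArrayList<>();
    for (String col : usedCols) {
      if (!parsedFields.contains(col) && !transformOutputs.contains(col)) {
        unknown.add(col);
      }
    }
    return unknown;
  }

  // Fails fast so a misspelled column is reported before ingestion starts.
  static void verify(Set<String> parsedFields, Set<String> transformOutputs, List<String> usedCols)
  {
    List<String> unknown = findUnknownColumns(parsedFields, transformOutputs, usedCols);
    if (!unknown.isEmpty()) {
      throw new IllegalArgumentException("Columns not found in input fields or transforms: " + unknown);
    }
  }

  public static void main(String[] args)
  {
    Set<String> fields = new HashSet<>(Arrays.asList("country", "bytes"));
    Set<String> transforms = new HashSet<>(Arrays.asList("bytesKb"));
    System.out.println(findUnknownColumns(fields, transforms, Arrays.asList("country", "bytesKb", "contry")));
  }
}
```

Passing the transform outputs alongside the parsed fields is the piece that, per the comment, is not available at the point in the code where `verify` used to run.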
[GitHub] [incubator-druid] gianm commented on a change in pull request #8744: support for array expressions in TransformSpec with ExpressionTransform
URL: https://github.com/apache/incubator-druid/pull/8744#discussion_r343355536

## File path: processing/src/main/java/org/apache/druid/segment/virtual/ExpressionSelectors.java

@@ -518,6 +513,20 @@ private static Object coerceListDimToStringArray(List val)
 return new String[]{null};
 }
+ /**
+ * Coerces {@link ExprEval} value back to selector friendly {@link List} if the evaluated expression result is an
+ * array type
+ */
+ public static Object coerceEvalToSelectorObject(ExprEval eval)
+ {
+if (eval.isArray()) {

Review comment:
Is it fair to cast all arrays to lists of strings? Do we still get the behavior we want if we cast them to lists of whatever they really are? (I'm thinking this could make us more future-proof to situations where we support numeric arrays at the storage layer.)