cryptoe commented on a change in pull request #12253:
URL: https://github.com/apache/druid/pull/12253#discussion_r806136177
##########
File path: docs/querying/multi-value-dimensions.md
##########
@@ -375,3 +375,8 @@ This query returns the following result:
Note that, for groupBy queries, you could get similar result with a [having
spec](having.md) but using a filtered
`dimensionSpec` is much more efficient because that gets applied at the lowest
level in the query processing pipeline.
Having specs are applied at the outermost level of groupBy query processing.
+
+## Disable GroupBy on multivalue columns
+
+As grouping on multivalue columns causes implicit unnest, users can avoid this
behaviour by setting
+`groupByEnableMultiValueUnnesting` in the query context to `false`. This will
result the query to error out.
Review comment:
In the latest draft I have kept the documentation and changed the error
messaging a bit.
Update the "GroupBy v2 configurations" as well.
##########
File path:
processing/src/main/java/org/apache/druid/query/groupby/epinephelinae/GroupByQueryEngineV2.java
##########
@@ -327,35 +345,46 @@ public static int getCardinalityForArrayAggregation(
}
/**
- * Checks whether all "dimensions" are either single-valued, or if the input
column or output dimension spec has
- * specified a type that {@link ColumnType#isArray()}. Both cases indicate
we don't want to explode the under-lying
- * multi value column. Since selectors on non-existent columns will show up
as full of nulls, they are effectively
- * single valued, however capabilites on columns can also be null, for
example during broker merge with an 'inline'
- * datasource subquery, so we only return true from this method when
capabilities are fully known.
+ * Returns all dimension names that are or could be multi valued, or if the
input column specified a type that
+ * {@link ColumnType#isArray()}. Both cases indicate we want to explode the
under-lying multi value column. Since
+ * selectors on non-existent columns will show up as full of nulls, they are
effectively single valued, however
+ * capabilites on columns can also be null, for example during broker merge
with an 'inline' datasource subquery. We
+ * mark columns with null capabilites as candidates for explosion.
*/
- public static boolean hasNoExplodingDimensions(
+ public static List<String> findAllProbableExplodingDimensions(
Review comment:
Thanks for the catch. Updated the check while doing row by row
iteration.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]