ektravel commented on code in PR #12549:
URL: https://github.com/apache/druid/pull/12549#discussion_r1099182443


##########
docs/querying/sql-data-types.md:
##########
@@ -80,8 +82,42 @@ the `UNNEST` functionality available in some other SQL dialects. Refer to the do
 > they are handled in Druid SQL and in native queries. For example, expressions involving multi-value dimensions may be
 > incorrectly optimized by the Druid SQL planner: `multi_val_dim = 'a' AND multi_val_dim = 'b'` will be optimized to
 > `false`, even though it is possible for a single row to have both "a" and "b" as values for `multi_val_dim`. The
-> SQL behavior of multi-value dimensions will change in a future release to more closely align with their behavior
-> in native queries.
+> SQL behavior of multi-value dimensions may change in a future release to more closely align with their behavior
+> in native queries, but the [multi-value string functions](./sql-multivalue-string-functions.md) should be able to provide
+> nearly all possible native functionality.
+
+## Arrays
+Multi-value dimensions may also be converted to standard SQL arrays, either by explicitly converting them with `MV_TO_ARRAY`,
+or implicitly when used within the [array functions](./sql-array-functions.md). `ARRAY` types behave as standard SQL arrays, where
+grouping on them will group on the entire array of values instead of the implicit `UNNEST` that occurs when grouping on
+multi-value dimensions directly or when used with the multi-value functions. Arrays may also be constructed from multiple
+columns using the array functions.
+
+## Multi-value strings behavior
+The behavior of Druid [multi-value string dimensions](multi-value-dimensions.md) varies depending on the context of their usage.
+
+When used as `VARCHAR` functions, which are not "aware" that their inputs which claim to be `VARCHAR` might actually have multiple
+values such as `CONCAT`, Druid will map the function across all values in the row. If the row is null or empty, the function will
+recieve `NULL` as its input, otherwise it will be applied to every row value and continue its life as a multi-value VARCHAR.
+
+When used with the explicit [multi-value string functions](./sql-multivalue-string-functions.md), the column is acknowledged to be multi-valued,
+and during processing the values are operated on as if they were `ARRAY` typed, so any operations which produce null and empty rows are
+distinguished as separate values (unlike implicit mapping behavior), but retain their `VARCHAR` type after the computation is complete.
+Note that Druid multi-value columns do _not_ distinguish between empty and null rows, so an empty row will never appear natively as input

Review Comment:
   ```suggestion
   Note that Druid multi-value columns do not distinguish between empty and null rows. An empty row never appears natively as an input to a multi-value function, but a multi-value function that manipulates the array form of the value may produce an empty array, which is handled separately while processing.
   ```
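
   For anyone following the thread, the distinction in the suggested wording can be sketched with a toy model in Python. This is not Druid code: `read_stored`, `map_scalar`, and `filter_only` are hypothetical names, and the behavior for null inputs to `filter_only` is an assumption made only for illustration.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for one row of a multi-value string column.
Row = Optional[List[str]]


def read_stored(row: Row) -> Row:
    """Model of storage: empty and null rows are not distinguished,
    so both read back as None."""
    return None if not row else row


def map_scalar(fn: Callable[[Optional[str]], Optional[str]], row: Row) -> Row:
    """Implicit mapping for a VARCHAR-style function: a null or empty
    row reaches fn as a single NULL; otherwise fn runs on every value."""
    stored = read_stored(row)
    if stored is None:
        result = fn(None)
        return None if result is None else [result]
    return [fn(value) for value in stored]


def filter_only(row: Row, keep: set) -> Row:
    """Array-style function: works on the array form, so it may yield
    an empty array, which stays distinct from None during processing.
    Returning None for a null input is an assumption of this sketch."""
    stored = read_stored(row)
    if stored is None:
        return None
    return [value for value in stored if value in keep]
```

   In this model, `read_stored([])` and `read_stored(None)` are identical, while `filter_only(["a", "b"], {"x"})` yields an empty list that is not `None`, which is the case the suggested sentence describes.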



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

