clintropolis commented on code in PR #13736:
URL: https://github.com/apache/druid/pull/13736#discussion_r1094073098


##########
docs/querying/multi-value-dimensions.md:
##########
@@ -147,6 +147,12 @@ only row1, and generate a result with three groups: `t1`, 
`t2`, and `t3`. If you
 your filter, you can use a [filtered 
dimensionSpec](dimensionspecs.md#filtered-dimensionspecs). This can also
 improve performance.
 
+## Unnesting
+
+You can unnest a column that contains multi-value dimensions (arrays) by using 
either the [UNNEST function (SQL)](../querying/sql.md#unnest) and the helper 
function MV_TO_ARRAY or the [`unnest` datasource 
(native)](../querying/datasource.md#unnest) .

Review Comment:
   > ... multi-value dimensions (arrays) ...
   
   please do not conflate multi-value dimensions with actual druid array types, 
they are totally separate internally. Multi-value strings are represented 
internally as the native `STRING` type and show up as `VARCHAR` in the SQL 
schema. Druid _also_ has `ARRAY` types, such as `ARRAY<STRING>`, `ARRAY<LONG>`, 
etc, that are not well documented yet _on purpose_ to not lock ourselves into a 
specific behavior prematurely.
   
   Also, its very nearly pointless to use `UNNEST` on a multi-value string 
column, because all multi-value strings have an implicit unnesting that occurs 
when grouping. The only case for using on a `STRING` is as part of a scan query
   



##########
docs/querying/sql-functions.md:
##########
@@ -1357,6 +1357,13 @@ Truncates a numerical expression to a specific number of 
decimal digits.
 
 Parses `expr` into a `COMPLEX<json>` object. This operator deserializes JSON 
values when processing them, translating stringified JSON into a nested 
structure. If the input is not a `VARCHAR` or it is invalid JSON, this function 
will result in a `NULL` value.
 
+## UNNEST
+
+`UNNEST(source)) as UNNESTED (target)`
+
+Unnests a source column that includes arrays (multi-value dimensions) into a 
target column.

Review Comment:
   again please don't conflate multi-value strings with arrays, they are not 
arrays, they just have special functions that allow interacting with them as if 
they were actual arrays.



##########
docs/querying/sql.md:
##########
@@ -82,6 +83,27 @@ FROM clause, metadata tables are not considered datasources. 
They exist only in
 For more information about table, lookup, query, and join datasources, refer 
to the [Datasources](datasource.md)
 documentation.
 
+## UNNEST
+
+The UNNEST clause unnests values stored in arrays within a column. It's the 
SQL equivalent to the [unnest datasource](./datasource.md#unnest).

Review Comment:
   unnest isn't limited to working on arrays in a column (we don't currently 
have any array typed columns). `MV_TO_ARRAY` effectively allows casting a 
multi-value string to an `ARRAY<STRING>`, but arrays could also come from other 
virtual columns, such as anything using the `ARRAY` constructor or anything 
that can build arrays, such as `ARRAY_AGG`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to