vtlim commented on code in PR #13736: URL: https://github.com/apache/druid/pull/13736#discussion_r1103378700
########## docs/tutorials/tutorial-unnest-arrays.md: ########## @@ -140,9 +142,93 @@ Now that the data is loaded, run the following query: SELECT * FROM nested_data ``` -In the results, notice that the column named `dim3` has nested values like `["a","b"]`. The example queries that follow unnest `dim3` and run queries, such as Scan. +In the results, notice that the column named `dim3` has nested values like `["a","b"]`. The example queries that follow unnest `dim3` and run queries against the unnested records. Depending on the type of queries you write, see either [Unnest using SQL queries](#unnest-using-sql-queries) or [Unnest using native queries](#unnest-using-native-queries). + +## Unnest using SQL queries + +The following is the general syntax for UNNEST: + +```sql +SELECT column_alias_name FROM datasource, UNNEST(source_expression) AS table_alias_name(column_alias_name) +``` + +For more information about the syntax, see [UNNEST](../querying/sql.md#unnest). + +### Unnest inline array + +The following query returns a column that unnests the array `[1,2,3]` that is provided inline: + +```sql +SELECT * FROM UNNEST(ARRAY[1,2,3]) +``` + +If you unnest that same inline array while using a table as the datasource, Druid treats this as a JOIN between a left datasource and a constant datasource. For example: + +```sql +SELECT longs FROM nested_data, UNNEST(ARRAY[1,2,3]) AS example_table (longs) +``` + +### Unnest a single column in a table + +The following query returns a column called `d3` from the table `nested_data`. `d3` contains the unnested values from the source column `dim3`: + +```sql +SELECT d3 FROM "nested_data", UNNEST(MV_TO_ARRAY(dim3)) AS example_table (d3) +``` + +Notice the MV_TO_ARRAY helper function, which converts the multi-value records in `dim3` to arrays. It is required since `dim3` is a multi-value string dimension. + +If the column you are unnesting is not a string dimension, then you do not need to use the MV_TO_ARRAY helper function. + +### Unnest a virtual column + +You can unnest into a virtual column (multiple columns treated as one). The following query returns the two source columns and a third virtual column containing the unnested data: + +```sql +SELECT dim4,dim5,d45 FROM nested_data, UNNEST(ARRAY[dim4,dim5]) AS example_table (d45) +``` + +This virtual column is the product of the two source columns. Notice how the total number of rows has grown. The table `nested_data` had only seven rows originally. + +Another way to unnest a virtual column is to concatenate them with ARRAY_CONCAT: + +```sql +SELECT d45 FROM nested_data, UNNEST(ARRAY_CONCAT(dim4,dim5)) AS example_table (d45) +``` + +Decide which method to use based on what your goals are. + +### Unnest a column from a subset of a table + +The following query uses only three columns from the `nested_data` table as the datasource. From that subset, it unnests the column `dim3` into `d3` and returns `d3`. + +```sql +SELECT d3 FROM (select dim1, dim2, dim3 from "nested_data"), UNNEST(MV_TO_ARRAY(dim3)) AS example_table (d3) Review Comment: ```suggestion SELECT d3 FROM (SELECT dim1, dim2, dim3 FROM "nested_data"), UNNEST(MV_TO_ARRAY(dim3)) AS example_table (d3) ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
