317brian commented on code in PR #13736: URL: https://github.com/apache/druid/pull/13736#discussion_r1154920712
########## docs/querying/sql.md: ########## @@ -82,6 +83,43 @@ FROM clause, metadata tables are not considered datasources. They exist only in For more information about table, lookup, query, and join datasources, refer to the [Datasources](datasource.md) documentation. +## UNNEST + +> The UNNEST SQL function is [experimental](../development/experimental.md). Its API and behavior are subject +> to change in future releases. It is not recommended to use this feature in production at this time. + +The UNNEST clause unnests array values. It's the SQL equivalent to the [unnest datasource](./datasource.md#unnest). The source for UNNEST can be an array or an input that's been transformed into an array, such as with helper functions like MV_TO_ARRAY or ARRAY. + +The following is the general syntax for UNNEST, specifically a query that returns the column that gets unnested: + +```sql +SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ... +``` + +* The `datasource` for UNNEST can be any Druid datasource, such as the following: + * A table, such as `FROM a_table`. + * A subset of a table based on a query, a filter, or a JOIN. For example, `FROM (SELECT columnA,columnB,columnC from a_table)`. +* The `source_expression` for the UNNEST function must be an array and can come from any expression. If the dimension you are unnesting is a multi-value dimension, you have to specify `MV_TO_ARRAY(dimension)` to convert it to an implicit ARRAY type. You can also specify any expression that has an SQL array datatype. For example, you can call UNNEST on the following: + * `ARRAY[dim1,dim2]` if you want to make an array out of two dimensions. + * `ARRAY_CONCAT(dim1,dim2)` if you want to concatenate two multi-value dimensions. +* The `AS table_alias_name(column_alias_name)` clause is not required but is highly recommended. Use it to specify the output, which can be an existing column or a new one. Replace `table_alias_name` and `column_alias_name` with a table and column name you want to alias the unnested results to. If you don't provide this, Druid uses a nondescriptive name, such as `EXPR$0`. + +Keep these two things in mind when writing your query: + +- You must include the context parameter `"enableUnnest": true`. +- You can unnest multiple source expressions in a single query. +- Notice the comma between the datasource and the UNNEST function. This is needed in most cases of the UNNEST function. Specifically, it is not needed when you're unnesting an inline array since the array itself is the datasource. +- If you view the native explanation of a SQL UNNEST, you'll notice that Druid uses `j0.unnest` as a virtual column to perform the unnest. An underscore is added for each unnest, so you may notice virtual columns named `_j0.unnest` or `__j0.unnest`. + Review Comment: ```suggestion - If you view the native explanation of a SQL UNNEST, you'll notice that Druid uses `j0.unnest` as a virtual column to perform the unnest. An underscore is added for each unnest, so you may notice virtual columns named `_j0.unnest` or `__j0.unnest`. - UNNEST preserves the ordering of the source array that is being unnested ``` ########## docs/querying/sql.md: ########## @@ -82,6 +83,43 @@ FROM clause, metadata tables are not considered datasources. They exist only in For more information about table, lookup, query, and join datasources, refer to the [Datasources](datasource.md) documentation. +## UNNEST + +> The UNNEST SQL function is [experimental](../development/experimental.md). Its API and behavior are subject +> to change in future releases. It is not recommended to use this feature in production at this time. + +The UNNEST clause unnests array values. It's the SQL equivalent to the [unnest datasource](./datasource.md#unnest). The source for UNNEST can be an array or an input that's been transformed into an array, such as with helper functions like MV_TO_ARRAY or ARRAY. + +The following is the general syntax for UNNEST, specifically a query that returns the column that gets unnested: + +```sql +SELECT column_alias_name FROM datasource, UNNEST(source_expression1) AS table_alias_name1(column_alias_name1), UNNEST(source_expression2) AS table_alias_name2(column_alias_name2), ... +``` + +* The `datasource` for UNNEST can be any Druid datasource, such as the following: + * A table, such as `FROM a_table`. + * A subset of a table based on a query, a filter, or a JOIN. For example, `FROM (SELECT columnA,columnB,columnC from a_table)`. +* The `source_expression` for the UNNEST function must be an array and can come from any expression. If the dimension you are unnesting is a multi-value dimension, you have to specify `MV_TO_ARRAY(dimension)` to convert it to an implicit ARRAY type. You can also specify any expression that has an SQL array datatype. For example, you can call UNNEST on the following: + * `ARRAY[dim1,dim2]` if you want to make an array out of two dimensions. + * `ARRAY_CONCAT(dim1,dim2)` if you want to concatenate two multi-value dimensions. +* The `AS table_alias_name(column_alias_name)` clause is not required but is highly recommended. Use it to specify the output, which can be an existing column or a new one. Replace `table_alias_name` and `column_alias_name` with a table and column name you want to alias the unnested results to. If you don't provide this, Druid uses a nondescriptive name, such as `EXPR$0`. + +Keep these two things in mind when writing your query: + +- You must include the context parameter `"enableUnnest": true`. +- You can unnest multiple source expressions in a single query. +- Notice the comma between the datasource and the UNNEST function. This is needed in most cases of the UNNEST function. Specifically, it is not needed when you're unnesting an inline array since the array itself is the datasource. +- If you view the native explanation of a SQL UNNEST, you'll notice that Druid uses `j0.unnest` as a virtual column to perform the unnest. An underscore is added for each unnest, so you may notice virtual columns named `_j0.unnest` or `__j0.unnest`. + Review Comment: ```suggestion - If you view the native explanation of a SQL UNNEST, you'll notice that Druid uses `j0.unnest` as a virtual column to perform the unnest. An underscore is added for each unnest, so you may notice virtual columns named `_j0.unnest` or `__j0.unnest`. - UNNEST preserves the ordering of the source array that is being unnested. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
