This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/impala.git
commit 7f43afc2774ae2b363289b9958ff70e215efb5e6
Author: Daniel Becker <[email protected]>
AuthorDate: Mon Dec 12 18:07:21 2022 +0100

    IMPALA-11791: [DOCS] Document IMPALA-9499 query option

    IMPALA-9499 introduced the EXPAND_COMPLEX_TYPES query option which is
    documented in this change.

    Also updates docs/topics/impala_complex_types.xml - complex types are
    now allowed in the select list with the exceptions of collections
    embedded in structs and structs embedded in collections.

    Change-Id: I1f0a6b402de1ed9bb6aa05987a6ff8e6d62accb5
    Reviewed-on: http://gerrit.cloudera.org:8080/19348
    Tested-by: Impala Public Jenkins <[email protected]>
    Reviewed-by: Csaba Ringhofer <[email protected]>
---
 docs/impala.ditamap                         |  1 +
 docs/topics/impala_complex_types.xml        | 89 +++++++++++++++++++++++------
 docs/topics/impala_expand_complex_types.xml | 61 ++++++++++++++++++++
 3 files changed, 135 insertions(+), 16 deletions(-)

diff --git a/docs/impala.ditamap b/docs/impala.ditamap
index 7eac4c329..b8c3dad72 100644
--- a/docs/impala.ditamap
+++ b/docs/impala.ditamap
@@ -257,6 +257,7 @@ under the License.
       <topicref href="topics/impala_timezone.xml"/>
       <topicref href="topics/impala_topn_bytes_limit.xml"/>
       <topicref href="topics/impala_utf8_mode.xml"/>
+      <topicref href="topics/impala_expand_complex_types.xml"/>
     </topicref>
   </topicref>
   <topicref href="topics/impala_show.xml"/>
diff --git a/docs/topics/impala_complex_types.xml b/docs/topics/impala_complex_types.xml
index ba022f50d..9432f3b07 100644
--- a/docs/topics/impala_complex_types.xml
+++ b/docs/topics/impala_complex_types.xml
@@ -692,9 +692,9 @@ under the License.
       </p>
       <p>
-        Because the complex columns are omitted from the result set of an Impala <codeph>SELECT *</codeph> or <codeph>SELECT
-        <varname>col_name</varname></codeph> query, and because Impala currently does not support writing Parquet files with complex type
-        columns, you cannot use the <codeph>CREATE TABLE AS SELECT</codeph> syntax to create a table with nested type columns.
+        Because Impala currently does not support writing Parquet files with complex type columns,
+        you cannot use the <codeph>CREATE TABLE AS SELECT</codeph> syntax to create a table with
+        nested type columns.
       </p>
       <note>
@@ -971,29 +971,86 @@ STORED AS PARQUET;
 <!-- Hive does the JSON output business: http://www.datascience-labs.com/hive/hiveql-data-manipulation/ -->
-<!-- SELECT * works but skips any nested type coloumns. -->
+      <p>
+        The result set of an Impala query can contain both scalar and complex types. A query can
+        either retrieve the complex values directly or <q>unpack</q> the elements and fields
+        within a complex type using join queries, with the limitation that direct retrieval is
+        currently not supported for complex types where collections (maps or arrays) are
+        embedded within structs or structs are embedded within collections.
+      </p>
+
+      <p>
+        Here are some complex types that are supported in the select list:
+        <ul>
+          <li>
+            <p>
+              <codeph>STRUCT<i: INT></codeph>
+            </p>
+          </li>
+          <li>
+            <p>
+              <codeph>STRUCT<s: STRUCT<i: INT>></codeph>
+            </p>
+          </li>
+          <li>
+            <p>
+              <codeph>ARRAY<INT></codeph>
+            </p>
+          </li>
+          <li>
+            <p>
+              <codeph>ARRAY<ARRAY<INT>></codeph>
+            </p>
+          </li>
+          <li>
+            <p>
+              <codeph>ARRAY<MAP<INT>></codeph>
+            </p>
+          </li>
+        </ul>
+      </p>
       <p>
-        The result set of an Impala query always contains all scalar types; the elements and fields within any complex type queries must
-        be <q>unpacked</q> using join queries. A query cannot directly retrieve the entire value for a complex type column. Impala
-        returns an error in this case. Queries using <codeph>SELECT *</codeph> are allowed for tables with complex types, but the
-        columns with complex types are skipped.
+        And here are some that are not supported in the select list:
+        <ul>
+          <li>
+            <p>
+              <codeph>STRUCT<a: ARRAY<INT>></codeph>
+            </p>
+          </li>
+          <li>
+            <p>
+              <codeph>ARRAY<STRUCT<i: INT>></codeph>
+            </p>
+          </li>
+          <li>
+            <p>
+              <codeph>MAP<INT, STRUCT<s: STRING>></codeph>
+            </p>
+          </li>
+        </ul>
       </p>
       <p>
-        The following example shows how referring directly to a complex type column returns an error, while <codeph>SELECT *</codeph> on
-        the same table succeeds, but only retrieves the scalar columns.
+        Because of backward compatibility with earlier versions of Impala that did not support
+        complex types in the result set, queries using <codeph>SELECT *</codeph> skip complex
+        types by default. To include complex types in <codeph>SELECT *</codeph> queries, set the
+        EXPAND_COMPLEX_TYPES query option to true (see the
+        <xref href="impala_expand_complex_types.xml"/>).
       </p>
-      <note conref="../shared/impala_common.xml#common/complex_type_schema_pointer"/>
+      <p>
+        The following example shows how referring directly to a column with a complex type where
+        a struct is embedded in a collection (an array) returns an error, while
+        <codeph>SELECT *</codeph> on the same table succeeds, but only retrieves the scalar
+        columns. Note that if EXPAND_COMPLEX_TYPES is true, the <codeph>SELECT *</codeph> query
+        also fails with the same error.
+      </p>
-<!-- Original error message:
-ERROR: AnalysisException: Expr 'c_orders' in select list returns a complex type 'ARRAY<STRUCT<o_orderkey:BIGINT,o_orderstatus:STRING,o_totalprice:DECIMAL(12,2),o_orderdate:STRING,o_orderpriority:STRING,o_clerk:STRING,o_shippriority:INT,o_comment:STRING,o_lineitems:ARRAY<STRUCT<l_partkey:BIGINT,l_suppkey:BIGINT,l_linenumber:INT,l_quantity:DECIMAL(12,2),l_extendedprice:DECIMAL(12,2),l_discount:DECIMAL(12,2),l_tax:DECIMAL(12,2),l_returnflag:STRING,l_linestatus:STRING,l_shipdate:STRING,l_com [...]
--->
+      <note conref="../shared/impala_common.xml#common/complex_type_schema_pointer"/>
 <codeblock><