This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit cf6165aca8a8434e8bdd7861eec55c7fd5e2c231
Author: Tamas Mate <[email protected]>
AuthorDate: Tue Jan 3 21:29:04 2023 +0100

    IMPALA-11819: [DOCS] Add Iceberg LOAD DATA information

    This commit adds information on how LOAD DATA statement can be used
    with Iceberg tables.

    Testing:
     - Built docs locally

    Change-Id: Iec242781a4551aa04e4e920e3f3a1010c7ab808e
    Reviewed-on: http://gerrit.cloudera.org:8080/19396
    Tested-by: Impala Public Jenkins <[email protected]>
    Reviewed-by: Gergely Fürnstáhl <[email protected]>
    Reviewed-by: Noemi Pap-Takacs <[email protected]>
    Reviewed-by: Tamas Mate <[email protected]>
---
 docs/shared/impala_common.xml    |  9 +++++++++
 docs/topics/impala_iceberg.xml   | 18 ++++++++++++++++++
 docs/topics/impala_load_data.xml |  3 +++
 3 files changed, 30 insertions(+)

diff --git a/docs/shared/impala_common.xml b/docs/shared/impala_common.xml
index ee4b6877e..ee8d9fbe5 100644
--- a/docs/shared/impala_common.xml
+++ b/docs/shared/impala_common.xml
@@ -3364,6 +3364,15 @@ flight_num: INT32 SNAPPY DO:83456393 FPO:83488603 SZ:10216514/11474301
         <b>HBase considerations:</b> This data type cannot be used with HBase tables.
       </p>
 
+      <p id="iceberg_blurb">
+        <b>Iceberg considerations:</b>
+      </p>
+
+      <p id="iceberg_load_data">
+        See <xref href="../topics/impala_iceberg.xml#iceberg_load"/> for details about
+        <codeph>LOAD DATA</codeph> with Iceberg.
+      </p>
+
       <p id="internals_blurb">
         <b>Internal details:</b>
       </p>
diff --git a/docs/topics/impala_iceberg.xml b/docs/topics/impala_iceberg.xml
index 1ae133623..5366e9ffc 100644
--- a/docs/topics/impala_iceberg.xml
+++ b/docs/topics/impala_iceberg.xml
@@ -442,6 +442,24 @@ INSERT INTO ice_p VALUES (1, 2);
     </conbody>
   </concept>
 
+  <concept id="iceberg_load">
+    <title>Loading data into Iceberg tables</title>
+    <conbody>
+      <p>
+        <codeph>LOAD DATA</codeph> statement can be used to load a single file or directory into
+        an existing Iceberg table. This operation is executed differently compared to HMS tables, the
+        data is being inserted into the table via sequentially executed statements, which has
+        some limitations:
+        <ul>
+          <li>Only Parquet or ORC files can be loaded.</li>
+          <li><codeph>PARTITION</codeph> clause is not supported, but the partition transformations
+          are respected.</li>
+          <li>The loaded files will be re-written as Parquet files.</li>
+        </ul>
+      </p>
+    </conbody>
+  </concept>
+
   <concept id="iceberg_time_travel">
     <title>Time travel for Iceberg tables</title>
     <conbody>
diff --git a/docs/topics/impala_load_data.xml b/docs/topics/impala_load_data.xml
index f947534a7..eb9f64856 100644
--- a/docs/topics/impala_load_data.xml
+++ b/docs/topics/impala_load_data.xml
@@ -258,6 +258,9 @@ Returned 1 row(s) in 0.62s</codeblock>
     <p conref="../shared/impala_common.xml#common/hbase_blurb"/>
     <p conref="../shared/impala_common.xml#common/hbase_no_load_data"/>
 
+    <p conref="../shared/impala_common.xml#common/iceberg_blurb"/>
+    <p conref="../shared/impala_common.xml#common/iceberg_load_data"/>
+
     <p conref="../shared/impala_common.xml#common/related_info"/>
     <p>
       The <codeph>LOAD DATA</codeph> statement is an alternative to the
