[ 
https://issues.apache.org/jira/browse/IMPALA-11750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17648082#comment-17648082
 ] 

ASF subversion and git services commented on IMPALA-11750:
----------------------------------------------------------

Commit 05a4b778d395c8813988610b78b71bcd920be037 in impala's branch 
refs/heads/master from Tamas Mate
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=05a4b778d ]

IMPALA-11339: Add Iceberg LOAD DATA INPATH statement

Extend the LOAD DATA INPATH statement to support Iceberg tables. Files
from native Parquet tables do not carry Iceberg field ids, so they
cannot be added as-is; instead, this change uses child queries to load
and rewrite the data. The child queries create a temporary table over
the specified directory, insert from it into the target table, and then
drop it.

The create part depends on LIKE PARQUET/ORC clauses to infer the file
format. This requires identifying a file in the directory and using that
to create the temporary table.
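
The create > insert > drop sequence described above could look roughly
like the following. This is only a sketch, not the statements the commit
actually generates: the table names (iceberg_target, tmp_load_table), the
paths, and the choice of an external table are illustrative assumptions.

```sql
-- User statement (standard LOAD DATA syntax; the path is hypothetical):
--   LOAD DATA INPATH '/user/impala/incoming' INTO TABLE iceberg_target;

-- 1. Create a temporary table over the directory, inferring the schema
--    from one data file found in it (here a Parquet file).
CREATE EXTERNAL TABLE tmp_load_table
  LIKE PARQUET '/user/impala/incoming/data_file.parquet'
  STORED AS PARQUET
  LOCATION '/user/impala/incoming';

-- 2. Rewrite the rows into the Iceberg table, so that the files written
--    by the insert carry the Iceberg field ids.
INSERT INTO iceberg_target SELECT * FROM tmp_load_table;

-- 3. Drop the temporary table.
DROP TABLE tmp_load_table;
```

For ORC input, the same shape would apply with LIKE ORC and STORED AS
ORC.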

The target file or directory is moved to a staging directory before
ingestion, similar to native file formats. If a query fails, the files
are moved back to their original location. The child query executor
returns the error message of the failing query, and the child query
profiles are available through the WebUI.

At this point the PARTITION clause is not supported because it would
require analysis of the PartitionSpec (IMPALA-11750).

Testing:
 - Added e2e tests
 - Added fe unit tests

Change-Id: I8499945fa57ea0499f65b455976141dcd6d789eb
Reviewed-on: http://gerrit.cloudera.org:8080/19145
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Iceberg LOAD DATA INPATH partition clause support
> -------------------------------------------------
>
>                 Key: IMPALA-11750
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11750
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: fe
>    Affects Versions: Impala 4.2.0
>            Reporter: Tamas Mate
>            Priority: Major
>
> To support the PARTITION clause of LOAD DATA INPATH the provided 
> PartitionSpec should be analysed or worked around. Once that is done, a 
> possible solution could be to modify the INSERT INTO SELECT statement to 
> ingest the specified partition value, i.e.:
> {code:none}
> INSERT INTO ... SELECT part_1, * FROM tmp_table;
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
