[ 
https://issues.apache.org/jira/browse/IMPALA-12406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noemi Pap-Takacs updated IMPALA-12406:
--------------------------------------
    Description: 
If an Iceberg table is frequently updated/written to in small batches, a lot of 
small files are created. This decreases read performance. Similarly, frequent 
row-level deletes contribute to this problem by creating delete files which 
have to be merged on read.

Currently, INSERT OVERWRITE is used as a workaround to rewrite and compact 
Iceberg tables.

The OPTIMIZE statement offers a new syntax and an Iceberg-specific solution to 
this problem.

This patch introduces the new syntax as an alias for INSERT OVERWRITE.
Syntax:
{code:sql}
OPTIMIZE TABLE <table_name>;
{code}
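
As a rough illustration of the aliasing, assuming a hypothetical Iceberg table named ice_tbl (the name is not from this issue), the two statements below would be expected to have the same effect: the table is rewritten from its own contents, which compacts small data files and applies outstanding delete files. The exact statement the frontend generates may differ from this sketch.
{code:sql}
-- New syntax introduced by this patch.
OPTIMIZE TABLE ice_tbl;

-- Assumed equivalent form of the existing workaround:
-- overwrite the table with a full scan of itself.
INSERT OVERWRITE TABLE ice_tbl SELECT * FROM ice_tbl;
{code}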

  was:
If an Iceberg table is frequently updated/written to in small batches, a lot of 
small files are created. This decreases read performance. Similarly, frequent 
row-level deletes contribute to this problem by creating delete files which 
have to be merged on read.

Currently, INSERT OVERWRITE is used as a workaround to rewrite and compact 
Iceberg tables.

The OPTIMIZE statement offers a new syntax and an Iceberg-specific solution to 
this problem.

This patch introduces the new syntax as an alias for INSERT OVERWRITE.
Syntax:
{code:sql}
OPTIMIZE [TABLE] <table_name>;
{code}


> OPTIMIZE statement as an alias for INSERT OVERWRITE
> ---------------------------------------------------
>
>                 Key: IMPALA-12406
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12406
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Frontend
>            Reporter: Noemi Pap-Takacs
>            Assignee: Noemi Pap-Takacs
>            Priority: Major
>              Labels: impala-iceberg
>
> If an Iceberg table is frequently updated/written to in small batches, a lot 
> of small files are created. This decreases read performance. Similarly, 
> frequent row-level deletes contribute to this problem by creating delete 
> files which have to be merged on read.
> Currently, INSERT OVERWRITE is used as a workaround to rewrite and compact 
> Iceberg tables.
> The OPTIMIZE statement offers a new syntax and an Iceberg-specific solution to 
> this problem.
> This patch introduces the new syntax as an alias for INSERT OVERWRITE.
> Syntax:
> {code:sql}
> OPTIMIZE TABLE <table_name>;
> {code}



