Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/20866 )


Change subject: IMPALA-12412: Support partition evolution in OPTIMIZE statement
......................................................................

IMPALA-12412: Support partition evolution in OPTIMIZE statement

The OPTIMIZE statement executes table maintenance tasks on
Iceberg tables, such as:
 1. compacting small files,
 2. merging delete deltas,
 3. rewriting the table according to the latest partition spec.

OptimizeStmt creates a source statement that contains all
columns of the table. All table content is rewritten
to new data files. After the executors have finished writing,
the Catalog calls the Iceberg RewriteFiles API to commit the changes.
All previous data and delete files are excluded from the next
snapshot, and all newly written data files are added to it.
The old files remain accessible via time travel
to older snapshots of the table.
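
The snapshot behavior described above can be illustrated with a toy
model. This is only a sketch and does not use the Iceberg API: the
Snapshot class, the commit_rewrite function, the sequential snapshot
ids and the file names are all hypothetical, chosen to show that the
new snapshot holds only the rewritten data files while older
snapshots keep their original file lists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    # Hypothetical, simplified stand-in for a table snapshot.
    snapshot_id: int
    data_files: frozenset
    delete_files: frozenset


def commit_rewrite(history, new_data_files):
    """Append a snapshot that replaces all current files with the rewritten ones.

    Earlier snapshots are left untouched, which is what keeps the old
    files reachable through time travel in this model.
    """
    current = history[-1]
    new_snapshot = Snapshot(
        snapshot_id=current.snapshot_id + 1,  # assumption: ids simply increase
        data_files=frozenset(new_data_files),
        delete_files=frozenset(),  # no delete files survive the rewrite
    )
    return history + [new_snapshot]


before = [
    Snapshot(1, frozenset({"a.parquet", "b.parquet"}), frozenset({"a-deletes.parquet"}))
]
after = commit_rewrite(before, ["optimized-0.parquet"])
```

In this model, after[-1] lists only optimized-0.parquet, while
after[0] still lists the original data and delete files.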

By default, Impala has as many file writers as instances and
therefore writes at least that many files.
For smaller tables this can be limited by setting
the MAX_FS_WRITERS query option.
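
The effect of the option on the writer count can be sketched as
follows. This is not Impala code: effective_writers is a hypothetical
helper, and it assumes that a value of 0 means "no limit" and that
every writer receives at least one row, so the writer count is also
the lower bound on the number of files.

```python
def effective_writers(num_instances: int, max_fs_writers: int = 0) -> int:
    """Return the number of file writers under the assumptions stated above."""
    if num_instances < 1:
        raise ValueError("num_instances must be positive")
    if max_fs_writers < 0:
        raise ValueError("max_fs_writers must not be negative")
    if max_fs_writers == 0:
        # Assumption: 0 leaves the writer count unlimited.
        return num_instances
    # Otherwise the option caps the number of writers.
    return min(num_instances, max_fs_writers)


unlimited = effective_writers(12)     # 12 writers, so at least 12 files
capped = effective_writers(12, 2)     # 2 writers, so at least 2 files
```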

Authorization: OPTIMIZE TABLE requires ALL privileges.

Limitations:
All limitations of writing to Iceberg tables apply.

Testing:
 - E2E tests:
        - schema evolution
        - partition evolution
        - UPDATE/DELETE
        - time travel
        - table history
 - negative tests
 - Ranger tests for authorization
 - FE: Planner test:
        - sorting order
        - MAX_FS_WRITERS
        - partitioned exchange

Change-Id: I65a0c8529d274afff38ccd582f1b8a857716b1b5
---
M be/src/service/client-request-state.cc
M common/thrift/Types.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/IcebergUpdateImpl.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/OptimizeStmt.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/service/IcebergCatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/iceberg-optimize.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by-zorder.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-optimize.test
M testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/query_test/test_iceberg.py
19 files changed, 537 insertions(+), 189 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/66/20866/5
--
To view, visit http://gerrit.cloudera.org:8080/20866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I65a0c8529d274afff38ccd582f1b8a857716b1b5
Gerrit-Change-Number: 20866
Gerrit-PatchSet: 5
Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
