Hello Tamas Mate, Gabor Kaszab, [email protected], Gergely Fürnstáhl, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18847

to look at the new patch set (#2).

Change subject: WIP: IMPALA-11484: Create SCAN plan for Iceberg V2 position 
delete tables
......................................................................

WIP: IMPALA-11484: Create SCAN plan for Iceberg V2 position delete tables

This patch adds support for reading Iceberg V2 tables use position
deletes. Equality deletes are still not supported. Position delete
files store the file path and file position of the deleted rows.

When an Iceberg table has position delete files we need to do an
ANTI JOIN between data files and delete files. From the data files
we need to query the virtual columns INPUT__FILE__NAME and
FILE__POSITION, while from the delete files we need the data columns
'file_path' and 'pos'. The latter data columns are added as 'hidden
columns' to Iceberg tables. 'Hidden column' is a new concept introduced
by this patch.

This patch introduces a new class 'IcebergScanPlanner' which has
the responsibility of doing a plan for Iceberg table scans. It creates
the aforementioned ANTI JOIN. Also, if there are data files without
corresponding delete files, we can have a separate SCAN node and its
results would be UNIONed to the rows coming from the ANTI JOIN:

              UNION
             /     \
    SCAN data       ANTI JOIN
                     /      \
              SCAN data    SCAN deletes

Predicate pushdown and time travel logic is transferred from
IcebergScanNode to IcebergScanPlanner.

TODO:
 * better cardinality estimates
 * add tests

Change-Id: I672cfee18d8e131772d90378d5b12ad4d0f7dd48
---
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetTblProperties.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/Column.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M 
testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_v2_delete_positional/data/00191-4-6e780302-527b-4911-8c6e-88d416adac57-00001.parquet
18 files changed, 850 insertions(+), 537 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/47/18847/2
--
To view, visit http://gerrit.cloudera.org:8080/18847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I672cfee18d8e131772d90378d5b12ad4d0f7dd48
Gerrit-Change-Number: 18847
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>

Reply via email to