Hello Daniel Becker, Zoltan Borok-Nagy, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20595

to look at the new patch set (#3).

Change subject: IMPALA-11387: Introduce virtual column to expose Iceberg's 
file-level data sequence number
......................................................................

IMPALA-11387: Introduce virtual column to expose Iceberg's file-level data 
sequence number

The data sequence number is used to decide whether an equality
delete file should be applied to a data file.

Iceberg has two different sequence numbers at the ContentFile level: the
file sequence number and the data sequence number.
From the Iceberg comments on the ContentFile class:
https://github.com/apache/iceberg/blob/ebce8538db20fd13859b6af841cf433d9423b53c/api/src/main/java/org/apache/iceberg/ContentFile.java#L130
The file sequence number is always assigned at commit and cannot be
provided explicitly, unlike the data sequence number. The file
sequence number does not change once assigned. In case of a rewrite
(like compaction), the file sequence number can be higher than the data
sequence number.
New snapshots can add files that belong to older sequence numbers
(e.g. compaction), where the data sequence number remains the same as it
was in the older snapshot.
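
To illustrate why the data sequence number, rather than the file
sequence number, is the one needed here, below is a minimal Python
sketch. It assumes the rule from the Iceberg spec that an equality
delete applies only to data files with a strictly lower data sequence
number; partition matching is left out. ContentFileInfo,
equality_delete_applies, the file names and the sequence numbers are
hypothetical and are not part of this patch or of Iceberg's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentFileInfo:
    """Hypothetical stand-in for an Iceberg ContentFile (illustration only)."""
    path: str
    data_sequence_number: int
    file_sequence_number: int


def equality_delete_applies(data_file, delete_file):
    # Assumed rule (Iceberg spec): the data file's data sequence number must
    # be strictly lower than the delete file's. Partition checks are omitted.
    return data_file.data_sequence_number < delete_file.data_sequence_number


# Data file written at sequence number 1, equality delete committed at 2.
original = ContentFileInfo("data-1.parquet", 1, 1)
eq_delete = ContentFileInfo("eq-delete-2.parquet", 2, 2)

# A rewrite committed at sequence number 3 that keeps data sequence number 1.
compacted = ContentFileInfo("data-1-compacted.parquet", 1, 3)

# A data file added after the delete was committed.
newer = ContentFileInfo("data-3.parquet", 3, 3)
```

With these values the delete still applies to the rewritten file, while
comparing against its file sequence number (3) would wrongly skip it.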

This patch adds the data sequence number as a virtual column for Iceberg
tables. It can be queried like:
SELECT ICEBERG__DATA__SEQUENCE__NUMBER FROM <iceberg_table>;

Testing:
  - Added E2E tests to exercise the new virtual column for V1 and V2
    tables, in both partitioned and unpartitioned cases.

Change-Id: Id950e97782a2a29b505164470cfb646c5358dfca
---
M be/src/exec/file-metadata-utils.cc
M be/src/exec/hdfs-scan-node-base.cc
M common/fbs/IcebergObjects.fbs
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/VirtualColumn.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-virtual-columns.test
9 files changed, 213 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/20595/3
--
To view, visit http://gerrit.cloudera.org:8080/20595
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id950e97782a2a29b505164470cfb646c5358dfca
Gerrit-Change-Number: 20595
Gerrit-PatchSet: 3
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
