Noemi Pap-Takacs has uploaded a new patch set (#8).
( http://gerrit.cloudera.org:8080/23838 )

Change subject: IMPALA-14564: Remove redundant partition info from Iceberg file 
descriptors
......................................................................

IMPALA-14564: Remove redundant partition info from Iceberg file descriptors

Iceberg file descriptors used to contain information about the
partition they belong to: the spec id and the partition values.
These fields uniquely identify the partition and depend only on
the partition, not on the file itself, so storing them in every
file descriptor in the Catalog is redundant.
Instead, the partition information is now stored separately in the
IcebergContentFileStore in FlatBuffers binary format, and each file
descriptor stores only the unique id of the partition it belongs to.

Since the scan nodes need the partition information for execution,
the Planner looks up the relevant partitions by their ids and puts
them in the file descriptors that are sent to the executors.
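
The idea can be illustrated with a minimal sketch. It is not the code
from this patch: the class names (PartitionDedupSketch, PartitionInfo,
FileDesc, Store) and their methods are hypothetical, and plain Java
objects stand in for the FlatBuffers-encoded data. It only shows the
pattern of storing each distinct partition once, keeping an id in the
file descriptor, and resolving the id when the partition is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class PartitionDedupSketch {
  /** Hypothetical partition identity: spec id plus partition values. */
  static final class PartitionInfo {
    final int specId;
    final List<String> values;

    PartitionInfo(int specId, List<String> values) {
      this.specId = specId;
      this.values = new ArrayList<>(values);
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof PartitionInfo)) return false;
      PartitionInfo other = (PartitionInfo) o;
      return specId == other.specId && values.equals(other.values);
    }

    @Override public int hashCode() { return Objects.hash(specId, values); }
  }

  /** Hypothetical slim file descriptor: holds only the partition id. */
  static final class FileDesc {
    final String path;
    final int partitionId;

    FileDesc(String path, int partitionId) {
      this.path = path;
      this.partitionId = partitionId;
    }
  }

  /** Hypothetical store that keeps every distinct partition exactly once. */
  static final class Store {
    private final Map<PartitionInfo, Integer> idsByPartition = new HashMap<>();
    private final List<PartitionInfo> partitionsById = new ArrayList<>();
    private final List<FileDesc> files = new ArrayList<>();

    /** Registers a file, reusing the id of an already known partition. */
    void addFile(String path, PartitionInfo partition) {
      Integer id = idsByPartition.get(partition);
      if (id == null) {
        id = partitionsById.size();
        idsByPartition.put(partition, id);
        partitionsById.add(partition);
      }
      files.add(new FileDesc(path, id));
    }

    /** Resolves the partition of a file, as a planner would before scanning. */
    PartitionInfo lookup(FileDesc fd) { return partitionsById.get(fd.partitionId); }

    int partitionCount() { return partitionsById.size(); }

    List<FileDesc> files() { return files; }
  }

  public static void main(String[] args) {
    Store store = new Store();
    store.addFile("a.parquet", new PartitionInfo(0, Arrays.asList("2024", "01")));
    store.addFile("b.parquet", new PartitionInfo(0, Arrays.asList("2024", "01")));
    store.addFile("c.parquet", new PartitionInfo(0, Arrays.asList("2024", "02")));
    for (FileDesc fd : store.files()) {
      System.out.println(fd.path + " -> partition id " + fd.partitionId
          + ", values " + store.lookup(fd).values);
    }
    System.out.println("distinct partitions stored: " + store.partitionCount());
  }
}
```

In this sketch the first two files share one stored partition entry,
so the per-file cost is a single integer rather than a copy of the
spec id and partition values.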

This change introduces a separate FlatBuffer schema to serialize
the necessary Iceberg file metadata from the Catalog to the Frontend
and from the Frontend to the executors (in scan ranges). The schema
contains only the relevant fields, to keep memory usage low.

Testing: ran existing e2e tests

Generated by Copilot (Claude Sonnet 4.6)
Change-Id: I57c2fd6f1ebb636aa9e7ca925413ca51858cbc2a
---
M be/src/exec/file-metadata-utils.cc
M be/src/exec/file-metadata-utils.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M common/fbs/CatalogObjects.fbs
M common/fbs/IcebergObjects.fbs
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
M fe/src/main/java/org/apache/impala/catalog/IcebergDeleteTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
16 files changed, 268 insertions(+), 116 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/23838/8
--
To view, visit http://gerrit.cloudera.org:8080/23838
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I57c2fd6f1ebb636aa9e7ca925413ca51858cbc2a
Gerrit-Change-Number: 23838
Gerrit-PatchSet: 8
Gerrit-Owner: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
