lipeng...@apache.org has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
......................................................................

IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance

Iceberg provides rich metadata, and directory listing on object storage
services (OSS), e.g. S3A, costs more than on HDFS. We can therefore
create the file descriptors from Iceberg metadata instead of calling
org.apache.hadoop.fs.FileSystem#listFiles. The only thing missing from
the metadata is the last_modification_time of the files. But since
Iceberg files are immutable, we can assign a special placeholder
timestamp to these files.
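
The idea in the paragraph above can be sketched roughly as follows. This
is a minimal, self-contained illustration and not the code in this
patch: DataFileEntry, FileDesc, PLACEHOLDER_MTIME and fromMetadata are
hypothetical names, and the placeholder value 0 is an assumption made
only for the example. It shows descriptors being built from
(path, size) pairs already present in table metadata, so no listing
call to the storage service is needed.

```java
import java.util.ArrayList;
import java.util.List;

class IcebergFdSketch {
  // Hypothetical stand-in for one data file entry read from Iceberg metadata.
  public static final class DataFileEntry {
    public final String path;
    public final long fileSizeInBytes;

    public DataFileEntry(String path, long fileSizeInBytes) {
      this.path = path;
      this.fileSizeInBytes = fileSizeInBytes;
    }
  }

  // Hypothetical stand-in for a file descriptor kept by the catalog.
  public static final class FileDesc {
    public final String path;
    public final long length;
    public final long modificationTime;

    public FileDesc(String path, long length, long modificationTime) {
      this.path = path;
      this.length = length;
      this.modificationTime = modificationTime;
    }
  }

  // Assumed placeholder: the metadata carries no modification time, and the
  // files are immutable, so every descriptor gets the same fixed value.
  public static final long PLACEHOLDER_MTIME = 0L;

  // Builds one descriptor per metadata entry without listing any directory.
  public static List<FileDesc> fromMetadata(List<DataFileEntry> entries) {
    List<FileDesc> result = new ArrayList<>(entries.size());
    for (DataFileEntry e : entries) {
      result.add(new FileDesc(e.path, e.fileSizeInBytes, PLACEHOLDER_MTIME));
    }
    return result;
  }

  public static void main(String[] args) {
    List<DataFileEntry> entries = new ArrayList<>();
    entries.add(new DataFileEntry("s3a://bucket/tbl/data/a.parquet", 128L));
    entries.add(new DataFileEntry("s3a://bucket/tbl/data/b.parquet", 256L));
    for (FileDesc fd : fromMetadata(entries)) {
      System.out.println(fd.path + " " + fd.length + " " + fd.modificationTime);
    }
  }
}
```

Using one fixed timestamp means the modification time cannot be used to
detect changed files; that is acceptable here only because the files
are never rewritten in place.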

Likewise, we can construct the file descriptors ourselves during time
travel to reduce the cost of requests to OSS services.

Tests:
 * existing tests
 * tested on COS in my local test environment

Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
8 files changed, 334 insertions(+), 100 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/19379/8
--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 8
Gerrit-Owner: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Smith <michael.sm...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
