Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
......................................................................

IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance

As the cost of directory listing on Cloud Storage Systems such as OSS or
S3 is higher than the cost on HDFS, we could create the file descriptors
from the rich metadata provided by Iceberg instead of using
org.apache.hadoop.fs.FileSystem#listFiles. The only thing missing there
is the last_modification_time of the files. But since Iceberg files are
immutable, we could just come up with a special timestamp for these
files.

At the same time, we can also construct file descriptors ourselves
during time travel to reduce the cost of requests with OSS services.

Test:
 * existing tests
 * test on COS with my local test environment

Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Reviewed-on: http://gerrit.cloudera.org:8080/19379
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
9 files changed, 332 insertions(+), 107 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Zoltan Borok-Nagy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 13
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Xiaoqing Gao <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>

Reply via email to