lipeng...@apache.org has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/19379 )

Change subject: IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance
......................................................................

IMPALA-11662: Improve 'refresh iceberg_tbl_on_oss' performance

As the cost of directory listing on Cloud Storage Systems such as OSS or S3
is higher than the cost on HDFS, we could create the file descriptors from
the rich metadata provided by Iceberg instead of using
org.apache.hadoop.fs.FileSystem#listFiles. The only thing missing there is
the last_modification_time of the files. But since Iceberg files are
immutable, we could just come up with a special timestamp for these files.

At the same time, we can also construct file descriptors ourselves during
time travel to reduce the cost of requests with OSS services.

Test:
 * existing tests
 * test on COS with my local test environment

Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
---
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
A fe/src/main/java/org/apache/impala/catalog/IcebergFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/catalog/iceberg/GroupedContentFiles.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
9 files changed, 332 insertions(+), 107 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/19379/9
--
To view, visit http://gerrit.cloudera.org:8080/19379
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If2ee8b6b7559e6590698b46ef1d574e55ed52f9a
Gerrit-Change-Number: 19379
Gerrit-PatchSet: 9
Gerrit-Owner: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Anonymous Coward <lipeng...@apache.org>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Xiaoqing Gao <gaoxq...@gmail.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
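
For readers skimming this notification, the idea in the commit message (build file descriptors from the path and size that Iceberg metadata already records, and substitute a fixed timestamp because the files are immutable) might be sketched roughly as below. This is only an illustration and is not taken from the patch: the class names DataFileInfo and FileDesc, the method fromIcebergMetadata, and the placeholder value 0 are all hypothetical stand-ins, and the real change works with Impala's and Iceberg's own types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IcebergFileDescriptorSketch {
    // Assumption: some fixed "special" timestamp. The commit message does not
    // say which value the patch uses; 0 is chosen here only for illustration.
    static final long PLACEHOLDER_MTIME = 0L;

    /** Hypothetical stand-in for the per-file metadata Iceberg records. */
    static final class DataFileInfo {
        final String path;
        final long sizeBytes;

        DataFileInfo(String path, long sizeBytes) {
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Hypothetical stand-in for a catalog file descriptor. */
    static final class FileDesc {
        final String path;
        final long length;
        final long modificationTime;

        FileDesc(String path, long length, long modificationTime) {
            this.path = path;
            this.length = length;
            this.modificationTime = modificationTime;
        }
    }

    /**
     * Builds descriptors purely from already-loaded metadata, so no
     * directory-listing request is sent to the object store.
     */
    static List<FileDesc> fromIcebergMetadata(List<DataFileInfo> files) {
        List<FileDesc> result = new ArrayList<>(files.size());
        for (DataFileInfo f : files) {
            // Files are immutable, so a constant timestamp never goes stale.
            result.add(new FileDesc(f.path, f.sizeBytes, PLACEHOLDER_MTIME));
        }
        return result;
    }

    public static void main(String[] args) {
        List<DataFileInfo> files = Arrays.asList(
            new DataFileInfo("oss://bucket/tbl/data/a.parquet", 1024L),
            new DataFileInfo("oss://bucket/tbl/data/b.parquet", 2048L));
        for (FileDesc fd : fromIcebergMetadata(files)) {
            System.out.println(fd.path + " " + fd.length + " " + fd.modificationTime);
        }
    }
}
```

The point of the sketch is that the number of storage requests no longer depends on the number of directories, since every field of the descriptor comes from metadata that has already been read.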