LiPenglin created IMPALA-11662:
----------------------------------
Summary: Improve "refresh iceberg_tbl_on_oss;" performance
Key: IMPALA-11662
URL: https://issues.apache.org/jira/browse/IMPALA-11662
Project: IMPALA
Issue Type: Improvement
Reporter: LiPenglin
Directory listing on an OSS service (e.g. via S3A) costs more than on HDFS.
Since Iceberg already provides rich metadata, we could create the file
descriptors from the Iceberg metadata instead of calling
org.apache.hadoop.fs.FileSystem#listFiles. See
https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java#L189
The only thing missing from the Iceberg metadata is the last_modification_time
of the files. But since Iceberg files are immutable, we could probably just
assign a special timestamp to these files.
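
A rough sketch of the idea, not the actual Impala code: the path and size of
each file are taken from entries that would come out of the Iceberg metadata,
and a fixed sentinel stands in for the modification time, so no directory
listing is needed. DataFileEntry, FileDesc, fromIcebergMetadata and
UNKNOWN_MODIFICATION_TIME are hypothetical names made up for illustration, and
the sentinel value 0 is an assumption. Whether the rest of the catalog can
tolerate a fixed timestamp would still need to be checked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IcebergFileDescriptorSketch {
    // Assumption: a fixed sentinel used in place of the real modification
    // time, which is acceptable only because Iceberg files are immutable.
    static final long UNKNOWN_MODIFICATION_TIME = 0L;

    // Hypothetical stand-in for a data file entry read from Iceberg metadata
    // (file location and size in bytes).
    static final class DataFileEntry {
        final String path;
        final long sizeInBytes;

        DataFileEntry(String path, long sizeInBytes) {
            this.path = path;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // Hypothetical, simplified file descriptor; not Impala's real class.
    static final class FileDesc {
        final String path;
        final long length;
        final long modificationTime;

        FileDesc(String path, long length, long modificationTime) {
            this.path = path;
            this.length = length;
            this.modificationTime = modificationTime;
        }
    }

    // Builds descriptors purely from metadata entries; no FileSystem call
    // (and therefore no remote directory listing) is made.
    static List<FileDesc> fromIcebergMetadata(List<DataFileEntry> entries) {
        List<FileDesc> result = new ArrayList<>(entries.size());
        for (DataFileEntry e : entries) {
            result.add(new FileDesc(e.path, e.sizeInBytes, UNKNOWN_MODIFICATION_TIME));
        }
        return result;
    }

    public static void main(String[] args) {
        List<DataFileEntry> entries = Arrays.asList(
            new DataFileEntry("s3a://bucket/tbl/data/a.parquet", 1024L),
            new DataFileEntry("s3a://bucket/tbl/data/b.parquet", 2048L));
        for (FileDesc fd : fromIcebergMetadata(entries)) {
            System.out.println(fd.path + " len=" + fd.length
                + " mtime=" + fd.modificationTime);
        }
    }
}
```

In a real implementation the entries would come from planning a scan over the
Iceberg table rather than from a hard-coded list.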