Hyunsik Choi created TAJO-2023:
----------------------------------
Summary: Eliminate the use of FileSystem::getContentSummary
Key: TAJO-2023
URL: https://issues.apache.org/jira/browse/TAJO-2023
Project: Tajo
Issue Type: Improvement
Components: QueryMaster, TajoMaster
Reporter: Hyunsik Choi
To get table volumes, {{FileSystem::getContentSummary}} is widely used in
TajoMaster and QueryMaster, often multiple times within a single query
lifecycle. However, this API incurs significant overhead, especially on S3 with
partitioned tables. The overhead also occurs on HDFS with large partitioned
tables.
The main objective of this issue is to eliminate calls to
{{FileSystem::getContentSummary}} wherever possible. Because this API is used
in many places in the code, it would be better to handle this issue as an
umbrella issue.
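
As an illustration only, one way to avoid repeated volume lookups within a
query lifecycle is to memoize the result per table path. The sketch below is
an assumption, not Tajo code: {{TableVolumeCache}} and its loader function are
hypothetical names, and the loader stands in for whatever expensive call (for
example {{FileSystem::getContentSummary}}) currently computes the volume.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.ToLongFunction;

/**
 * Hypothetical helper (not part of Tajo): remembers the volume computed for
 * each table path so the expensive lookup runs at most once per path.
 */
final class TableVolumeCache {
  // Volumes in bytes, keyed by table path.
  private final Map<String, Long> volumes = new ConcurrentHashMap<>();
  // Stand-in for the expensive call that computes a table's volume.
  private final ToLongFunction<String> loader;

  TableVolumeCache(ToLongFunction<String> loader) {
    this.loader = loader;
  }

  /** Returns the cached volume, invoking the loader only on the first request. */
  long getVolume(String tablePath) {
    return volumes.computeIfAbsent(tablePath, path -> loader.applyAsLong(path));
  }
}
```

A cache like this would be created per query (or invalidated when a table
changes), so later stages reuse the first result instead of traversing the
partition directories again.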
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)