[
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17891090#comment-17891090
]
Wenzhe Zhou commented on IMPALA-13154:
--------------------------------------
[~stigahuang] Have a few more questions:
1) HdfsTable.load() does not always perform a full metadata load. When it is
called for a partial load, file metadata may not be loaded for all
partitions. In that case we cannot calculate estimatedMetadataSize and the total
number of files to update the metrics, or the results are far from the real values.
2) Could we keep the old behavior when showing "Top-N Tables with Highest
Memory Requirements" and "Top-N Tables Most Number of Files" in the WebUI? If
tables without fully loaded metadata are counted in the table list, the results
shown in the WebUI are not accurate.
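
To make the concern in 1) and 2) concrete, here is a minimal sketch of the
behavior I have in mind: the Top-N lists are only refreshed when a load was a
full one. Everything here is hypothetical (the TopNTableTracker class, the
onTableLoaded() hook and its fullLoad flag are not existing Impala code); it
only illustrates the idea and assumes the caller knows whether the load was
full.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tracker for the Top-N lists; not Impala's actual class. */
class TopNTableTracker {
  /** Per-table numbers recorded at the last full metadata load. */
  private static final class Stats {
    final long estimatedMetadataSize;
    final long numFiles;

    Stats(long estimatedMetadataSize, long numFiles) {
      this.estimatedMetadataSize = estimatedMetadataSize;
      this.numFiles = numFiles;
    }
  }

  private final int n;
  private final Map<String, Stats> statsByTable = new HashMap<>();

  TopNTableTracker(int n) {
    this.n = n;
  }

  /**
   * Assumed hook, called after a table load finishes. Partial loads are
   * ignored because their size and file counts may not cover all partitions.
   * Returns true if the stats were recorded.
   */
  synchronized boolean onTableLoaded(
      String tableName, long estimatedMetadataSize, long numFiles, boolean fullLoad) {
    if (!fullLoad) return false;
    statsByTable.put(tableName, new Stats(estimatedMetadataSize, numFiles));
    return true;
  }

  /** Names of up to N tables with the largest estimated metadata size. */
  synchronized List<String> topByMetadataSize() {
    return top(Comparator.comparingLong(
        (String t) -> statsByTable.get(t).estimatedMetadataSize).reversed());
  }

  /** Names of up to N tables with the most files. */
  synchronized List<String> topByNumFiles() {
    return top(Comparator.comparingLong(
        (String t) -> statsByTable.get(t).numFiles).reversed());
  }

  // Sorts all tracked table names with the given order and keeps the first N.
  private List<String> top(Comparator<String> order) {
    List<String> names = new ArrayList<>(statsByTable.keySet());
    names.sort(order);
    return new ArrayList<>(names.subList(0, Math.min(n, names.size())));
  }
}
```

With this shape, a table that was fully loaded earlier keeps its last
full-load numbers if it is later reloaded partially, and a table that has only
ever been partially loaded never appears in either list.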
> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
> Key: IMPALA-13154
> URL: https://issues.apache.org/jira/browse/IMPALA-13154
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Priority: Major
> Labels: catalog-2024
>
> In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables
> with Highest Memory Requirements". However, not all tables are counted there.
> E.g. after starting catalogd, run a DESCRIBE on a table to trigger metadata
> loading on it. When it's done, the table is not shown in the WebUI.
> The cause is that the list is only updated in HdfsTable.getTHdfsTable() when
> 'type' is ThriftObjectType.FULL:
> [https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
> This used to be the place that all code paths using the table would go through.
> However, we've done a bunch of optimizations to avoid getting the FULL thrift
> object of the table, especially in LocalCatalog mode. We should move the code
> of updating the list of largest tables somewhere that all table usages can
> reach, e.g. after loading the metadata of the table, we can update its
> estimatedMetadataSize.