[
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-13154:
------------------------------------
Description:
In the /catalog page of the catalogd WebUI, there is a table for "Top-N Tables
with Highest Memory Requirements". However, not all tables are counted there.
For example, after starting catalogd, run a DESCRIBE on a table to trigger
metadata loading for it. When loading finishes, the table is still not shown in
the WebUI.
The cause is that the list is only updated in HdfsTable.getTHdfsTable() when
'type' is ThriftObjectType.FULL:
[https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
This used to be the place that all code paths using the table went through.
However, we've made a number of optimizations to avoid getting the FULL thrift
object of the table, especially in LocalCatalog mode. We should move the code
that updates the list of largest tables somewhere that all table usages can
reach. For example, we can update the table's estimatedMetadataSize right after
loading its metadata.
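
A minimal sketch of the proposed direction, assuming a standalone tracker that
is notified at the end of every metadata load instead of during FULL thrift
serialization. The class TopTablesSketch and its methods onMetadataLoaded,
onTableRemoved and topN are hypothetical names used only for illustration; they
are not Impala's actual classes or API. Only the idea of recording
estimatedMetadataSize after loading comes from the description above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the catalog's largest-tables tracking.
// Not Impala code; all names here are assumptions for illustration.
final class TopTablesSketch {
  // Latest estimated metadata size (bytes) per fully qualified table name.
  private final Map<String, Long> sizes = new ConcurrentHashMap<>();

  // Intended to be called once a table's metadata load completes, so the
  // entry is recorded no matter how the table is served afterwards.
  void onMetadataLoaded(String tableName, long estimatedMetadataSize) {
    sizes.put(tableName, estimatedMetadataSize);
  }

  // Drops a table that was removed or invalidated.
  void onTableRemoved(String tableName) {
    sizes.remove(tableName);
  }

  // Returns up to n table names, largest estimated size first.
  List<String> topN(int n) {
    // Sort a snapshot so concurrent updates cannot disturb the ordering.
    List<Map.Entry<String, Long>> entries =
        new ArrayList<>(new HashMap<>(sizes).entrySet());
    entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
    List<String> result = new ArrayList<>();
    for (int i = 0; i < Math.min(n, entries.size()); i++) {
      result.add(entries.get(i).getKey());
    }
    return result;
  }
}
```

With this shape, a DESCRIBE that only triggers loading would still register the
table, because the update no longer depends on which thrift object type is
requested later.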
was:
In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables with
Highest Memory Requirements". However, not all tables are counted there. E.g.
after starting catalogd, run a DESCRIBE on a table to trigger metadata loading
on it. When it's done, the table is not shown in the WebUI.
The cause is that the list is only updated in HdfsTable.getTHdfsTable() when
'type' is
ThriftObjectType.FULL:
[https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
This used to be the place that all code paths using the table will go to.
However, we've done bunch of optimizations to not getting the FULL thrift
object of the table. We should move the code of updating the list of largest
tables somewhere that all table usages can reach, e.g. after loading the
metadata of the table, we can update its estimatedMetadataSize.
> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
> Key: IMPALA-13154
> URL: https://issues.apache.org/jira/browse/IMPALA-13154
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Priority: Major
> Labels: catalog-2024