[ 
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Quanlong Huang updated IMPALA-13154:
------------------------------------
    Description: 
In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables with 
Highest Memory Requirements". However, not all tables are counted there. E.g. 
after starting catalogd, run a DESCRIBE on a table to trigger metadata loading 
on it. When it's done, the table is not shown in the WebUI.

The cause is that the list is only updated in HdfsTable.getTHdfsTable() when 
'type' is ThriftObjectType.FULL:
[https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]

This used to be the place that all code paths using the table went through. 
However, we've done a bunch of optimizations to avoid getting the FULL thrift 
object of the table, especially in LocalCatalog mode. We should move the code 
that updates the list of largest tables somewhere that all table usages can 
reach, e.g. after loading the metadata of the table, we can update its 
estimatedMetadataSize.
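
To illustrate the proposed direction, here is a minimal, self-contained sketch. It is not the actual Impala code: LargestTables, SketchTable, ThriftType and their methods are hypothetical stand-ins, and the real fix may look quite different. It only shows the idea of reporting the estimated size at the end of metadata loading, so the Top-N list no longer depends on anyone requesting the FULL thrift object.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical stand-in for the structure backing the
// "Top-N Tables with Highest Memory Requirements" list.
final class LargestTables {
  private final Map<String, Long> sizes = new ConcurrentHashMap<>();

  // Records (or overwrites) the estimated metadata size of a table.
  void update(String tableName, long estimatedBytes) {
    sizes.put(tableName, estimatedBytes);
  }

  // Returns up to n table names, largest estimated size first.
  List<String> topN(int n) {
    return sizes.entrySet().stream()
        .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
        .limit(n)
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}

// Hypothetical, heavily simplified table; not the real HdfsTable.
final class SketchTable {
  // Assumed enum for this sketch only.
  enum ThriftType { FULL, DESCRIPTOR_ONLY }

  private final String name;
  private final LargestTables tracker;
  private long estimatedMetadataSize;

  SketchTable(String name, LargestTables tracker) {
    this.name = name;
    this.tracker = tracker;
  }

  // Proposed placement: update the tracker once loading finishes, so every
  // loaded table is counted regardless of how it is serialized later.
  void load(long loadedBytes) {
    estimatedMetadataSize = loadedBytes;
    tracker.update(name, estimatedMetadataSize);
  }

  // Serialization no longer has the side effect of updating the tracker.
  String toThrift(ThriftType type) {
    return name + ":" + type;
  }

  long getEstimatedMetadataSize() {
    return estimatedMetadataSize;
  }
}
```

With this arrangement, a table that is loaded by a DESCRIBE and never serialized as FULL still appears in topN(), because load() is the only place that reports its size.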

  was:
In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables with 
Highest Memory Requirements". However, not all tables are counted there. E.g. 
after starting catalogd, run a DESCRIBE on a table to trigger metadata loading 
on it. When it's done, the table is not shown in the WebUI.

The cause is that the list is only updated in HdfsTable.getTHdfsTable() when 
'type' is 
ThriftObjectType.FULL:
[https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]

This used to be the place that all code paths using the table will go to. 
However, we've done bunch of optimizations to not getting the FULL thrift 
object of the table. We should move the code of updating the list of largest 
tables somewhere that all table usages can reach, e.g. after loading the 
metadata of the table, we can update its estimatedMetadataSize.


> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-13154
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13154
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Priority: Major
>              Labels: catalog-2024


