[ 
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17889635#comment-17889635
 ] 

Zoltán Borók-Nagy commented on IMPALA-13154:
--------------------------------------------

Thanks Quanlong for filing this. The same applies to Top-N Tables with Most 
Number of Files.

> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-13154
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13154
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Priority: Major
>              Labels: catalog-2024
>
> In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables 
> with Highest Memory Requirements". However, not all tables are counted there. 
> E.g. after starting catalogd, run a DESCRIBE on a table to trigger metadata 
> loading on it. When it's done, the table is not shown in the WebUI.
> The cause is that the list is only updated in HdfsTable.getTHdfsTable() when 
> 'type' is ThriftObjectType.FULL:
> [https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
> This used to be the place that all code paths using the table would go 
> through. However, we've done a bunch of optimizations to avoid getting the 
> FULL thrift object of the table. We should move the code that updates the 
> list of largest tables somewhere that all table usages can reach, e.g. after 
> loading the metadata of the table, we can update its estimatedMetadataSize.
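
A minimal sketch of the direction proposed in the description, offered as an illustration only and not as the actual Impala code. TopTablesSketch, SketchTable, load() and describe() are hypothetical names made up for this example; the only point is that the top-N bookkeeping is refreshed when metadata loading finishes, so it no longer depends on whether a FULL thrift object is ever requested.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the bookkeeping behind the "Top-N" lists.
class TopTablesSketch {
  private final int limit;
  private final Map<String, Long> sizeByTable = new ConcurrentHashMap<>();

  TopTablesSketch(int limit) {
    this.limit = limit;
  }

  // Records (or refreshes) the estimated metadata size of a table.
  void update(String tableName, long estimatedMetadataSize) {
    sizeByTable.put(tableName, estimatedMetadataSize);
  }

  // Returns up to 'limit' table names, largest estimated size first.
  List<String> topN() {
    List<Map.Entry<String, Long>> entries = new ArrayList<>(sizeByTable.entrySet());
    entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
    List<String> names = new ArrayList<>();
    for (Map.Entry<String, Long> e : entries.subList(0, Math.min(limit, entries.size()))) {
      names.add(e.getKey());
    }
    return names;
  }
}

// Hypothetical table: the tracker is updated when loading finishes,
// not when a FULL thrift object happens to be requested.
class SketchTable {
  private final String name;
  private final TopTablesSketch tracker;
  private long estimatedMetadataSize;

  SketchTable(String name, TopTablesSketch tracker) {
    this.name = name;
    this.tracker = tracker;
  }

  // Assumption: the size estimate is available once loading completes.
  void load(long sizeComputedDuringLoad) {
    estimatedMetadataSize = sizeComputedDuringLoad;
    tracker.update(name, estimatedMetadataSize);
  }

  // Serialization has no side effect on the top-N list in this sketch.
  String describe(boolean full) {
    return (full ? "FULL:" : "DESCRIPTOR:") + name;
  }
}
```

With this shape, a table that is loaded by a DESCRIBE and never serialized as FULL still shows up in topN(), which is the behavior the description asks for.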



