[ 
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890766#comment-17890766
 ] 

Wenzhe Zhou edited comment on IMPALA-13154 at 10/18/24 5:30 AM:
----------------------------------------------------------------

Thanks [~stigahuang] for looking into this issue.
I have a few questions:
What are the cases where HdfsTable.load() is not called for the table? Iceberg 
tables call HdfsTable.getTHdfsTable() directly.
Do we need to re-calculate fileMetadataStats_ in HdfsTable.load()?
Could we still set the metrics in HdfsTable.getTHdfsTable(), but change the 
code to calculate estimatedMetadataSize when 'type' is not equal to 
ThriftObjectType.FULL? 


was (Author: wzhou):
Thanks [~stigahuang] for looking into this issue.
I have a few questions:
What are the cases where HdfsTable.load() is not called for the table?
Do we need to re-calculate fileMetadataStats_ in HdfsTable.load()?
Could we still set the metrics in HdfsTable.getTHdfsTable(), but change the 
code to calculate estimatedMetadataSize when 'type' is not equal to 
ThriftObjectType.FULL? 

> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-13154
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13154
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Priority: Major
>              Labels: catalog-2024
>
> In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables 
> with Highest Memory Requirements". However, not all tables are counted there. 
> E.g. after starting catalogd, run a DESCRIBE on a table to trigger metadata 
> loading on it. When it's done, the table is not shown in the WebUI.
> The cause is that the list is only updated in HdfsTable.getTHdfsTable() when 
> 'type' is ThriftObjectType.FULL:
> [https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
> This used to be the place that all code paths using the table went through. 
> However, we've done a bunch of optimizations to avoid getting the FULL thrift 
> object of the table, especially in LocalCatalog mode. We should move the code 
> that updates the list of largest tables to somewhere that all table usages 
> reach, e.g. after loading the metadata of the table, we can update its 
> estimatedMetadataSize.
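
A minimal sketch of the proposed direction, assuming the size estimate can be
computed right after the metadata load finishes. Every name below (TopNSketch,
LargestTables, SketchTable, BYTES_PER_FILE, the two-value enum) is a
hypothetical stand-in for illustration and is not Impala's actual code; the
per-file cost is a made-up constant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only; none of these classes exist in Impala.
class TopNSketch {
  // Simplified stand-in for the thrift object types a caller may request.
  enum ThriftObjectType { FULL, DESCRIPTOR_ONLY }

  /** Keeps the latest size estimate per table and reports the N largest. */
  static class LargestTables {
    private final Map<String, Long> sizes = new ConcurrentHashMap<>();

    void update(String tableName, long estimatedBytes) {
      sizes.put(tableName, estimatedBytes);
    }

    List<String> topN(int n) {
      List<Map.Entry<String, Long>> entries = new ArrayList<>(sizes.entrySet());
      // Sort by estimated size, largest first.
      entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
      List<String> names = new ArrayList<>();
      for (int i = 0; i < Math.min(n, entries.size()); i++) {
        names.add(entries.get(i).getKey());
      }
      return names;
    }
  }

  /** Toy table whose size estimate depends only on its file count. */
  static class SketchTable {
    // Made-up per-file metadata cost, used only to produce some estimate.
    static final long BYTES_PER_FILE = 500L;

    private final String name;
    private final LargestTables tracker;
    private long numFiles;

    SketchTable(String name, LargestTables tracker) {
      this.name = name;
      this.tracker = tracker;
    }

    long estimateMetadataSize() {
      return numFiles * BYTES_PER_FILE;
    }

    void load(long numFiles) {
      this.numFiles = numFiles;
      // Update the tracker at load time, so the table is counted no matter
      // which thrift object type later readers ask for.
      tracker.update(name, estimateMetadataSize());
    }

    String toThrift(ThriftObjectType type) {
      // Deliberately no tracker update here: tables whose callers never
      // request FULL would otherwise be missing from the list.
      return name + ":" + type;
    }
  }

  public static void main(String[] args) {
    LargestTables tracker = new LargestTables();
    SketchTable small = new SketchTable("db.small", tracker);
    SketchTable big = new SketchTable("db.big", tracker);
    small.load(10);
    big.load(1000);
    // Neither table is ever serialized as FULL, yet both are tracked.
    small.toThrift(ThriftObjectType.DESCRIPTOR_ONLY);
    System.out.println(tracker.topN(2));
  }
}
```

The sketch does not address the open questions in the comment above, such as
tables for which load() is never called.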


