[ 
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17909431#comment-17909431
 ] 

ASF subversion and git services commented on IMPALA-13154:
----------------------------------------------------------

Commit 731c16c73ab20cdd0d4a03e24a1a898b0489c7ed in impala's branch 
refs/heads/master from Xuebin Su
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=731c16c73 ]

IMPALA-13154: Update metrics when loading an HDFS table

Previously, some table metrics, such as the estimated memory usage
and the number of files, were only updated when a "FULL" Thrift object
of the table was requested. As a result, if a user ran a DESCRIBE
command on a table, and then tried to find the table on the Top-N page
of the web UI, the user would not find it.

This patch fixes the issue by updating the table metrics as soon as
an HDFS table is loaded. The metrics are therefore updated and
displayed on the web UI regardless of which Thrift object type of
the table is requested.
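
The behavior change described above can be illustrated with a minimal, self-contained sketch in Python. This is not Impala's actual Java code: the names `ToyCatalog`, `ToyHdfsTable`, `to_thrift`, `top_n`, the `update_on_load` flag, the "NON_FULL" label, and the size formula are all hypothetical and exist only to model the described behavior, assuming the old code refreshed metrics only on the "FULL" Thrift path and the patched code refreshes them at load time.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class ToyCatalog:
    """Hypothetical stand-in for the catalog's per-table size metrics."""
    sizes: dict = field(default_factory=dict)

    def update_metrics(self, table_name, est_bytes):
        self.sizes[table_name] = est_bytes

    def top_n(self, n):
        # Names of the n tables with the largest estimated metadata size.
        return heapq.nlargest(n, self.sizes, key=self.sizes.get)


class ToyHdfsTable:
    BYTES_PER_FILE = 100  # made-up constant, only for illustration

    def __init__(self, name, num_files, catalog, update_on_load):
        self.name = name
        self.num_files = num_files
        self.catalog = catalog
        self.update_on_load = update_on_load  # True models the patched behavior

    def _publish_metrics(self):
        self.catalog.update_metrics(self.name, self.num_files * self.BYTES_PER_FILE)

    def load(self):
        # Patched behavior: metrics are published as soon as loading finishes.
        if self.update_on_load:
            self._publish_metrics()

    def to_thrift(self, obj_type):
        # Old behavior: metrics were published only on the "FULL" path.
        if obj_type == "FULL" and not self.update_on_load:
            self._publish_metrics()
        return {"name": self.name, "type": obj_type}
```

In this model, a table built with `update_on_load=False` stays out of `top_n()` after `load()` followed by a non-FULL `to_thrift()` call, which mirrors the DESCRIBE scenario from the report. With `update_on_load=True` the table appears right after `load()`.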

Testing:
- Added two custom cluster tests in test_web_pages.py to make sure that
  table stats can be viewed on the web UI after DESCRIBE, for both
  legacy and local catalog modes.

Change-Id: I6e2eb503b0f61b1e6403058bc5dc78d721e7e940
Reviewed-on: http://gerrit.cloudera.org:8080/22014
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-13154
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13154
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Xuebin Su
>            Priority: Major
>              Labels: catalog-2024
>
> In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables 
> with Highest Memory Requirements". However, not all tables are counted there. 
> E.g. after starting catalogd, run a DESCRIBE on a table to trigger metadata 
> loading on it. When it's done, the table is not shown in the WebUI.
> The cause is that the list is only updated in HdfsTable.getTHdfsTable() when 
> 'type' is ThriftObjectType.FULL:
> [https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
> This used to be the place that all code paths using the table would go through. 
> However, we've done a bunch of optimizations to avoid getting the FULL thrift 
> object of the table, especially in LocalCatalog mode. We should move the code 
> of updating the list of largest tables somewhere that all table usages can 
> reach, e.g. after loading the metadata of the table, we can update its 
> estimatedMetadataSize.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

