[ 
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17914898#comment-17914898
 ] 

ASF subversion and git services commented on IMPALA-13154:
----------------------------------------------------------

Commit 5371e0c6df3e329398712af3ebb739465b947454 in impala's branch 
refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5371e0c6d ]

IMPALA-13666: Provide a non-null fileMetadataStats for HdfsPartition

IMPALA-13154 added the method getFileMetadataStats() to
HdfsPartition.java that would return the file metadata statistics. The
method requires the corresponding HdfsPartition instance to have a
non-null field of 'fileMetadataStats_'.

This patch revises two existing constructors of HdfsPartition to provide
a non-null value for 'fileMetadataStats'. This makes it easier for a
third party extension to set up and update the field of
'fileMetadataStats_'. A third party extension has to update the field of
'fileMetadataStats_' if it would like to use this field to get the size
of the partition since all three fields in 'fileMetadataStats_' are
defaulted to 0.

A new constructor was also added for HdfsPartition that allows a third
party extension to provide their own FileMetadataStats when
instantiating an HdfsPartition. To facilitate instantiating a
FileMetadataStats, a new constructor was added for FileMetadataStats
that takes in a List of FileDescriptor's to construct a
FileMetadataStats.

Change-Id: I7e690729fcaebb1e380cc61f2b746783c86dcbf7
Reviewed-on: http://gerrit.cloudera.org:8080/22340
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-13154
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13154
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog
>            Reporter: Quanlong Huang
>            Assignee: Xuebin Su
>            Priority: Major
>              Labels: catalog-2024
>             Fix For: Impala 4.5.0
>
>
> In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables 
> with Highest Memory Requirements". However, not all tables are counted there. 
> E.g. after starting catalogd, run a DESCRIBE on a table to trigger metadata 
> loading on it. When it's done, the table is not shown in the WebUI.
> The cause is that the list is only updated in HdfsTable.getTHdfsTable() when 
> 'type' isĀ 
> ThriftObjectType.FULL:
> [https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
> This used to be the place that all code paths using the table will go to. 
> However, we've done bunch of optimizations to void getting the FULL thrift 
> object of the table, especially in LocalCatalog mode. We should move the code 
> of updating the list of largest tables somewhere that all table usages can 
> reach, e.g. after loading the metadata of the table, we can update its 
> estimatedMetadataSize.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to