[
https://issues.apache.org/jira/browse/IMPALA-13154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17914898#comment-17914898
]
ASF subversion and git services commented on IMPALA-13154:
----------------------------------------------------------
Commit 5371e0c6df3e329398712af3ebb739465b947454 in impala's branch
refs/heads/master from Fang-Yu Rao
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=5371e0c6d ]
IMPALA-13666: Provide a non-null fileMetadataStats for HdfsPartition
IMPALA-13154 added the method getFileMetadataStats() to
HdfsPartition.java that would return the file metadata statistics. The
method requires the corresponding HdfsPartition instance to have a
non-null field of 'fileMetadataStats_'.
This patch revises two existing constructors of HdfsPartition to provide
a non-null value for 'fileMetadataStats_'. This makes it easier for a
third party extension to set up and update the field of
'fileMetadataStats_'. A third party extension has to update the field of
'fileMetadataStats_' if it would like to use this field to get the size
of the partition since all three fields in 'fileMetadataStats_' are
defaulted to 0.
A new constructor was also added for HdfsPartition that allows a third
party extension to provide their own FileMetadataStats when
instantiating an HdfsPartition. To facilitate instantiating a
FileMetadataStats, a new constructor was added for FileMetadataStats
that takes in a List of FileDescriptors to construct a
FileMetadataStats.
Change-Id: I7e690729fcaebb1e380cc61f2b746783c86dcbf7
Reviewed-on: http://gerrit.cloudera.org:8080/22340
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
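
The constructor changes described in the commit message above can be pictured with a minimal, self-contained sketch. It does not reproduce Impala's actual classes: SketchFileDescriptor, SketchFileMetadataStats, and SketchPartition are hypothetical stand-ins, and the counter names (numFiles, numBlocks, totalFileBytes) are assumptions chosen to match the "three fields defaulted to 0" mentioned in the message.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for a file descriptor; only the values the sketch needs.
final class SketchFileDescriptor {
  private final long fileLength;
  private final int numBlocks;

  SketchFileDescriptor(long fileLength, int numBlocks) {
    this.fileLength = fileLength;
    this.numBlocks = numBlocks;
  }

  long getFileLength() { return fileLength; }
  int getNumBlocks() { return numBlocks; }
}

// Hypothetical stand-in for FileMetadataStats: three counters that default to 0.
final class SketchFileMetadataStats {
  long numFiles;        // assumed field name
  long numBlocks;       // assumed field name
  long totalFileBytes;  // assumed field name

  // Default constructor: all counters stay at 0.
  SketchFileMetadataStats() {}

  // Builds the stats by accumulating over a list of file descriptors.
  SketchFileMetadataStats(List<SketchFileDescriptor> fds) {
    for (SketchFileDescriptor fd : fds) {
      numFiles++;
      numBlocks += fd.getNumBlocks();
      totalFileBytes += fd.getFileLength();
    }
  }
}

// Hypothetical stand-in for a partition that always holds non-null stats.
final class SketchPartition {
  private final SketchFileMetadataStats fileMetadataStats_;

  // Existing-style constructor: supplies zeroed stats instead of null.
  SketchPartition() {
    this(new SketchFileMetadataStats());
  }

  // New-style constructor: the caller provides its own stats.
  SketchPartition(SketchFileMetadataStats stats) {
    this.fileMetadataStats_ = Objects.requireNonNull(stats, "stats");
  }

  SketchFileMetadataStats getFileMetadataStats() { return fileMetadataStats_; }
}
```

In this sketch, a partition built with the no-argument constructor reports a size of 0 until the caller updates the counters, which mirrors the note in the commit message that an extension has to update the field itself before relying on it for the partition size.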
> Some tables are missing in Top-N Tables with Highest Memory Requirements
> ------------------------------------------------------------------------
>
> Key: IMPALA-13154
> URL: https://issues.apache.org/jira/browse/IMPALA-13154
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Xuebin Su
> Priority: Major
> Labels: catalog-2024
> Fix For: Impala 4.5.0
>
>
> In the /catalog page of catalogd WebUI, there is a table for "Top-N Tables
> with Highest Memory Requirements". However, not all tables are counted there.
> E.g. after starting catalogd, run a DESCRIBE on a table to trigger metadata
> loading on it. When it's done, the table is not shown in the WebUI.
> The cause is that the list is only updated in HdfsTable.getTHdfsTable() when
> 'type' is ThriftObjectType.FULL:
> [https://github.com/apache/impala/blob/ee21427d26620b40d38c706b4944d2831f84f6f5/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2457-L2459]
> This used to be the place that all code paths using the table would go through.
> However, we've done a bunch of optimizations to avoid getting the FULL thrift
> object of the table, especially in LocalCatalog mode. We should move the code
> of updating the list of largest tables somewhere that all table usages can
> reach, e.g. after loading the metadata of the table, we can update its
> estimatedMetadataSize.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)