[
https://issues.apache.org/jira/browse/IMPALA-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Quanlong Huang updated IMPALA-14502:
------------------------------------
Description:
In a catalogd heap dump where all of the tables are unloaded, we found that
IncompleteTable consumes more memory than just the strings for the db/table
name and the table type/comment.
!Histogram.png|width=676,height=436!
As shown in the histogram, there are 2.6M instances of IncompleteTable
consuming around 18GB of the heap space. Each instance takes around 7KB of
memory.
Looking into the dominator tree (grouped by class) of IncompleteTable
instances, the majority of the space is consumed by Metrics, which are never
used for IncompleteTable (see
[code|https://github.com/apache/impala/blob/ebbc67cf40bd856253d07c649028888d85c772cc/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L4242-L4244]).
!Dominator.png|width=732,height=618!
We should skip initializing these metrics for IncompleteTable.
{code:java}
public void initMetrics() {
  metrics_.addTimer(REFRESH_DURATION_METRIC);
  metrics_.addTimer(ALTER_DURATION_METRIC);
  metrics_.addTimer(LOAD_DURATION_METRIC);
  metrics_.addTimer(LOAD_DURATION_STORAGE_METADATA);
  metrics_.addTimer(HMS_LOAD_TBL_SCHEMA);
  metrics_.addTimer(LOAD_DURATION_ALL_COLUMN_STATS);
  metrics_.addCounter(NUMBER_OF_INFLIGHT_EVENTS);
  metrics_.addTimer(TBL_EVENTS_PROCESS_DURATION);
  metrics_.addGauge(LAST_SYNC_EVENT_ID,
      (Gauge<Long>) () -> Long.valueOf(lastSyncedEventId_));
}{code}
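One possible shape of the fix, as a minimal sketch only and not the actual patch: the placeholder subclass overrides initMetrics() with a no-op so that no timers, counters or gauges are registered for it. The Metrics and Table classes below are simplified, hypothetical stand-ins for the Impala classes, the metric names are made up, and the sketch assumes the base class calls initMetrics() from its constructor.

```java
import java.util.HashMap;
import java.util.Map;

public class IncompleteTableMetricsSketch {
  // Hypothetical stand-in for the Metrics holder; it only records registrations.
  static class Metrics {
    private final Map<String, Object> registry_ = new HashMap<>();
    void addTimer(String name) { registry_.put(name, new Object()); }
    void addCounter(String name) { registry_.put(name, new Object()); }
    int size() { return registry_.size(); }
  }

  // Hypothetical simplified base class: registers per-table metrics on construction.
  static class Table {
    protected final Metrics metrics_ = new Metrics();

    Table() { initMetrics(); }

    protected void initMetrics() {
      // Placeholder metric names, not the real Impala constants.
      metrics_.addTimer("refresh-duration");
      metrics_.addTimer("load-duration");
      metrics_.addCounter("inflight-events");
    }

    int metricCount() { return metrics_.size(); }
  }

  // Placeholder for an unloaded table: it is never refreshed or loaded in place,
  // so it registers no metrics at all.
  static class IncompleteTable extends Table {
    @Override
    protected void initMetrics() {
      // Intentionally empty.
    }
  }

  public static void main(String[] args) {
    System.out.println("Table metrics: " + new Table().metricCount());
    System.out.println("IncompleteTable metrics: " + new IncompleteTable().metricCount());
  }
}
```

In this sketch the empty Metrics holder is still allocated per instance. Whether the real fix should also avoid allocating the holder itself is a separate design choice that the heap dump numbers would have to justify.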
was:
In a catalogd heap dump where all of the tables are unloaded, we found
IncompleteTable consumes more memory space than just the strings of db/table
name and table type/comment.
As shown in the histogram, there are 2.6M instances of IncompleteTable
consuming around 18GB of the heap space. Each instance takes around 7KB of
memory.
Looking into the dominator tree (group by classes) of IncompleteTable
instances, the majority of the space is consumed by Metrics which will never be
used for IncompleteTable (see
[code|https://github.com/apache/impala/blob/ebbc67cf40bd856253d07c649028888d85c772cc/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java#L4242-L4244]).
We should ignore initializing these metrics for IncompleteTable.
{code}
public void initMetrics() {
  metrics_.addTimer(REFRESH_DURATION_METRIC);
  metrics_.addTimer(ALTER_DURATION_METRIC);
  metrics_.addTimer(LOAD_DURATION_METRIC);
  metrics_.addTimer(LOAD_DURATION_STORAGE_METADATA);
  metrics_.addTimer(HMS_LOAD_TBL_SCHEMA);
  metrics_.addTimer(LOAD_DURATION_ALL_COLUMN_STATS);
  metrics_.addCounter(NUMBER_OF_INFLIGHT_EVENTS);
  metrics_.addTimer(TBL_EVENTS_PROCESS_DURATION);
  metrics_.addGauge(LAST_SYNC_EVENT_ID,
      (Gauge<Long>) () -> Long.valueOf(lastSyncedEventId_));
}{code}
> Redundant Metrics in IncompleteTable consuming extra space
> ----------------------------------------------------------
>
> Key: IMPALA-14502
> URL: https://issues.apache.org/jira/browse/IMPALA-14502
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Critical
> Attachments: Dominator.png, Histogram.png