[
https://issues.apache.org/jira/browse/HADOOP-13362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15370899#comment-15370899
]
Junping Du commented on HADOOP-13362:
-------------------------------------
bq. Note that we've not seen the uncaught exception issue described in
YARN-5190 on 2.7, probably because 2.7 doesn't have YARN-4906.
Agree. After YARN-4906, the tricky part is that we call
ContainerMetrics.forContainer() (and unregister) twice: once in
ContainerImpl and once in ContainerMonitorImpl. The fix in
YARN-1643 has a problem in this case because it registers the metrics again
before finish is called on it.
However, I am still not sure whether backporting only part of YARN-5190 is
enough, as I don't see where we call ContainerMetrics.finish() in 2.7.3. Am I
missing anything here?
> DefaultMetricsSystem leaks the source name when a source unregisters
> --------------------------------------------------------------------
>
> Key: HADOOP-13362
> URL: https://issues.apache.org/jira/browse/HADOOP-13362
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 2.7.2
> Reporter: Jason Lowe
> Priority: Critical
>
> Ran across a nodemanager that was spending most of its time in GC. Upon
> examination of the heap, most of the memory was going to the map of names in
> org.apache.hadoop.metrics2.lib.UniqueNames. In this case the map had almost
> 2 million entries. Looking at a few of the entries showed names like
> "ContainerResource_container_e01_1459548490386_8560138_01_002020",
> "ContainerResource_container_e01_1459548490386_2378745_01_000410", etc.
> Looks like the ContainerMetrics for each container will cause a unique name
> to be registered with UniqueNames and the name will never be unregistered.
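
To make the leak pattern described above concrete, here is a minimal, self-contained Java sketch. It is not the Hadoop code: NameRegistrySketch, unregisterLeaky, unregisterAndForget and simulate are hypothetical names invented for illustration, and the container names are shortened placeholders. It only shows how a name map grows by one entry per container when unregistering a source does not also drop its name.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a unique-name registry; NOT the Hadoop class. */
public class NameRegistrySketch {
    // One entry per distinct source name ever registered.
    private final Map<String, Integer> names = new HashMap<>();

    /** Records the name when a source registers. */
    public void register(String name) {
        names.merge(name, 1, Integer::sum);
    }

    /** Leaky variant (assumed behavior): the source goes away, its name stays. */
    public void unregisterLeaky(String name) {
        // Intentionally leaves `names` untouched.
    }

    /** Non-leaky variant: forgets the name along with the source. */
    public void unregisterAndForget(String name) {
        names.remove(name);
    }

    public int size() {
        return names.size();
    }

    /** Registers and unregisters one uniquely named source per container. */
    public static int simulate(int containers, boolean forget) {
        NameRegistrySketch registry = new NameRegistrySketch();
        for (int i = 0; i < containers; i++) {
            String name = "ContainerResource_container_" + i; // placeholder name
            registry.register(name);
            if (forget) {
                registry.unregisterAndForget(name);
            } else {
                registry.unregisterLeaky(name);
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + simulate(1000, false)); // prints "leaky: 1000"
        System.out.println("fixed: " + simulate(1000, true));  // prints "fixed: 0"
    }
}
```

With the leaky variant the map size tracks the total number of containers ever run, which matches the shape of the problem reported here (a long-lived nodemanager accumulating millions of names).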