[ 
https://issues.apache.org/jira/browse/YARN-6212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116452#comment-16116452
 ] 

Yang Wang edited comment on YARN-6212 at 8/8/17 2:25 AM:
---------------------------------------------------------

Hi, [~miklos.szeg...@cloudera.com]
I'm afraid this JIRA is not a duplicate of YARN-3933.
The primary cause of the negative values is that the metrics are not recovered 
properly when the NM restarts.
*AllocatedContainers, ContainersLaunched, AllocatedGB, AvailableGB, AllocatedVCores, AvailableVCores*
 in the metrics need to be recovered when the NM restarts.
This should be done in ContainerManagerImpl#recoverContainer.

The scenario can be reproduced with the following steps:
# Make sure YarnConfiguration.NM_RECOVERY_ENABLED=true and YarnConfiguration.NM_RECOVERY_SUPERVISED=true are set on the NM
# Submit an application and keep it running
# Restart the NM
# Stop the application
# The metrics now show negative values
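
The mechanism behind these steps can be illustrated with a minimal, self-contained Java sketch. It is not Hadoop code: the class NmMetricsRecoverySketch, its Metrics and Container types, the simulate method, and the node and container sizes are all hypothetical stand-ins for the real NodeManager metrics and for ContainerManagerImpl#recoverContainer. It only assumes that the allocation gauges are incremented when a container is allocated and decremented when it is released.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the problem; none of these types exist in Hadoop.
final class NmMetricsRecoverySketch {

    /** Resources held by one container (illustrative values only). */
    static final class Container {
        final int memGB;
        final int vcores;

        Container(int memGB, int vcores) {
            this.memGB = memGB;
            this.vcores = vcores;
        }
    }

    /** Simplified stand-in for the NM allocation gauges. */
    static final class Metrics {
        int allocatedContainers;
        int allocatedGB;
        int allocatedVCores;
        int availableGB;
        int availableVCores;

        Metrics(int totalGB, int totalVCores) {
            this.availableGB = totalGB;
            this.availableVCores = totalVCores;
        }

        void allocate(Container c) {
            allocatedContainers++;
            allocatedGB += c.memGB;
            allocatedVCores += c.vcores;
            availableGB -= c.memGB;
            availableVCores -= c.vcores;
        }

        void release(Container c) {
            allocatedContainers--;
            allocatedGB -= c.memGB;
            allocatedVCores -= c.vcores;
            availableGB += c.memGB;
            availableVCores += c.vcores;
        }
    }

    /** Replays the reproduction steps; returns the gauges after the application stops. */
    static Metrics simulate(boolean recoverMetrics) {
        // Steps 1-2: recovery is enabled and an application is running two containers.
        List<Container> running = Arrays.asList(new Container(4, 1), new Container(2, 1));
        Metrics beforeRestart = new Metrics(16, 8);
        for (Container c : running) {
            beforeRestart.allocate(c);
        }

        // Step 3: the NM restarts. The containers survive, but the gauges start from scratch.
        Metrics afterRestart = new Metrics(16, 8);
        if (recoverMetrics) {
            // Assumed fix: re-apply each recovered container to the fresh gauges.
            for (Container c : running) {
                afterRestart.allocate(c);
            }
        }

        // Step 4: the application stops and every container is released.
        for (Container c : running) {
            afterRestart.release(c);
        }
        return afterRestart;
    }

    public static void main(String[] args) {
        for (boolean recover : new boolean[] {false, true}) {
            Metrics m = simulate(recover);
            System.out.println("recoverMetrics=" + recover
                + " AllocatedContainers=" + m.allocatedContainers
                + " AllocatedGB=" + m.allocatedGB
                + " AllocatedVCores=" + m.allocatedVCores
                + " AvailableGB=" + m.availableGB
                + " AvailableVCores=" + m.availableVCores);
        }
    }
}
```

In this model, releasing containers that were never counted after the restart drives the Allocated* gauges below zero and pushes the Available* gauges above the node total. Re-applying the recovered containers to the fresh gauges brings everything back to the idle values once the application stops.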


was (Author: fly_in_gis):
Hi, Miklos Szegedi
I'm afraid this JIRA is not a duplicate of YARN-3933.
The primary cause of the negative values is that the metrics are not recovered 
properly when the NM restarts.
*AllocatedContainers, ContainersLaunched, AllocatedGB, AvailableGB, AllocatedVCores, AvailableVCores*
 in the metrics need to be recovered when the NM restarts.
This should be done in ContainerManagerImpl#recoverContainer.

The scenario can be reproduced with the following steps:
# Make sure YarnConfiguration.NM_RECOVERY_ENABLED=true and YarnConfiguration.NM_RECOVERY_SUPERVISED=true are set on the NM
# Submit an application and keep it running
# Restart the NM
# Stop the application
# The metrics now show negative values

> NodeManager metrics returning wrong negative values
> ---------------------------------------------------
>
>                 Key: YARN-6212
>                 URL: https://issues.apache.org/jira/browse/YARN-6212
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 2.7.3
>            Reporter: Abhishek Shivanna
>
> It looks like the NodeManager returns negative values 
> for metrics that should never be negative. Here is the output from the NM endpoint:
> {noformat}
> /jmx?qry=Hadoop:service=NodeManager,name=NodeManagerMetrics
> {noformat}
> {noformat}
> {
>   "beans" : [ {
>     "name" : "Hadoop:service=NodeManager,name=NodeManagerMetrics",
>     "modelerType" : "NodeManagerMetrics",
>     "tag.Context" : "yarn",
>     "tag.Hostname" : "<HOST>",
>     "ContainersLaunched" : 707,
>     "ContainersCompleted" : 9,
>     "ContainersFailed" : 124,
>     "ContainersKilled" : 579,
>     "ContainersIniting" : 0,
>     "ContainersRunning" : 19,
>     "AllocatedGB" : -26,
>     "AllocatedContainers" : -5,
>     "AvailableGB" : 252,
>     "AllocatedVCores" : -5,
>     "AvailableVCores" : 101,
>     "ContainerLaunchDurationNumOps" : 718,
>     "ContainerLaunchDurationAvgTime" : 18.0
>   } ]
> }
> {noformat}
> Is there any circumstance under which the values of AllocatedGB, 
> AllocatedContainers and AllocatedVCores can go below 0?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
