Jonathan Hurley created AMBARI-23478:
----------------------------------------

             Summary: YARN Cluster CPU Usage Graph Always Shows High CPU Usage
                 Key: AMBARI-23478
                 URL: https://issues.apache.org/jira/browse/AMBARI-23478
             Project: Ambari
          Issue Type: Bug
    Affects Versions: 2.5.0
            Reporter: Jonathan Hurley
             Fix For: 2.7.0


h3. ISSUE
In Ambari, YARN's Cluster CPU widget always shows relatively high CPU usage 
when the cluster has more than one NodeManager.
 !image-2018-03-19-20-26-44-325.png|thumbnail! 
 !image-2018-03-19-20-27-19-160.png|thumbnail! 
(A second node was started at around 19:00.)

h3. REPRODUCE STEPS
# Install a cluster with one NodeManager and AMS.
# Confirm that the "Cluster CPU" widget looks OK.
# Add one more node with a NodeManager and wait for a while.

h3. INVESTIGATION
The AMS side looks OK:
{code}
curl -s -k http://sandbox-hdp.hortonworks.com:6188/ws/v1/timeline/metrics -G \
  --data-urlencode metricNames=cpu_idle._sum --data-urlencode appId=NODEMANAGER \
  --data-urlencode startTime=1521454794 --data-urlencode endTime=1521455394 \
  --data-urlencode precision=MINUTES
...
{
    "metrics": [
        {
            "appid": "nodemanager",
            "metadata": {},
            "metricname": "cpu_idle._sum",
            "metrics": {
                "1521454800000": 198.99000000000001,
                "1521455100000": 192.56999999999999
            },
            "starttime": 1521454800000,
            "timestamp": 1521454800000
        }
    ]
}
{code}
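
As a sanity check on the AMS response, the summed idle value can be divided by the number of reporting NodeManagers. The Python sketch below assumes two NodeManagers (per the reproduce steps); `NODE_COUNT` and `per_node_idle` are illustrative names, not part of AMS or Ambari.

```python
import json

# Trimmed copy of the AMS response shown above.
ams_response = json.loads("""
{"metrics": [{"appid": "nodemanager", "metricname": "cpu_idle._sum",
  "metrics": {"1521454800000": 198.99000000000001,
              "1521455100000": 192.56999999999999}}]}
""")

NODE_COUNT = 2  # assumption: two NodeManagers, as in the reproduce steps


def per_node_idle(response, node_count):
    """Map epoch seconds -> average idle percentage per node."""
    points = response["metrics"][0]["metrics"]
    # AMS keys are epoch milliseconds; convert to seconds for readability.
    return {int(ts) // 1000: value / node_count for ts, value in points.items()}


idle = per_node_idle(ams_response, NODE_COUNT)
for ts, pct in sorted(idle.items()):
    print(ts, round(pct, 2))
```

Both per-node values fall between 0 and 100, which is consistent with the AMS aggregate being a plain sum of per-host idle percentages.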

But via Ambari, cpu_idle._sum becomes *{color:#d04437}100 times{color}* smaller:
{code}
curl -s -k -u admin:admin \
  http://sandbox-hdp.hortonworks.com:8080/api/v1/clusters/Sandbox/services/YARN/components/NODEMANAGER \
  -G --data-urlencode \
  'fields=metrics/cpu/cpu_idle._sum[1521454950,1521455550,15]'
...(snip)...
  "metrics" : {
    "cpu" : {
      "cpu_idle._sum" : [
        [
          1.8686666666666667,
          1521454950
        ],
        [
          1.9843333333333333,
          1521454980
        ],
        [
          1.9,
          1521455010
        ],
        [
          1.9846666666666664,
          1521455040
        ],
        [
          1.8926666666666665,
          1521455070
        ],
...(snip)...
{code}
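
The factor of roughly 100 can be checked directly from the two outputs above. This is a rough Python comparison only: the AMS and Ambari sample windows overlap but their timestamps do not line up exactly, so the sketch compares averages rather than matching points. The variable names are illustrative.

```python
from statistics import mean

# cpu_idle._sum values copied from the AMS response.
ams_sum = [198.99000000000001, 192.56999999999999]

# cpu_idle._sum values copied from the (snipped) Ambari response.
ambari_sum = [
    1.8686666666666667,
    1.9843333333333333,
    1.9,
    1.9846666666666664,
    1.8926666666666665,
]

# Ratio of the average AMS value to the average Ambari value.
ratio = mean(ams_sum) / mean(ambari_sum)
print(f"AMS / Ambari ratio: {ratio:.1f}")
```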

Somehow 'cpu_idle._sum' is always wrong for this widget:

{code}
curl -s -k -u admin:admin \
  http://sandbox-hdp.hortonworks.com:8080/api/v1/clusters/Sandbox/services/YARN/components/NODEMANAGER \
  -G --data-urlencode \
  'fields=metrics/cpu/cpu_nice._sum[1521196167,1521199767,15],metrics/cpu/cpu_idle._avg[1521196167,1521199767,15],metrics/cpu/cpu_wio._sum[1521196167,1521199767,15],metrics/cpu/cpu_idle._sum[1521196167,1521199767,15],metrics/cpu/cpu_user._sum[1521196167,1521199767,15],metrics/cpu/cpu_system._sum[1521196167,1521199767,15]' \
  -o ./ambari_NODEMANAGER_metrics.json

[root@sandbox-hdp ~]# grep -E -B1 '"cpu_|1521450000' ambari_NODEMANAGER_metrics.json | grep -vE -- '(--|\],)'
    "cpu" : {
      "cpu_idle._avg" : [
          85.54999999999998,
          1521199500
      "cpu_idle._sum" : [
          1.7109999999999996,     <<< needs to be multiplied by 100
          1521199500
      "cpu_nice._sum" : [
          0.0,
          1521199500
      "cpu_system._sum" : [
          21.900000000000002,
          1521199500
      "cpu_user._sum" : [
          6.666666666666666,
          1521199500
      "cpu_wio._sum" : [
          0.2,
          1521199500
{code}
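
The Ambari output above is internally inconsistent in a way that pins down the factor: with two NodeManagers, the sum should be about twice the average. The Python sketch below assumes a node count of two (from the reproduce steps) and uses illustrative variable names.

```python
import math

NODE_COUNT = 2  # assumption: two NodeManagers, as in the reproduce steps

# Values reported by Ambari at timestamp 1521199500.
idle_avg = 85.54999999999998
idle_sum_reported = 1.7109999999999996

# With NODE_COUNT hosts, the sum should equal average * NODE_COUNT.
expected_sum = idle_avg * NODE_COUNT

# How far off the reported sum is from the expected one.
scale = expected_sum / idle_sum_reported
print(f"expected sum: {expected_sum:.2f}, off by a factor of {scale:.2f}")
```

The reported `cpu_idle._sum` is off by a factor of 100 relative to `cpu_idle._avg`, which matches the annotation in the output above.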



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
