[ 
https://issues.apache.org/jira/browse/HBASE-27966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bryan Beaudreault resolved HBASE-27966.
---------------------------------------
    Resolution: Fixed

Pushed to branch-3. The test looks good there. I quickly checked the other 
branches, and it looks like they were properly backported.

> HBase Master/RS JVM metrics populated incorrectly
> -------------------------------------------------
>
>                 Key: HBASE-27966
>                 URL: https://issues.apache.org/jira/browse/HBASE-27966
>             Project: HBase
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 2.0.0-alpha-4
>            Reporter: Nihal Jain
>            Assignee: Nihal Jain
>            Priority: Major
>             Fix For: 2.6.0, 3.0.0-beta-1, 2.5.6
>
>         Attachments: test_patch.txt
>
>
> HBase Master/RS JVM metrics are populated incorrectly due to a regression, 
> which prevents the Ambari Metrics System from capturing them.
> Based on my analysis, the issue affects all releases after 2.0.0-alpha-4 
> and seems to be caused by HBASE-18846.
> I compared the JVM metrics across three versions of HBase; the results are 
> below:
> HBase: 1.1.2
> {code:java}
> {
>     "name" : "Hadoop:service=HBase,name=JvmMetrics",
>     "modelerType" : "JvmMetrics",
>     "tag.Context" : "jvm",
>     "tag.ProcessName" : "RegionServer",
>     "tag.SessionId" : "",
>     "tag.Hostname" : "HOSTNAME",
>     "MemNonHeapUsedM" : 196.05664,
>     "MemNonHeapCommittedM" : 347.60547,
>     "MemNonHeapMaxM" : 4336.0,
>     "MemHeapUsedM" : 7207.315,
>     "MemHeapCommittedM" : 66080.0,
>     "MemHeapMaxM" : 66080.0,
>     "MemMaxM" : 66080.0,
>     "GcCount" : 3953,
>     "GcTimeMillis" : 662520,
>     "ThreadsNew" : 0,
>     "ThreadsRunnable" : 214,
>     "ThreadsBlocked" : 0,
>     "ThreadsWaiting" : 626,
>     "ThreadsTimedWaiting" : 78,
>     "ThreadsTerminated" : 0,
>     "LogFatal" : 0,
>     "LogError" : 0,
>     "LogWarn" : 0,
>     "LogInfo" : 0
>   },
> {code}
> HBase: 2.0.2
> {code:java}
> {
>     "name" : "Hadoop:service=HBase,name=JvmMetrics",
>     "modelerType" : "JvmMetrics",
>     "tag.Context" : "jvm",
>     "tag.ProcessName" : "IO",
>     "tag.SessionId" : "",
>     "tag.Hostname" : "HOSTNAME",
>     "MemNonHeapUsedM" : 203.86688,
>     "MemNonHeapCommittedM" : 740.6953,
>     "MemNonHeapMaxM" : -1.0,
>     "MemHeapUsedM" : 14879.477,
>     "MemHeapCommittedM" : 31744.0,
>     "MemHeapMaxM" : 31744.0,
>     "MemMaxM" : 31744.0,
>     "GcCount" : 75922,
>     "GcTimeMillis" : 5134691,
>     "ThreadsNew" : 0,
>     "ThreadsRunnable" : 90,
>     "ThreadsBlocked" : 3,
>     "ThreadsWaiting" : 158,
>     "ThreadsTimedWaiting" : 36,
>     "ThreadsTerminated" : 0,
>     "LogFatal" : 0,
>     "LogError" : 0,
>     "LogWarn" : 0,
>     "LogInfo" : 0
>   },
> {code}
> HBase: 2.5.2
> {code:java}
> {
>       "name": "Hadoop:service=HBase,name=JvmMetrics",
>       "modelerType": "JvmMetrics",
>       "tag.Context": "jvm",
>       "tag.ProcessName": "IO",
>       "tag.SessionId": "",
>       "tag.Hostname": "HOSTNAME",
>       "MemNonHeapUsedM": 192.9798,
>       "MemNonHeapCommittedM": 198.4375,
>       "MemNonHeapMaxM": -1.0,
>       "MemHeapUsedM": 773.23584,
>       "MemHeapCommittedM": 1004.0,
>       "MemHeapMaxM": 1024.0,
>       "MemMaxM": 1024.0,
>       "GcCount": 2048,
>       "GcTimeMillis": 25440,
>       "ThreadsNew": 0,
>       "ThreadsRunnable": 22,
>       "ThreadsBlocked": 0,
>       "ThreadsWaiting": 121,
>       "ThreadsTimedWaiting": 49,
>       "ThreadsTerminated": 0,
>       "LogFatal": 0,
>       "LogError": 0,
>       "LogWarn": 0,
>       "LogInfo": 0
>  },
> {code}
> It can be observed that from 2.0.x onwards the field "tag.ProcessName" is 
> populated as "IO" instead of the expected "RegionServer" or "Master".
> Ambari relies on this process name field to create metrics such as 
> 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. See 
> [code.|https://github.com/apache/ambari/blob/2ec4b055d99ec84c902da16dd57df91d571b48d6/ambari-server/src/main/java/org/apache/ambari/server/controller/metrics/timeline/AMSPropertyProvider.java#L722]
> From 2.0.x onwards the field is populated as 'IO', so a metric named 
> 'jvm.JvmMetrics.GcTimeMillis' is created instead of the expected 
> 'jvm.RegionServer.JvmMetrics.GcTimeMillis'. This mixes the metric up with 
> metrics from other processes (RS, Master, Spark executors, etc.) running on 
> the same host.
> *Expected*
> Field "tag.ProcessName" should be populated as "RegionServer" or "Master" 
> instead of "IO".
> *Actual*
> Field "tag.ProcessName" is populated as "IO" instead of the expected 
> "RegionServer" or "Master", causing Ambari to publish incorrectly named 
> metrics, which mixes up all metrics and raises various alerts around JVM 
> metrics.
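
To make the naming collision described above concrete, here is a minimal Java sketch. It is not Ambari's or HBase's actual code: the class JvmMetricNameSketch and its methods are hypothetical, and the rule that an unrecognized process name is left out of the metric name is only an assumption chosen to reproduce the two metric names quoted in the issue.

```java
import java.util.Set;

// Hypothetical illustration only; not taken from Ambari or HBase sources.
final class JvmMetricNameSketch {

    // Values the issue lists as the expected content of "tag.ProcessName".
    static final Set<String> EXPECTED_PROCESS_NAMES = Set.of("RegionServer", "Master");

    // True when the tag carries one of the expected HBase process names.
    static boolean hasExpectedProcessName(String processName) {
        return EXPECTED_PROCESS_NAMES.contains(processName);
    }

    // Assumption: a consumer qualifies the metric with the process name only
    // when it recognizes that name, and otherwise falls back to an
    // unqualified name shared by every process on the host.
    static String metricName(String context, String processName,
                             String recordName, String metric) {
        if (hasExpectedProcessName(processName)) {
            return context + "." + processName + "." + recordName + "." + metric;
        }
        return context + "." + recordName + "." + metric;
    }
}
```

Under these assumptions, metricName("jvm", "RegionServer", "JvmMetrics", "GcTimeMillis") yields jvm.RegionServer.JvmMetrics.GcTimeMillis, while passing "IO" as the process name yields jvm.JvmMetrics.GcTimeMillis, the unqualified name that several processes on the same host would then share.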



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
