Repository: incubator-eagle
Updated Branches:
  refs/heads/master 327351b92 -> 2f4df34cf


[EAGLE-698] Collectd python plugin for gathering hadoop jmx information.

<!--
{% comment %}
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements.  See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to you under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License.  You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
{% endcomment %}
-->

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[EAGLE-<Jira issue #>] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
       Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
       number, if there is one.
 - [ ] If this contribution is large, please file an Apache
       [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt).

---
…rmation.

- hadooproleconfig.json is a configuration file describing the hadoop roles to be collected.
- Manual_of_collectd_hadoop_plugin.md is a how-to document.

- https://issues.apache.org/jira/browse/EAGLE-698

Author: joe-hj <joe.h...@gmail.com>

Closes #584 from joe-hj/EAGLE-698.


Project: http://git-wip-us.apache.org/repos/asf/incubator-eagle/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-eagle/commit/2f4df34c
Tree: http://git-wip-us.apache.org/repos/asf/incubator-eagle/tree/2f4df34c
Diff: http://git-wip-us.apache.org/repos/asf/incubator-eagle/diff/2f4df34c

Branch: refs/heads/master
Commit: 2f4df34cfbcce2ab0cd72f52c45a7873ec689722
Parents: 327351b
Author: joe-hj <joe.h...@gmail.com>
Authored: Wed Nov 9 17:01:22 2016 +0800
Committer: Hao Chen <h...@apache.org>
Committed: Wed Nov 9 17:01:22 2016 +0800

----------------------------------------------------------------------
 eagle-external/hadoop_jmx_collectd/README.md    |  63 +++++++
 eagle-external/hadoop_jmx_collectd/hadoop.py    | 167 +++++++++++++++++++
 .../hadoop_jmx_collectd/hadooproleconfig.json   |  57 +++++++
 3 files changed, 287 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-eagle/blob/2f4df34c/eagle-external/hadoop_jmx_collectd/README.md
----------------------------------------------------------------------
diff --git a/eagle-external/hadoop_jmx_collectd/README.md 
b/eagle-external/hadoop_jmx_collectd/README.md
new file mode 100755
index 0000000..beab2c9
--- /dev/null
+++ b/eagle-external/hadoop_jmx_collectd/README.md
@@ -0,0 +1,63 @@
+#**A manual for the Hadoop plugin of collectd**
+
+For more about collectd, see [collectd.org](https://collectd.org).
+
+###Description
+The plugin collects information from http://hostname:port/jmx according to the Hadoop role configuration; a minimal sketch of the collection logic follows the role list. It supports the roles below:
+
+       HDFS NameNode
+       HDFS DataNode
+       HDFS JournalNode
+       HBase Master
+       HBase RegionServer
+       Yarn NodeManager
+       Yarn ResourceManager
+
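+As a rough sketch (not part of the plugin itself; the host, port, and bean prefix here are placeholders), collecting one role amounts to:
+
+    # Python 2, as in hadoop.py; "namenode-host" is a placeholder
+    import json
+    import urllib2
+
+    beans = json.load(urllib2.urlopen("http://namenode-host:50070/jmx", timeout=5))["beans"]
+    for bean in beans:
+        # report the numeric attributes of beans whose name matches a configured prefix
+        if bean["name"].startswith("Hadoop:service=NameNode,name=FSNamesystemState"):
+            for k, v in bean.iteritems():
+                if isinstance(v, (int, float)):
+                    print("FSNamesystem.%s = %s" % (k, v))
+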
+###Install
+>1)
+Deploy hadoop.py in your collectd plugins path; in my environment that is /opt/collectd/lib/collectd/plugins/hadoop.py (assuming collectd is installed under /opt/collectd). You may need to create the plugins directory first.
+
+>2)
+Configure the python plugin in collectd.conf; the python plugin must be loaded (LoadPlugin python) before the hadoop module is imported.
+
+>A snippet of collectd.conf showing the hadoop python plugin configuration (the host keys match those recognized by hadoop.py):<br/>
+>
+    LoadPlugin python
+    <Plugin "python">
+        ModulePath "/opt/collectd/lib/collectd/plugins/"
+        LogTraces true
+        Import "hadoop"
+        <Module "hadoop">
+            HDFSDatanodeConfigHost "YourHostName"
+            Port "50075"
+            Verbose true
+            Instance "192.168.xxx.xxx"
+            JsonPath "/xxx/xxx/hadooproleconfig.json"
+        </Module>
+        <Module "hadoop">
+            YarnResourceManagerHost "YourHostName"
+            Port "8088"
+            Verbose true
+            Instance "192.168.xxx.xxx"
+        </Module>
+    </Plugin>
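+
+Inside the plugin, each <Module "hadoop"> block becomes one collection target. As a sketch (mirroring CfgCallback in hadoop.py; the values are the placeholders from the snippet above), the first block is parsed into:
+
+    # sketch of the internal map CfgCallback builds for the first Module block
+    config_map = {
+        'RoleInstance': "192.168.xxx.xxx",  # from Instance
+        'port': "50075",                    # from Port
+        'host': "YourHostName",             # from HDFSDatanodeConfigHost
+        'RoleType': "ROLE_TYPE_DATANODE",   # implied by the host key
+        'OutputFlag': True                  # from Verbose
+    }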
+
+>**Note**:
+>The Instance, Port, and host (role) fields must all be set.
+
+>3)There are two ways to reference hadooproleconfig.json; either works (see the sketch after this list).<br/>
+>>a)Set the hadooproleconfig.json path via JsonPath in a <Module "hadoop"> ... </Module> block in collectd.conf.<br/>
+>>b)Place the hadooproleconfig.json file in the BaseDir defined in collectd.conf.
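+
+As a sketch (mirroring GetJsonConfig in hadoop.py), the lookup order is:
+
+    # an explicit JsonPath wins; otherwise the file is read from the
+    # working directory, which is collectd's BaseDir at runtime
+    import json
+
+    def load_role_config(json_path=None):
+        path = json_path if json_path else "./hadooproleconfig.json"
+        with open(path, "r") as f:
+            return json.load(f)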
+
+>4)
+If you update collectd.conf or hadooproleconfig.json, restart collectd.
+
+###Dependency
+       collectd:       tested with version 5.6.0.
+       Hadoop:         tested with version 2.6.0-cdh5.4.3.
+       HBase:          tested with version 1.0.0-cdh5.4.3.
+
+###Testing
+>You can run hadoop.py standalone as an ordinary python script for debugging the jmx output, or load it as a collectd plugin; the MyDebug variable in hadoop.py switches between the two environments. A quick standalone check is sketched below.
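+
+As a sketch (host and port are placeholders), edit the ConfigMap in the __main__ block of hadoop.py and run python hadoop.py:
+
+    # standalone ConfigMap sketch (placeholder values)
+    ConfigMap = {
+        'RoleInstance': "test-instance",
+        'port': 50075,                      # DataNode HTTP port
+        'host': "your-datanode-host",
+        'RoleType': "ROLE_TYPE_DATANODE",
+        'OutputFlag': True
+    }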
+
+

http://git-wip-us.apache.org/repos/asf/incubator-eagle/blob/2f4df34c/eagle-external/hadoop_jmx_collectd/hadoop.py
----------------------------------------------------------------------
diff --git a/eagle-external/hadoop_jmx_collectd/hadoop.py 
b/eagle-external/hadoop_jmx_collectd/hadoop.py
new file mode 100755
index 0000000..8684e66
--- /dev/null
+++ b/eagle-external/hadoop_jmx_collectd/hadoop.py
@@ -0,0 +1,167 @@
+#! /usr/bin/python
+
+import urllib2
+import json
+import os
+
+CurRoleTypeInfo= {}
+HadoopConfigs= []
+LogSwitch= False
+JsonPath = None
+
+def CfgCallback(conf):
+    global JsonPath
+    CurRoleType = None
+    Role = None
+    Port = None
+    Host = None
+    OutputFlag = LogSwitch
+
+    for EachRoleConfig in conf.children:
+        collectd.info('hadoop plugin : %s' % EachRoleConfig.key)
+        if EachRoleConfig.key == 'HDFSNamenodeConfigHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType = "ROLE_TYPE_NAMENODE"
+        elif EachRoleConfig.key == 'HDFSDatanodeConfigHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType = "ROLE_TYPE_DATANODE"
+        elif EachRoleConfig.key == 'YarnNodeManagerHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType = "ROLE_TYPE_NODEMANAGER"
+        elif EachRoleConfig.key == 'YarnResourceManagerHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType ="ROLE_TYPE_RESOURCEMANAGER"
+        elif EachRoleConfig.key == 'HbaseMasterHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType = "ROLE_TYPE_HBASE_MASTER"
+        elif EachRoleConfig.key == 'HbaseRegionserverHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType = "ROLE_TYPE_HBASE_REGIONSERVER"
+        elif EachRoleConfig.key == 'HDFSJournalEachRoleConfigHost':
+            Host = EachRoleConfig.values[0]
+            CurRoleType = "ROLE_TYPE_HDFS_JOURNALNODE"
+        elif EachRoleConfig.key == 'Port':
+            Port = EachRoleConfig.values[0]
+        elif EachRoleConfig.key == 'Instance':
+            Role = EachRoleConfig.values[0]
+        elif EachRoleConfig.key == 'Verbose':
+            OutputFlag = bool(EachRoleConfig.values[0])
+        elif EachRoleConfig.key == 'JsonPath':
+            collectd.info('hadoop plugin cfg: %s' % EachRoleConfig.values[0])
+            JsonPath = EachRoleConfig.values[0]
+        else:
+            collectd.warning('hadoop plugin: Unsupported key: %s.' % EachRoleConfig.key)
+        #collectd.info('hadoop plugin cfg: %s.' % JsonPath)
+
+    if not Host or not Role or not CurRoleType or not Port:
+        collectd.error('hadoop plugin error: Host, Port, Instance and the role type must all be set.')
+    else:
+        CurrentConfigMap = {
+            'RoleInstance': Role,
+            'port': Port,
+            'host': Host,
+            'RoleType': CurRoleType ,
+            'OutputFlag': OutputFlag 
+        }
+
+        HadoopConfigs.append(CurrentConfigMap)
+
+def GetdataCallback():
+    GetJsonConfig() 
+    for EachConfig in HadoopConfigs:
+        Host = EachConfig['host'] 
+        Port = EachConfig['port'] 
+        RoleInstance = EachConfig['RoleInstance']
+        RoleType = EachConfig['RoleType'] 
+        OutputFlag = EachConfig['OutputFlag'] 
+        
+        if not isinstance(Port, int):
+            MyLog("Host Port is not a number", True)
+
+        JmxUrl = "http://" + Host + ":" + str(Port) + "/jmx"
+
+        try:
+            Contents = json.load(urllib2.urlopen(JmxUrl, timeout=5))
+        except urllib2.URLError as e:
+            if MyDebug == 1:
+                print(JmxUrl, e)
+            else:
+                collectd.error('hadoop plugin: can not connect to %s - %r' % (JmxUrl, e))
+            # skip this target: Contents is undefined when the request fails
+            continue
+
+        if MyDebug == 1:
+            print(RoleType)
+        else:
+            collectd.info('hadoop plugin: collecting role %s' % RoleType)
+
+        for RoleInfo in Contents["beans"]:
+            for RoleKey, RoleValue in CurRoleTypeInfo[RoleType].iteritems():
+                if RoleInfo['name'].startswith(RoleValue):
+                    for k, v in RoleInfo.iteritems():
+                        # collectd's dispatch interface only accepts numeric values (int and float)
+                        if isinstance(v, int) or isinstance(v, float):
+                            # 'gauge' is a value type defined in collectd's types.db
+                            Submit2Collectd('gauge', '.'.join((RoleKey, k)), v, RoleInstance, RoleType, OutputFlag)
+
+def Submit2Collectd(type, name, value, instance, instance_type, OutputFlag):
+    if value is None:
+        #value = ''
+        collectd.warning('hadoop plugin : value is None for key %s' % name)
+    else:
+        plugin_instance = '.'.join((instance, instance_type))
+        MyLog('%s [%s]: %s=%s' % (plugin_instance, type, name, value), OutputFlag)
+
+        if MyDebug == 0:
+            SendValue = collectd.Values(plugin='hadoop')
+            SendValue.type = type
+            SendValue.type_instance = name
+            SendValue.values = [value]
+            SendValue.plugin_instance = plugin_instance
+            # dummy metadata map; some collectd versions need non-empty meta to serialize values correctly
+            SendValue.meta = {'0': True}
+            SendValue.dispatch()
+
+def MyLog(msg, OutputFlag):
+    if OutputFlag:
+        if MyDebug == 1:
+            print(msg)
+        else:
+            collectd.info('hadoop plugin output: %s' % msg)
+
+def GetJsonConfig():
+    #JsonPath = "/opt/collectd/lib/collectd/plugins/hadooproleconfig.json"
+    MyLog("pwd:%s" % os.getcwd(), True)
+    # an explicit JsonPath wins; otherwise read from the working directory (collectd's BaseDir)
+    Path = JsonPath if JsonPath else "./hadooproleconfig.json"
+    with open(Path, 'r') as f:
+        data = json.load(f)
+        for k, v in data.iteritems():
+            if k.startswith("ROLE_TYPE"):
+                CurRoleTypeInfo[k] = v
+
+if __name__ == "__main__":
+    MyDebug = 1
+
+    # You can configure your role like the example below
+    ConfigMap = {
+        'RoleInstance': "instance",
+        'port': 8042,
+        'host': "lujian",
+        'RoleType': "ROLE_TYPE_NODEMANAGER",
+        'OutputFlag': True
+    }
+
+    HadoopConfigs.append(ConfigMap)
+    #print HadoopConfigs
+    GetdataCallback()
+else:
+    import collectd
+    MyDebug = 0
+    collectd.register_config(CfgCallback)
+    collectd.register_read(GetdataCallback)

http://git-wip-us.apache.org/repos/asf/incubator-eagle/blob/2f4df34c/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json
----------------------------------------------------------------------
diff --git a/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json 
b/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json
new file mode 100644
index 0000000..59bc8e1
--- /dev/null
+++ b/eagle-external/hadoop_jmx_collectd/hadooproleconfig.json
@@ -0,0 +1,57 @@
+{
+    "ROLE_TYPE_NAMENODE": {
+        "FSNamesystem": "Hadoop:service=NameNode,name=FSNamesystemState",
+        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
+        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=NameNode,name=JvmMetrics"
+    },
+    "ROLE_TYPE_DATANODE": {
+        "DatanodeActivity": "Hadoop:service=DataNode,name=DataNodeActivity",
+        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
+        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=DataNode,name=JvmMetrics"
+    },
+    "ROLE_TYPE_HDFS_JOURNALNODE": {
+        "GCPSScavenge": "java.lang:type=GarbageCollector,name=PS Scavenge",
+        "GCPSMarkSweep": "java.lang:type=GarbageCollector,name=PS MarkSweep",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=JournalNode,name=JvmMetrics"
+    },
+    "ROLE_TYPE_HBASE_MASTER": {
+        "MasterBalancer": "Hadoop:service=HBase,name=Master,sub=Balancer",
+        "MasterAssignmentManager": "Hadoop:service=HBase,name=Master,sub=AssignmentManger",
+        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
+        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
+        "MasterServer": "Hadoop:service=HBase,name=Master,sub=Server",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=HBase,name=JvmMetrics"
+    },
+    "ROLE_TYPE_HBASE_REGIONSERVER": {
+        "Regions": "Hadoop:service=HBase,name=RegionServer,sub=Regions",
+        "Replication": "Hadoop:service=HBase,name=RegionServer,sub=Replication",
+        "WAL": "Hadoop:service=HBase,name=RegionServer,sub=WAL",
+        "Server": "Hadoop:service=HBase,name=RegionServer,sub=Server",
+        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
+        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=HBase,name=JvmMetrics"
+    },
+    "ROLE_TYPE_NODEMANAGER": {
+        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
+        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
+        "NodeManagerMetrics": "Hadoop:service=NodeManager,name=NodeManagerMetrics",
+        "NodeManagerShuffleMetrics": "Hadoop:service=NodeManager,name=ShuffleMetrics",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=NodeManager,name=JvmMetrics"
+    },
+    "ROLE_TYPE_RESOURCEMANAGER": {
+        "GCParNew": "java.lang:type=GarbageCollector,name=ParNew",
+        "GCCMSMarkSweep": "java.lang:type=GarbageCollector,name=ConcurrentMarkSweep",
+        "UgiMetrics": "Hadoop:service=ResourceManager,name=UgiMetrics",
+        "ClusterMetrics": "Hadoop:service=ResourceManager,name=ClusterMetrics",
+        "Threading": "java.lang:type=Threading",
+        "JvmMetrics": "Hadoop:service=ResourceManager,name=JvmMetrics"
+    }
+}
