Nicholas Yao created AMBARI-7284:
------------------------------------

             Summary: Hadoop cluster alerts have not been updated for Hadoop 2.4 and 2.5
                 Key: AMBARI-7284
                 URL: https://issues.apache.org/jira/browse/AMBARI-7284
             Project: Ambari
          Issue Type: Bug
    Affects Versions: 1.6.0
            Reporter: Nicholas Yao


Many of the /var/log/messages alerts we previously keyed off of are no longer
working or valid. It appears that many Hadoop 1.x terms such as jobtracker,
tasktracker, and templeton still exist.
I believe the existing rules need to be modified for the following service name
changes:
resourcemanager_process_down
resourcemanager_process_down_ok
resourcemanager_rpc_latency
resourcemanager_rpc_latency_ok
resourcemanager_cpu_utilization
resourcemanager_cpu_utilization_ok
nodemanagers_down
nodemanagers_down_ok
nodemanager_process_down
nodemanager_process_down_ok
webhcat_down
webhcat_down_ok
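
A minimal sketch of the rename, assuming the old rules follow the same naming
pattern with the Hadoop 1.x terms in place of the new ones. The old rule names
used here (for example jobtracker_process_down) and the helper rename_rule are
hypothetical and are not taken from the Ambari rule files:

```python
# Hadoop 1.x term -> Hadoop 2.x replacement, per the service names listed above.
RENAMES = {
    "jobtracker": "resourcemanager",
    "tasktracker": "nodemanager",
    "templeton": "webhcat",
}


def rename_rule(name):
    """Return the rule name with any Hadoop 1.x term swapped for its 2.x term."""
    for old, new in RENAMES.items():
        # Plain substring replacement also covers plurals such as "tasktrackers".
        name = name.replace(old, new)
    return name


if __name__ == "__main__":
    # Hypothetical old rule names, for illustration only.
    for old_name in ("jobtracker_process_down", "tasktrackers_down_ok", "templeton_down"):
        print(old_name, "->", rename_rule(old_name))
```

Names that contain none of the 1.x terms pass through unchanged, so the helper
could be run over a whole rule list without special-casing.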


It also appears that existing messages are getting improperly matched as we see 
the following HADOOP_UNKNOWN_MSG in /var/log/messages:
Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: HADOOP_UNKNOWN_MSG# Event Host=pitH1.td.teradata.com Service Description=HDFS::Percent DataNodes with space available(WARNING), WARNING: total:6, affected:1
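
To help check which service descriptions fall through to HADOOP_UNKNOWN_MSG, a
line like the one above could be split into its fields. This is only a sketch:
the regular expression is inferred from this single sample line, not from the
actual matching rules, and the names PATTERN and parse_alert are hypothetical:

```python
import re

# The sample line from /var/log/messages, rejoined onto one logical line.
LINE = (
    "Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: "
    "HADOOP_UNKNOWN_MSG# Event Host=pitH1.td.teradata.com "
    "Service Description=HDFS::Percent DataNodes with space "
    "available(WARNING), WARNING: total:6, affected:1"
)

# Layout assumed from the sample only:
#   Hadoop: <msg_id># Event Host=<host> Service Description=<service>(<state>), <detail>
PATTERN = re.compile(
    r"Hadoop: (?P<msg_id>\w+)# Event Host=(?P<host>\S+) "
    r"Service Description=(?P<service>.+?)\((?P<state>\w+)\), "
    r"(?P<detail>.*)$"
)


def parse_alert(line):
    """Return a dict of the alert fields, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_alert(LINE))
```

Grouping the parsed lines by service description would show which alerts are
not being matched by the existing rules.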




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
