Nicholas Yao created AMBARI-7284:
------------------------------------
Summary: Hadoop cluster alerts have not been updated for Hadoop
2.4 and 2.5
Key: AMBARI-7284
URL: https://issues.apache.org/jira/browse/AMBARI-7284
Project: Ambari
Issue Type: Bug
Affects Versions: 1.6.0
Reporter: Nicholas Yao
Many of the /var/log/messages alerts we previously keyed off of are no longer
working or valid. It appears that many Hadoop 1.x terms, such as jobtracker,
tasktracker, and templeton, still exist.
I believe the existing rules need to be modified for the following service name
changes:
resourcemanager_process_down
resourcemanager_process_down_ok
resourcemanager_rpc_latency
resourcemanager_rpc_latency_ok
resourcemanager_cpu_utilization
resourcemanager_cpu_utilization_ok
nodemanagers_down
nodemanagers_down_ok
nodemanager_process_down
nodemanager_process_down_ok
webhcat_down
webhcat_down_ok
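
To illustrate the kind of rename I mean, here is a rough sketch. The mapping of
old terms to new ones (jobtracker to resourcemanager, tasktracker to
nodemanager, templeton to webhcat) and the old alert names used as inputs are my
assumptions about the Hadoop 1.x rules, not something copied from the Ambari
rule files. TERM_RENAMES and rename_alert are hypothetical names.

```python
import re

# Assumed Hadoop 1.x -> 2.x term mapping; not taken from Ambari's rule files.
TERM_RENAMES = {
    "jobtracker": "resourcemanager",
    "tasktracker": "nodemanager",
    "templeton": "webhcat",
}

# One alternation over all old terms so each name is scanned only once.
_OLD_TERMS = re.compile("|".join(re.escape(term) for term in TERM_RENAMES))


def rename_alert(name: str) -> str:
    """Replace any Hadoop 1.x term in an alert name with its 2.x equivalent."""
    return _OLD_TERMS.sub(lambda m: TERM_RENAMES[m.group(0)], name)
```

Names that already use the 2.x terms pass through unchanged, so the same
function could be run over a mixed set of rules.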
It also appears that existing messages are being improperly matched, as we see
the following HADOOP_UNKNOWN_MSG in /var/log/messages:
Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: HADOOP_UNKNOWN_MSG# Event
Host=pitH1.td.teradata.com Service Description=HDFS::Percent DataNodes with
space available(WARNING), WARNING: total:6, affected:1
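
A quick way to find how many alerts fall through to HADOOP_UNKNOWN_MSG is to
scan the log for that message ID. The sketch below assumes each alert is a
single line in /var/log/messages (the sample above is wrapped by the mailer) and
that every line follows the same layout as that sample. LINE_RE, parse_alert,
and is_unmatched are hypothetical names, not part of Ambari or Nagios.

```python
import re

# Layout inferred from the single sample line above; other alerts may differ.
LINE_RE = re.compile(
    r"nagios\[\d+\]: (?P<severity>\w+): Hadoop: (?P<msg_id>\w+)# Event "
    r"Host=(?P<host>\S+) Service Description=(?P<service>.+?)\((?P<state>\w+)\)"
)


def parse_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def is_unmatched(line):
    """True when the line is a Hadoop alert that no existing rule recognized."""
    parsed = parse_alert(line)
    return parsed is not None and parsed["msg_id"] == "HADOOP_UNKNOWN_MSG"
```

Collecting the service field from every unmatched line should give the list of
service descriptions that still need a rule.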