[
https://issues.apache.org/jira/browse/AMBARI-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Foley updated AMBARI-7284:
-------------------------------
Description:
Many /var/log/messages alerts that we previously keyed off of are no longer
working or valid. It appears that many Hadoop 1.x terms such as jobtracker,
tasktracker and templeton still exist when Ambari is used with Hadoop 2.x.
I believe existing rules need to be modified for the following service name
changes:
resourcemanager_process_down
resourcemanager_process_down_ok
resourcemanager_rpc_latency
resourcemanager_rpc_latency_ok
resourcemanager_cpu_utilization
resourcemanager_cpu_utilization_ok
nodemanagers_down
nodemanagers_down_ok
nodemanager_process_down
nodemanager_process_down_ok
webhcat_down
webhcat_down_ok
It also appears that existing messages are getting improperly matched as we see
the following HADOOP_UNKNOWN_MSG in /var/log/messages:
Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: HADOOP_UNKNOWN_MSG# Event
Host=pitH1.td.teradata.com Service Description=HDFS::Percent DataNodes with
space available(WARNING), WARNING: total:6, affected:1
was:
many /var/log/message alerts we keyed off of previously are no longer working
or valid. It appears that many hadoop 1.x terms such as jobtracker, tasktracker
and templeton still exist.
I believe existing rules need to be modified for the follow service name
changes:
resourcemanager_process_down
resourcemanager_process_down_ok
resourcemanager_rpc_latency
resourcemanager_rpc_latency_ok
resourcemanager_cpu_utilization
resourcemanager_cpu_utilization_ok
nodemanagers_down
nodemanagers_down_ok
nodemanager_process_down
nodemanager_process_down_ok
webhcat_down
webhcat_down_ok
It also appears that existing messages are getting improperly matched as we see
the following HADOOP_UNKNOWN_MSG in /var/log/messages:
Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: HADOOP_UNKNOWN_MSG# Event
Host=pitH1.td.teradata.com Service Description=HDFS::Percent DataNodes with
space available(WARNING), WARNING: total:6, affected:1
Environment: Hadoop branch-2, release 2.4 or 2.5.
> Hadoop cluster alerts have not been updated for Hadoop 2.4 and 2.5
> ------------------------------------------------------------------
>
> Key: AMBARI-7284
> URL: https://issues.apache.org/jira/browse/AMBARI-7284
> Project: Ambari
> Issue Type: Bug
> Affects Versions: 1.6.0
> Environment: Hadoop branch-2, release 2.4 or 2.5.
> Reporter: Nicholas Yao
>
> Many /var/log/messages alerts that we previously keyed off of are no longer
> working or valid. It appears that many Hadoop 1.x terms such as jobtracker,
> tasktracker and templeton still exist when Ambari is used with Hadoop 2.x.
> I believe existing rules need to be modified for the following service name
> changes:
> resourcemanager_process_down
> resourcemanager_process_down_ok
> resourcemanager_rpc_latency
> resourcemanager_rpc_latency_ok
> resourcemanager_cpu_utilization
> resourcemanager_cpu_utilization_ok
> nodemanagers_down
> nodemanagers_down_ok
> nodemanager_process_down
> nodemanager_process_down_ok
> webhcat_down
> webhcat_down_ok
> It also appears that existing messages are getting improperly matched as we
> see the following HADOOP_UNKNOWN_MSG in /var/log/messages:
> Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: HADOOP_UNKNOWN_MSG#
> Event Host=pitH1.td.teradata.com Service Description=HDFS::Percent DataNodes
> with space available(WARNING), WARNING: total:6, affected:1
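
For anyone trying to reproduce the mismatch, the sample line above can be split into its fields with a small script. This is only a rough sketch: the field layout is inferred from the single log line quoted in this issue, and the names `ALERT_PATTERN`, `parse_alert` and `SAMPLE` are hypothetical, not part of Ambari or Nagios. It does not reflect how Ambari's own rules are written.

```python
import re

# Pattern inferred from the one sample line in this issue (an assumption,
# not the format definition used by Ambari or Nagios).
ALERT_PATTERN = re.compile(
    r"nagios\[\d+\]: (?P<severity>\w+): Hadoop: (?P<msg_id>\w+)# "
    r"Event Host=(?P<host>\S+) "
    r"Service Description=(?P<service>.+?)\((?P<state>\w+)\), "
    r"(?P<detail>.*)$"
)


def parse_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = ALERT_PATTERN.search(line)
    return match.groupdict() if match else None


# The log line from this issue, joined back onto a single line.
SAMPLE = (
    "Jul 15 10:36:34 pitH1 nagios[35331]: Warning: Hadoop: HADOOP_UNKNOWN_MSG# "
    "Event Host=pitH1.td.teradata.com Service Description=HDFS::Percent "
    "DataNodes with space available(WARNING), WARNING: total:6, affected:1"
)

if __name__ == "__main__":
    fields = parse_alert(SAMPLE)
    # Lines carrying HADOOP_UNKNOWN_MSG are the ones no existing rule matched.
    if fields and fields["msg_id"] == "HADOOP_UNKNOWN_MSG":
        print("unmatched alert:", fields["service"], "on", fields["host"])
```

Counting the distinct `service` values among lines tagged HADOOP_UNKNOWN_MSG would give a list of service descriptions that currently have no matching rule.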