[ https://issues.apache.org/jira/browse/AMBARI-9458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14310439#comment-14310439 ]
Hudson commented on AMBARI-9458:
--------------------------------
FAILURE: Integrated in Ambari-trunk-Commit #1712 (See
[https://builds.apache.org/job/Ambari-trunk-Commit/1712/])
AMBARI-9458 - HDFS, YARN, and HBase Slave Health Alert Definitions (Yurii
Shylov via jonathanhurley) (jhurley:
http://git-wip-us.apache.org/repos/asf?p=ambari.git&a=commit&h=405b3762c5bde6a929f7b22732fa39b42bd24291)
* ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py
* ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json
* ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/alerts.json
* ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json
> HDFS, YARN, and HBase Slave Health Alert Definitions
> ----------------------------------------------------
>
> Key: AMBARI-9458
> URL: https://issues.apache.org/jira/browse/AMBARI-9458
> Project: Ambari
> Issue Type: Task
> Components: ambari-server
> Affects Versions: 2.0.0
> Reporter: Yurii Shylov
> Assignee: Yurii Shylov
> Fix For: 2.0.0
>
> Attachments: AMBARI-9458.patch
>
>
> When a slave component, such as a DataNode, encounters a catastrophic
> problem like a heap allocation error and can no longer perform its work, the
> NameNode marks this DataNode as being unhealthy.
> The current alert definitions only check whether the DataNode process is
> alive, which it technically still is. We need to add new alert definitions
> for:
> - HDFS/DataNode (runs on NameNode, query is to NameNode JMX)
> - YARN/NodeManager (runs on ResourceManager, query is to ResourceManager JMX)
> - HBase/RegionServer (runs on HBase Master, queries HBase Master JMX)
> These definitions will check for slaves that are in some sort of bad state.
> Depending on the JMX structures that need to be queried, they can be either
> METRIC or SCRIPT style alert definitions.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)