-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/30566/#review71013
-----------------------------------------------------------


ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116515>

    RegionServer(s)


ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116516>

    Warning not needed since it has the same value as Critical.


ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116518>

    RegionServer(s)


ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116517>

    RegionServer(s)


ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116519>

    datanode_health_summary
    DataNode Health Summary


ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116520>

    There is a space in the https address


ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116521>

    DataNode(s)


ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116523>

    No need for Warning since Critical is the same value


ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116524>

    DataNode(s)


ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/alerts.json
<https://reviews.apache.org/r/30566/#comment116525>

    nodemanager_health_summary
    NodeManager Health Summary


ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py
<https://reviews.apache.org/r/30566/#comment116527>

    NodeManager


ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py
<https://reviews.apache.org/r/30566/#comment116528>

    All NodeManagers are healthy


- Jonathan Hurley


On Feb. 3, 2015, 12:58 p.m., Yurii Shylov wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/30566/
> -----------------------------------------------------------
>
> (Updated Feb. 3, 2015, 12:58 p.m.)
>
>
> Review request for Ambari, Jonathan Robie and Srimanth Gunturi.
>
>
> Bugs: AMBARI-9458
>     https://issues.apache.org/jira/browse/AMBARI-9458
>
>
> Repository: ambari
>
>
> Description
> -------
>
> When a slave component, such as a DataNode, encounters some catastrophic
> problem like a heap allocation error, and no longer can perform its work, the
> NameNode marks this DataNode as being unhealthy.
>
> The current alert definitions only check for the DataNode process being
> alive, which it technically still is. We need to add new alert definitions
> for:
>
> - HDFS/DataNode (runs on NameNode, query is to NameNode JMX)
> - YARN/NodeManager (runs on ResourceManager, query is to ResourceManager JMX)
> - HBase/RegionServer (runs on HBase Master, queries HBase Master JMX)
>
> Which will check for slaves that are in some sort of bad state. Depending on
> the JMX structures that need to be queried, these can either be METRIC or
> SCRIPT style alert definitions.
>
>
> Diffs
> -----
>
>   ambari-server/src/main/resources/common-services/HBASE/0.96.0.2.0/alerts.json fa911e1
>   ambari-server/src/main/resources/common-services/HDFS/2.1.0.2.0/alerts.json b8a20ac
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/alerts.json dc4fafd
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/alerts/alert_nodemanagers_summary.py PRE-CREATION
>
> Diff: https://reviews.apache.org/r/30566/diff/
>
>
> Testing
> -------
>
> In progress
>
>
> Thanks,
>
> Yurii Shylov
>
>
