Dmitry Sen created AMBARI-2928:
----------------------------------

             Summary: Add a Nagios alert to check state of NN HA
                 Key: AMBARI-2928
                 URL: https://issues.apache.org/jira/browse/AMBARI-2928
             Project: Ambari
          Issue Type: Improvement
          Components: agent
    Affects Versions: 1.4.0
            Reporter: Dmitry Sen
            Assignee: Dmitry Sen
             Fix For: 1.4.0
         Attachments: AMBARI-2928.patch

Add a Nagios alert for NameNode (NN) HA state.

Title: "NameNode HA Healthy"

The alert checks that one NN reports tag.HAState = active and the second NN
reports tag.HAState = standby. Any other combination is CRITICAL.
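As a rough illustration of how the tag.HAState value might be read, here is a
minimal sketch. It assumes the NameNode metrics are available as a JMX-style
JSON document of the form {"beans": [{..., "tag.HAState": "active"}]}. The
function name ha_state_from_jmx and the sample payload are hypothetical. The
ticket does not say which URL or bean the patch queries, so fetching is left
out.

```python
import json
from typing import Optional


def ha_state_from_jmx(payload: str) -> Optional[str]:
    """Return the 'tag.HAState' value found in a JMX-style JSON payload.

    Returns None when the payload is not valid JSON or when no bean carries
    the tag, which a caller can treat as "NameNode unavailable".
    The payload layout is an assumption, not taken from the ticket.
    """
    try:
        beans = json.loads(payload).get("beans", [])
    except (ValueError, AttributeError):
        # Invalid JSON, or valid JSON whose top level is not an object.
        return None
    for bean in beans:
        if isinstance(bean, dict) and "tag.HAState" in bean:
            return bean["tag.HAState"]
    return None


# Hypothetical sample payload for illustration only.
sample = '{"beans": [{"name": "example", "tag.HAState": "standby"}]}'
state = ha_state_from_jmx(sample)
```

A caller would map a None result to the Unavailable list used in the alert
output.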

Scenarios:
1. Active + Standby NN are up
   OK: NameNode HA healthy true; Active<dev01.hortonworks.com>,
   Standby<dev02.hortonworks.com>, Unavailable<>
2. Two Standby NNs are up
   CRITICAL: No Active NN available; Active<>,
   Standby<dev01.hortonworks.com:dev02.hortonworks.com>, Unavailable<>
3. Two Active NNs are up
   CRITICAL: No Active NN available; No failover NN available;
   Active<dev01.hortonworks.com:dev02.hortonworks.com>, Standby<>,
   Unavailable<>
4. Both NNs unavailable
   CRITICAL: No Active NN available; No failover NN available: Active<>,
   Standby<>, Unavailable<dev01.hortonworks.com:dev02.hortonworks.com>
5. Only one NameNode in cluster (no additional/standby NameNode configured)
   CRITICAL: No failover NN available: Active<dev01.hortonworks.com>,
   Standby<>, Unavailable<>
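
The five scenarios above can be reproduced with a small decision rule. The
sketch below is not the attached patch. It assumes a rule inferred from the
scenarios: report "No Active NN available" unless exactly one NN is active,
and report "No failover NN available" when no NN is standby. The name
check_nn_ha and the host-to-state input mapping are hypothetical. The sketch
always puts "; " before the host lists. Exit codes 0 and 2 are the standard
Nagios plugin codes for OK and CRITICAL.

```python
from typing import Dict, List, Tuple

OK, CRITICAL = 0, 2  # standard Nagios plugin exit codes


def check_nn_ha(states: Dict[str, str]) -> Tuple[int, str]:
    """Classify NameNode HA health from a host -> HAState mapping.

    Any state other than 'active' or 'standby' counts as unavailable.
    The rule is inferred from the ticket's scenarios (an assumption).
    """
    active = [h for h, s in states.items() if s == "active"]
    standby = [h for h, s in states.items() if s == "standby"]
    unavailable = [h for h, s in states.items()
                   if s not in ("active", "standby")]

    problems: List[str] = []
    if len(active) != 1:
        problems.append("No Active NN available")
    if not standby:
        problems.append("No failover NN available")

    # Hosts within one group are joined with ':' as in the ticket's examples.
    hosts = "Active<{}>, Standby<{}>, Unavailable<{}>".format(
        ":".join(active), ":".join(standby), ":".join(unavailable))

    if problems:
        return CRITICAL, "CRITICAL: {}; {}".format("; ".join(problems), hosts)
    return OK, "OK: NameNode HA healthy true; {}".format(hosts)


code, message = check_nn_ha({
    "dev01.hortonworks.com": "active",
    "dev02.hortonworks.com": "standby",
})
```

A Nagios wrapper would print the message and exit with the returned code.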
