Zack Marsh created AMBARI-24531:
-----------------------------------
Summary: Persistent critical "NameNode High Availability Health"
alert after installing with 3 NameNodes
Key: AMBARI-24531
URL: https://issues.apache.org/jira/browse/AMBARI-24531
Project: Ambari
Issue Type: Bug
Components: alerts
Affects Versions: 2.7.0
Environment: sles12sp2
Reporter: Zack Marsh
After installing Hadoop with 3 NameNodes, the Ambari UI shows a persistent
critical alert for the HDFS service, even though one NameNode is Active and
the other two are Standby:
{code:java}
NameNode High Availability Health:
Active['hdp2.labs.teradata.com:50070'], Standby['hdp1.labs.teradata.com:50070',
'hdp3.labs.teradata.com:50070'], Unknown[]
{code}
This appears to stem from the {{alert_ha_namenode_health.py}} script, which
deems the NameNode topology healthy only if there is exactly 1 Active and
exactly 1 Standby NameNode. With 3 NameNodes there are 2 Standbys, so the
check fails.
Excerpt from the alert_ha_namenode_health.py script:
{code:java}
# there's only one scenario here; there is exactly 1 active and 1 standby
is_topology_healthy = len(active_namenodes) == 1 and len(standby_namenodes) == 1

result_label = 'Active{0}, Standby{1}, Unknown{2}'.format(str(active_namenodes),
    str(standby_namenodes), str(unknown_namenodes))

if is_topology_healthy:
    # if there is exactly 1 active and 1 standby NN
    return (RESULT_STATE_OK, [result_label])
else:
    # other scenario
    return (RESULT_STATE_CRITICAL, [result_label])
{code}
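To make the failure mode concrete, here is a minimal standalone sketch of the check above. The wrapper function {{check_topology}} and the string values of the two result constants are assumptions for illustration only; the real script has its own definitions and gathers the NameNode lists itself.

```python
# Stand-in values for illustration; the real script defines its own constants.
RESULT_STATE_OK = 'OK'
RESULT_STATE_CRITICAL = 'CRITICAL'


def check_topology(active_namenodes, standby_namenodes, unknown_namenodes):
    """Hypothetical wrapper around the excerpted health check."""
    # Same condition as the excerpt: exactly 1 active and exactly 1 standby.
    is_topology_healthy = len(active_namenodes) == 1 and len(standby_namenodes) == 1

    result_label = 'Active{0}, Standby{1}, Unknown{2}'.format(
        str(active_namenodes), str(standby_namenodes), str(unknown_namenodes))

    if is_topology_healthy:
        return (RESULT_STATE_OK, [result_label])
    return (RESULT_STATE_CRITICAL, [result_label])


# Classic 2-NameNode HA pair: 1 active, 1 standby.
two_nn = check_topology(
    ['hdp2.labs.teradata.com:50070'],
    ['hdp1.labs.teradata.com:50070'],
    [])

# The 3-NameNode layout from this report: 1 active, 2 standby, 0 unknown.
three_nn = check_topology(
    ['hdp2.labs.teradata.com:50070'],
    ['hdp1.labs.teradata.com:50070', 'hdp3.labs.teradata.com:50070'],
    [])

print(two_nn[0])
print(three_nn[0])
print(three_nn[1][0])
```

Under these assumptions the 2-NameNode case passes while the 3-NameNode case is flagged, although no NameNode is in an unknown state.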
Currently using the following workaround:
1. Replace the following line in {{alert_ha_namenode_health.py}}:
{code:java}
is_topology_healthy = len(active_namenodes) == 1 and len(standby_namenodes) == 1
{code}
With:
{code:java}
is_topology_healthy = len(active_namenodes) == 1 and len(standby_namenodes) == len(nn_unique_ids) - 1
{code}
2. Restart Ambari Server
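
The effect of the workaround can be sketched in isolation as follows. This assumes {{nn_unique_ids}} is the collection of configured NameNode IDs for the nameservice, which is how the workaround line uses it. The helper name {{is_topology_healthy_patched}} and the sample IDs {{nn1}}, {{nn2}}, {{nn3}} are hypothetical.

```python
def is_topology_healthy_patched(active_namenodes, standby_namenodes, nn_unique_ids):
    """Hypothetical helper holding the workaround condition.

    Healthy means exactly 1 active NameNode and every other configured
    NameNode in standby, however many NameNodes are configured.
    """
    return (len(active_namenodes) == 1
            and len(standby_namenodes) == len(nn_unique_ids) - 1)


# 2 NameNodes configured: same outcome as the original check.
healthy_two = is_topology_healthy_patched(
    ['hdp2.labs.teradata.com:50070'],
    ['hdp1.labs.teradata.com:50070'],
    ['nn1', 'nn2'])

# 3 NameNodes configured, 1 active and 2 standby: now treated as healthy.
healthy_three = is_topology_healthy_patched(
    ['hdp2.labs.teradata.com:50070'],
    ['hdp1.labs.teradata.com:50070', 'hdp3.labs.teradata.com:50070'],
    ['nn1', 'nn2', 'nn3'])

# 3 NameNodes configured but only 1 standby reported: still unhealthy.
degraded_three = is_topology_healthy_patched(
    ['hdp2.labs.teradata.com:50070'],
    ['hdp1.labs.teradata.com:50070'],
    ['nn1', 'nn2', 'nn3'])

print(healthy_two, healthy_three, degraded_three)
```

With this condition a NameNode that drops out of standby still makes the check fail, so the alert keeps its purpose for clusters of any size.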
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)