Robert Nettleton created AMBARI-12532:
-----------------------------------------
Summary: Blueprint deployment results in some configurations not
being resolved properly
Key: AMBARI-12532
URL: https://issues.apache.org/jira/browse/AMBARI-12532
Project: Ambari
Issue Type: Bug
Components: ambari-server
Affects Versions: 2.1.0
Reporter: Robert Nettleton
Assignee: Robert Nettleton
Priority: Critical
Fix For: 2.1.1
Deployment of a cluster using Blueprints will sometimes fail to properly update
all the configurations on a given cluster prior to attempting the install and
start phases for each component.
This problem occurs intermittently, and tends to occur only when creating
larger clusters (50 or more nodes).
This problem can cause a variety of failures, but the most common is that a
given configuration property on a single host is not updated as expected by the
Blueprints processor, so a placeholder such as %HOSTGROUP::host_group_1% is
left in place of the resolved hostname. This can cause service startup
failures, such as:
{code}
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: %HOSTGROUP::host_group_1%:8020
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:197)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
    at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:677)
    at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:645)
    at org.apache.hadoop.hdfs.DFSUtil.getAddresses(DFSUtil.java:628)
    at org.apache.hadoop.hdfs.DFSUtil.getHaNnRpcAddresses(DFSUtil.java:727)
    at org.apache.hadoop.hdfs.HAUtil.isHAEnabled(HAUtil.java:77)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:120)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:177)
{code}
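For anyone trying to confirm they are hitting this issue, a rough way to spot the symptom is to scan a host's configuration properties for leftover host group tokens like the one in the trace above. The following is only an illustrative sketch: the class name, method name, and the idea of passing properties in as a plain map are assumptions for the example, not part of Ambari.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not an Ambari class.
public class UnresolvedTokenCheck {
    // Matches tokens of the form seen in the stack trace, e.g. %HOSTGROUP::host_group_1%.
    private static final Pattern HOSTGROUP_TOKEN = Pattern.compile("%HOSTGROUP::[^%]+%");

    /** Returns the names of properties whose values still contain a host group token. */
    public static List<String> findUnresolved(Map<String, String> properties) {
        List<String> unresolved = new ArrayList<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            String value = entry.getValue();
            if (value != null && HOSTGROUP_TOKEN.matcher(value).find()) {
                unresolved.add(entry.getKey());
            }
        }
        return unresolved;
    }
}
```

A non-empty result for a host's hdfs-site properties would suggest that the host received a configuration that the Blueprints processor had not yet finished updating.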
The Blueprints processor should wait for all required configuration types that
have been modified to move to the "TOPOLOGY_RESOLVED" state across the cluster
before attempting the first INSTALL task on the cluster.
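To make the proposed behavior concrete, here is a minimal sketch of such a gate. It is not the actual patch: the class, the method names, the polling approach, and the map of config type to current state are all assumptions made for illustration. Only the "TOPOLOGY_RESOLVED" state name comes from this issue.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch, not the Ambari implementation.
public class TopologyResolvedGate {
    static final String TOPOLOGY_RESOLVED = "TOPOLOGY_RESOLVED";

    /** True only when every required config type reports the TOPOLOGY_RESOLVED state. */
    public static boolean allResolved(Set<String> requiredTypes, Map<String, String> stateByType) {
        for (String type : requiredTypes) {
            if (!TOPOLOGY_RESOLVED.equals(stateByType.get(type))) {
                return false;
            }
        }
        return true;
    }

    /**
     * Polls stateSource until all required types are resolved or the timeout expires.
     * Returns true if it is safe to queue the first INSTALL task, false otherwise.
     */
    public static boolean awaitResolved(Set<String> requiredTypes,
                                        Supplier<Map<String, String>> stateSource,
                                        long timeoutMillis,
                                        long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (allResolved(requiredTypes, stateSource.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; caller should not start INSTALL
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

The important design point is that the check covers every modified configuration type, so a single lagging type is enough to hold back the INSTALL phase.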
I'm working on a fix for this, and will be submitting a patch shortly.