Robert Nettleton created AMBARI-12532:
-----------------------------------------

             Summary: Blueprint deployment results in some configurations not being resolved properly
                 Key: AMBARI-12532
                 URL: https://issues.apache.org/jira/browse/AMBARI-12532
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server
    Affects Versions: 2.1.0
            Reporter: Robert Nettleton
            Assignee: Robert Nettleton
            Priority: Critical
             Fix For: 2.1.1


Deployment of a cluster using Blueprints will sometimes fail to properly update 
all the configurations on a given cluster prior to attempting the install and 
start phases for each component.

This problem occurs intermittently, and tends to occur only when creating 
larger clusters (50 or more nodes).

This problem can cause a variety of failures, but the most common is that a 
given configuration property on a single host is not updated as expected by the 
Blueprints processor.  This can cause service startup failures, such as:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: %HOSTGROUP::host_group_1%:8020
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:197)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
        at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
        at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:677)
        at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:645)
        at org.apache.hadoop.hdfs.DFSUtil.getAddresses(DFSUtil.java:628)
        at org.apache.hadoop.hdfs.DFSUtil.getHaNnRpcAddresses(DFSUtil.java:727)
        at org.apache.hadoop.hdfs.HAUtil.isHAEnabled(HAUtil.java:77)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:120)
        at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:177)
{code}
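
To illustrate the failure mode in the trace above, here is a minimal sketch of what resolving a %HOSTGROUP::name% token might look like, and how a leftover token could be detected. The class name, method names, and the "first host in the group" choice are all hypothetical and are not taken from the Ambari code base; only the token format comes from the error message.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not Ambari's actual implementation.
class HostGroupTokenSketch {
    // Matches tokens of the form %HOSTGROUP::name%, as seen in the stack trace.
    private static final Pattern TOKEN = Pattern.compile("%HOSTGROUP::([^%]+)%");

    /** Returns true if the value still contains an unresolved host group token. */
    static boolean hasUnresolvedToken(String value) {
        return TOKEN.matcher(value).find();
    }

    /**
     * Replaces each token with the first host of the named group.
     * Tokens that name an unknown or empty group are left unchanged.
     */
    static String resolve(String value, Map<String, List<String>> hostsByGroup) {
        Matcher m = TOKEN.matcher(value);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            List<String> hosts = hostsByGroup.get(m.group(1));
            String replacement =
                (hosts == null || hosts.isEmpty()) ? m.group() : hosts.get(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed example host name; the property value is the one from the trace.
        Map<String, List<String>> hosts =
            Map.of("host_group_1", List.of("nn1.example.com"));
        String raw = "%HOSTGROUP::host_group_1%:8020";
        String resolved = resolve(raw, hosts);
        System.out.println(raw + " -> " + resolved);
        System.out.println("still unresolved: " + hasUnresolvedToken(resolved));
    }
}
```

A check like hasUnresolvedToken could serve as a sanity test on a host's configuration before a component is started, since a value that still contains the token cannot be parsed as host:port.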

The Blueprints processor should wait for all required configuration types that 
have been modified to move to the "TOPOLOGY_RESOLVED" state across the cluster 
before attempting the first INSTALL task on the cluster. 
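
The proposed wait could be sketched roughly as follows. This is only an illustration of the idea, assuming that something notifies the gate once a configuration type has reached "TOPOLOGY_RESOLVED"; the class ResolvedConfigGate, its methods, and the timeout handling are hypothetical and do not describe the actual patch.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a gate that blocks INSTALL until all required
// configuration types have been reported as TOPOLOGY_RESOLVED.
class ResolvedConfigGate {
    // Configuration types that have not yet reached TOPOLOGY_RESOLVED.
    private final Set<String> pending;

    ResolvedConfigGate(Set<String> requiredConfigTypes) {
        this.pending = new HashSet<>(requiredConfigTypes);
    }

    /** Called when a configuration type has reached TOPOLOGY_RESOLVED. */
    synchronized void markResolved(String configType) {
        pending.remove(configType);
        if (pending.isEmpty()) {
            notifyAll(); // wake any thread waiting to start INSTALL
        }
    }

    /**
     * Blocks until every required type is resolved or the timeout expires.
     * Returns true if all types were resolved, false on timeout.
     */
    synchronized boolean awaitAllResolved(long timeout, TimeUnit unit)
            throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (!pending.isEmpty()) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                return false;
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed example configuration types.
        ResolvedConfigGate gate =
            new ResolvedConfigGate(Set.of("hdfs-site", "core-site"));

        // Simulates the configuration update finishing on another thread.
        Thread updater = new Thread(() -> {
            gate.markResolved("hdfs-site");
            gate.markResolved("core-site");
        });
        updater.start();

        if (gate.awaitAllResolved(5, TimeUnit.SECONDS)) {
            System.out.println("All configurations resolved; INSTALL may begin.");
        } else {
            System.out.println("Timed out waiting for configuration resolution.");
        }
        updater.join();
    }
}
```

The timeout is a design choice in this sketch: without it, a configuration type that never resolves would block the deployment indefinitely instead of surfacing an error.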

I'm working on a fix for this, and will be submitting a patch shortly. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
