[ https://issues.apache.org/jira/browse/AMBARI-12532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Nettleton updated AMBARI-12532:
--------------------------------------
Description:
Deployment of a cluster using Blueprints will sometimes fail to properly update
all the configurations on a given cluster prior to attempting the install and
start phases for each component.
This problem occurs intermittently, and only tends to occur when creating
larger clusters, with 50 or more nodes.
This problem can cause a variety of failures, but the most common is that a
given configuration property on a single host is not updated as expected by the
Blueprints processor. This can cause service startup failures, such as:
{code}
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: %HOSTGROUP::host_group_1%:8020
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:197)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
    at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:677)
    at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:645)
    at org.apache.hadoop.hdfs.DFSUtil.getAddresses(DFSUtil.java:628)
    at org.apache.hadoop.hdfs.DFSUtil.getHaNnRpcAddresses(DFSUtil.java:727)
    at org.apache.hadoop.hdfs.HAUtil.isHAEnabled(HAUtil.java:77)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:120)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:177)
{code}
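The failure in the trace above comes from a property value that still contains a literal host-group token instead of a host name. As a rough illustration only, a check for such leftover tokens could look like the sketch below. The class and method names are hypothetical and are not part of Ambari; the token pattern is assumed from the single example in the trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Hypothetical helper, not an Ambari class.
final class UnresolvedTokenCheck {
    // Assumed token shape, taken from the trace: %HOSTGROUP::<group name>%
    private static final Pattern HOSTGROUP_TOKEN =
            Pattern.compile("%HOSTGROUP::[^%]+%");

    // True if the value still contains at least one host-group token.
    static boolean hasUnresolvedToken(String value) {
        return value != null && HOSTGROUP_TOKEN.matcher(value).find();
    }

    // Returns the property names (sorted) whose values still contain a token.
    static List<String> unresolvedKeys(Map<String, String> properties) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> e : new TreeMap<>(properties).entrySet()) {
            if (hasUnresolvedToken(e.getValue())) {
                keys.add(e.getKey());
            }
        }
        return keys;
    }
}
```

A check like this only detects the symptom on a single set of properties; it does not address the ordering problem described in this issue.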
The Blueprints processor should wait for all required configuration types that
have been modified to move to the "TOPOLOGY_RESOLVED" state across the cluster
before attempting the first INSTALL task on the cluster.
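The waiting behavior proposed above could be sketched roughly as follows. This is not the actual patch: the class, the polling approach, and the way state is looked up (a Supplier returning a map of configuration type to state) are all assumptions made for illustration. Only the "TOPOLOGY_RESOLVED" name comes from this issue.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical gate, not an Ambari class.
final class ConfigResolutionGate {
    static final String TOPOLOGY_RESOLVED = "TOPOLOGY_RESOLVED";

    // True only if every required configuration type is in the resolved state.
    static boolean allResolved(Set<String> requiredTypes,
                               Map<String, String> stateByType) {
        for (String type : requiredTypes) {
            if (!TOPOLOGY_RESOLVED.equals(stateByType.get(type))) {
                return false;
            }
        }
        return true;
    }

    // Polls the (assumed) state source until all required types are resolved
    // or the timeout elapses. Returns false on timeout or interruption, in
    // which case the caller should not start the first INSTALL task.
    static boolean awaitResolved(Set<String> requiredTypes,
                                 Supplier<Map<String, String>> stateSource,
                                 long timeoutMillis,
                                 long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (allResolved(requiredTypes, stateSource.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

In this sketch the INSTALL phase would be scheduled only after awaitResolved returns true. An event-driven approach would avoid polling, but the polling form keeps the example short.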
I'm working on a fix for this, and will be submitting a patch shortly.
was:
Deployment of a cluster using Blueprints will sometimes fail to properly update
all the configurations on a given cluster prior to attempting the install and
start phases for each component.
This problem occurs intermittently, and only tends to occur when creating
larger cluster sizes (50 or more nodes).
This problem can cause a variety of failures, but the most common is that a
given configuration property on a single host is not updated as expected by the
Blueprints processor. This can cause service startup failures, such as:
{code}
Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: %HOSTGROUP::host_group_1%:8020
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:197)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
    at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:677)
    at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:645)
    at org.apache.hadoop.hdfs.DFSUtil.getAddresses(DFSUtil.java:628)
    at org.apache.hadoop.hdfs.DFSUtil.getHaNnRpcAddresses(DFSUtil.java:727)
    at org.apache.hadoop.hdfs.HAUtil.isHAEnabled(HAUtil.java:77)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:120)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:177)
{code}
The Blueprints processor should wait for all required configuration types that
have been modified to move to the "TOPOLOGY_RESOLVED" state across the cluster
before attempting the first INSTALL task on the cluster.
I'm working on a fix for this, and will be submitting a patch shortly.
> Blueprint deployment results in some configurations not being resolved
> properly
> -------------------------------------------------------------------------------
>
> Key: AMBARI-12532
> URL: https://issues.apache.org/jira/browse/AMBARI-12532
> Project: Ambari
> Issue Type: Bug
> Components: ambari-server
> Affects Versions: 2.1.0
> Reporter: Robert Nettleton
> Assignee: Robert Nettleton
> Priority: Critical
> Fix For: 2.1.1
>
>
> Deployment of a cluster using Blueprints will sometimes fail to properly
> update all the configurations on a given cluster prior to attempting the
> install and start phases for each component.
> This problem occurs intermittently, and only tends to occur when creating
> larger clusters, with 50 or more nodes.
> This problem can cause a variety of failures, but the most common is that a
> given configuration property on a single host is not updated as expected by
> the Blueprints processor. This can cause service startup failures, such as:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: %HOSTGROUP::host_group_1%:8020
>     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:197)
>     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
>     at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
>     at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNameserviceId(DFSUtil.java:677)
>     at org.apache.hadoop.hdfs.DFSUtil.getAddressesForNsIds(DFSUtil.java:645)
>     at org.apache.hadoop.hdfs.DFSUtil.getAddresses(DFSUtil.java:628)
>     at org.apache.hadoop.hdfs.DFSUtil.getHaNnRpcAddresses(DFSUtil.java:727)
>     at org.apache.hadoop.hdfs.HAUtil.isHAEnabled(HAUtil.java:77)
>     at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.create(DFSZKFailoverController.java:120)
>     at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:177)
> {code}
> The Blueprints processor should wait for all required configuration types
> that have been modified to move to the "TOPOLOGY_RESOLVED" state across the
> cluster before attempting the first INSTALL task on the cluster.
> I'm working on a fix for this, and will be submitting a patch shortly.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)