Anil Mahajan created AMBARI-14727:
-------------------------------------
Summary: Cluster create looping on TopologyManager areHostGroupsResolved
Key: AMBARI-14727
URL: https://issues.apache.org/jira/browse/AMBARI-14727
Project: Ambari
Issue Type: Bug
Components: ambari-server, blueprints
Affects Versions: 2.1.2
Environment: Ubuntu 12.04
HDP 2.3
Ambari 2.1.2
Reporter: Anil Mahajan
I am installing a cluster from a blueprint with two host groups,
"server_group" and "agent_group". When the cluster is created, the server is
the only host installed; the agent host is installed in a later step.
This worked fine until the "agent_group" host group was augmented with a
"ZOOKEEPER_SERVER" instance (making a total of two zookeeper servers).
With this change, the installation stalls at 0 percent with no errors logged.
A success message is logged repeatedly, however, which suggests that a
critical failure is occurring without being logged.
The only similar issue I could find was AMBARI-10811. Based on that, I suspect
the root cause is that having two ZOOKEEPER_SERVER components activates some
HA requirements.
The ambari-server log loops on this line:
    INFO [pool-3-thread-1] TopologyManager:598 - TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = server_group has been fully resolved, as all 1 required hosts are mapped to 1 physical hosts.
Looking at the source for the TopologyManager main loop, it appears that the
line "completed = areRequiredHostGroupsResolved(requiredHostGroups)" never
yields a TRUE result. However, the only log output from
"areRequiredHostGroupsResolved" is the line quoted above, which indicates a
TRUE result.
I think the failure case in areRequiredHostGroupsResolved is being triggered
without logging. The failure message is only logged inside an IF condition,
so nothing is logged when that condition is false:
    if (groupInfo != null) {
      LOG.info("TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = {} requires {} hosts to be mapped, but only {} are available.",
          groupInfo.getHostGroupName(), groupInfo.getRequestedHostCount(),
          groupInfo.getHostNames().size());
    }
There should be logging outside of the condition or in an ELSE segment.
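As a rough sketch of the kind of change I mean (this is not the actual Ambari code; the GroupInfo class, the LOG_LINES list and the log helper below are hypothetical stand-ins, and I am only assuming that a null groupInfo is one way the check can fail), every path that produces a negative result logs its reason before returning:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; these types are hypothetical, not Ambari classes.
class HostGroupResolutionSketch {

    // Hypothetical stand-in for the per-group info object used in the snippet above.
    static final class GroupInfo {
        final String hostGroupName;
        final int requestedHostCount;
        final List<String> hostNames;

        GroupInfo(String hostGroupName, int requestedHostCount, List<String> hostNames) {
            this.hostGroupName = hostGroupName;
            this.requestedHostCount = requestedHostCount;
            this.hostNames = hostNames;
        }
    }

    // Collected log lines; real code would call the class logger instead.
    static final List<String> LOG_LINES = new ArrayList<>();

    static void log(String message) {
        LOG_LINES.add(message);
        System.out.println(message);
    }

    // Returns true only when every required group has enough mapped hosts.
    // Each branch logs, so a false result can always be traced to a group.
    static boolean areRequiredHostGroupsResolved(List<String> requiredGroups,
                                                 Map<String, GroupInfo> knownGroups) {
        boolean allResolved = true;
        for (String groupName : requiredGroups) {
            GroupInfo groupInfo = knownGroups.get(groupName);
            if (groupInfo == null) {
                // The missing ELSE case: no host information at all for this group.
                log("host group name = " + groupName
                    + " has no host information available yet.");
                allResolved = false;
            } else if (groupInfo.hostNames.size() < groupInfo.requestedHostCount) {
                log("host group name = " + groupInfo.hostGroupName
                    + " requires " + groupInfo.requestedHostCount
                    + " hosts to be mapped, but only "
                    + groupInfo.hostNames.size() + " are available.");
                allResolved = false;
            } else {
                log("host group name = " + groupInfo.hostGroupName
                    + " has been fully resolved.");
            }
        }
        return allResolved;
    }

    public static void main(String[] args) {
        Map<String, GroupInfo> known = new HashMap<>();
        known.put("server_group",
            new GroupInfo("server_group", 1, Arrays.asList("host1")));
        // "agent_group" is required but has no entry, mimicking an agent
        // host that only registers in a later step.
        boolean completed = areRequiredHostGroupsResolved(
            Arrays.asList("server_group", "agent_group"), known);
        System.out.println("completed = " + completed);
    }
}
```

With logging like this, a loop that keeps seeing completed = false would at least name the host group that is holding it up, instead of repeating only the success line for server_group.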