Anil Mahajan created AMBARI-14727:
-------------------------------------

             Summary: Cluster create looping on TopologyManager areHostGroupsResolved
                 Key: AMBARI-14727
                 URL: https://issues.apache.org/jira/browse/AMBARI-14727
             Project: Ambari
          Issue Type: Bug
          Components: ambari-server, blueprints
    Affects Versions: 2.1.2
         Environment: Ubuntu 12.04
HDP 2.3
Ambari 2.1.2

            Reporter: Anil Mahajan


I am installing a cluster from a blueprint that has two host groups, 
"server_group" and "agent_group".  When the cluster is created, only the 
server host is installed; the agent host is installed in a later step.

This worked fine until the "agent_group" host group was augmented with a 
"ZOOKEEPER_SERVER" instance (making a total of two zookeeper servers).

With this change, the installation stalls at 0 percent with no errors logged.  
A success message is logged repeatedly, however, which suggests that a 
critical failure is occurring without being logged.

The only similar issue I could find was AMBARI-10811.  Based on that, I 
suspect the root cause is that having two ZOOKEEPER_SERVER components 
activates some HA requirements.

The ambari-server log loops on this line:
INFO [pool-3-thread-1] TopologyManager:598 - 
TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = 
server_group has been fully resolved, as all 1 required hosts are mapped to 1 
physical hosts.


Looking at the source for the TopologyManager main loop, it appears that the 
line "completed = areRequiredHostGroupsResolved(requiredHostGroups)" never 
yields a TRUE result.  However, the only logging from 
"areRequiredHostGroupsResolved" is the line quoted above, which indicates a 
TRUE result.

I think the failure case in areRequiredHostGroupsResolved is being triggered 
without logging.  The failure message is only logged inside an IF condition, 
so nothing is logged when that condition is false:

          if (groupInfo != null) {
            LOG.info("TopologyManager.ConfigureClusterTask areHostGroupsResolved: host group name = {} requires {} hosts to be mapped, but only {} are available.",
              groupInfo.getHostGroupName(), groupInfo.getRequestedHostCount(), groupInfo.getHostNames().size());
          }

There should be logging outside of the condition or in an ELSE segment.
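
As a rough illustration of the suggested change, here is a minimal, self-contained sketch. It is not the actual Ambari code. The class HostGroupResolutionSketch, the GroupInfo stand-in, the map lookup, and the wording of the "no host information" message are all assumptions made for the example, and messages are collected in a list rather than sent to a real logger. The point is only that every branch of the check, including the case where groupInfo is null, leaves a trace in the log.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HostGroupResolutionSketch {

    /** Hypothetical stand-in for the per-group info object; not Ambari's real class. */
    static final class GroupInfo {
        final String hostGroupName;
        final int requestedHostCount;
        final List<String> hostNames;

        GroupInfo(String hostGroupName, int requestedHostCount, List<String> hostNames) {
            this.hostGroupName = hostGroupName;
            this.requestedHostCount = requestedHostCount;
            this.hostNames = hostNames;
        }
    }

    /** Messages are collected in a list so the sketch needs no logging library. */
    static final List<String> LOG = new ArrayList<>();

    /** Returns true only if every required group has enough hosts; logs every outcome. */
    static boolean areRequiredHostGroupsResolved(Collection<String> requiredHostGroups,
                                                 Map<String, GroupInfo> groupInfoByName) {
        boolean allResolved = true;
        for (String name : requiredHostGroups) {
            GroupInfo groupInfo = groupInfoByName.get(name);
            if (groupInfo == null) {
                // The branch that the report suspects is silent today (assumed message text).
                allResolved = false;
                LOG.add(String.format(
                    "host group name = %s has no host information yet, so it cannot be resolved.",
                    name));
            } else if (groupInfo.hostNames.size() < groupInfo.requestedHostCount) {
                allResolved = false;
                LOG.add(String.format(
                    "host group name = %s requires %d hosts to be mapped, but only %d are available.",
                    groupInfo.hostGroupName, groupInfo.requestedHostCount,
                    groupInfo.hostNames.size()));
            } else {
                LOG.add(String.format(
                    "host group name = %s has been fully resolved, as all %d required hosts "
                        + "are mapped to %d physical hosts.",
                    groupInfo.hostGroupName, groupInfo.requestedHostCount,
                    groupInfo.hostNames.size()));
            }
        }
        return allResolved;
    }

    public static void main(String[] args) {
        // Only server_group has a mapped host, mirroring the scenario in this report.
        Map<String, GroupInfo> groups = new HashMap<>();
        groups.put("server_group",
            new GroupInfo("server_group", 1, Arrays.asList("server-host")));
        boolean completed = areRequiredHostGroupsResolved(
            Arrays.asList("server_group", "agent_group"), groups);
        LOG.forEach(System.out::println);
        System.out.println("completed = " + completed);
    }
}
```

With logging like this, a stalled install would show which host group is blocking the loop instead of repeating only the success line for server_group.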
  




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
