Andrew Onischuk created AMBARI-25604:
----------------------------------------
Summary: During blueprint deploy tasks sometimes fail due to KeyError on large clusters
Key: AMBARI-25604
URL: https://issues.apache.org/jira/browse/AMBARI-25604
Project: Ambari
Issue Type: Bug
Reporter: Andrew Onischuk
Assignee: Andrew Onischuk
Fix For: 2.7.6

During blueprint deploy we don't rely on the topology cache since [https://issues.apache.org/jira/browse/AMBARI-23660]

So the topology is sent with the command. BUT the problem occurs when we still try to generate it on the agent and fail.

{code:java}
ERROR 2020-12-10 06:30:09,350 CustomServiceOrchestrator.py:459 - Caught an exception while executing custom service command: : 10; 10
Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/ambari_agent/CustomServiceOrchestrator.py", line 324, in runCommand
    command = self.generate_command(command_header)
  File "/usr/lib/ambari-agent/lib/ambari_agent/CustomServiceOrchestrator.py", line 507, in generate_command
    command_dict = self.configuration_builder.get_configuration(cluster_id, service_name, component_name, required_config_timestamp)
  File "/usr/lib/ambari-agent/lib/ambari_agent/ConfigurationBuilder.py", line 43, in get_configuration
    'clusterHostInfo': self.topology_cache.get_cluster_host_info(cluster_id),
  File "/usr/lib/ambari-agent/lib/ambari_agent/Utils.py", line 230, in newFunction
    return f(*args, **kw)
  File "/usr/lib/ambari-agent/lib/ambari_agent/ClusterTopologyCache.py", line 112, in get_cluster_host_info
    hostnames = [self.hosts_to_id[cluster_id][host_id].hostName for host_id in component_dict.hostIds]
KeyError: 10
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
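For illustration only, the failure mode in the traceback above can be sketched in a few lines of standalone Python. This is not the agent code and not the actual fix: `resolve_hostnames`, `cluster_host_info_for`, and the sample host names are hypothetical, and it is an assumption that the topology shipped with the command sits under a `clusterHostInfo` key. The idea is that the cache-based lookup raises `KeyError` when the cache is incomplete, so it should be skipped when the command already carries the topology.

```python
def resolve_hostnames(hosts_by_id, host_ids):
    """Hypothetical stand-in for the cache lookup in the traceback.

    Raises KeyError if any host id is missing from the cache.
    """
    return [hosts_by_id[host_id]["hostName"] for host_id in host_ids]


def cluster_host_info_for(command, build_from_cache):
    """Prefer the topology sent with the command (assumed key: clusterHostInfo).

    The cache-based builder is only called when the command carries no topology.
    """
    if "clusterHostInfo" in command:
        return command["clusterHostInfo"]
    return build_from_cache()


# A cache that has not received host id 10 yet (sample data, not from the ticket).
hosts_by_id = {1: {"hostName": "host1.example.com"}}

# A command that already ships its own topology.
command = {
    "clusterHostInfo": {
        "datanode_hosts": ["host1.example.com", "host10.example.com"],
    }
}

# The cache builder is never invoked here, so the incomplete cache does no harm.
info = cluster_host_info_for(
    command,
    lambda: resolve_hostnames(hosts_by_id, [1, 10]),
)
```

Calling `resolve_hostnames(hosts_by_id, [1, 10])` directly against the same incomplete cache raises `KeyError: 10`, which matches the shape of the error reported above.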