Michael Ho created IMPALA-6907:
----------------------------------

             Summary: Update
                 Key: IMPALA-6907
                 URL: https://issues.apache.org/jira/browse/IMPALA-6907
             Project: IMPALA
          Issue Type: Bug
          Components: Distributed Exec
    Affects Versions: Impala 2.11.0, Impala 2.10.0, Impala 2.9.0, Impala 3.0, Impala 2.12.0
            Reporter: Michael Ho
            Assignee: Michael Ho


Currently, {{ImpalaServer::MembershipCallback()}} will remove stale connections 
to hosts which were removed from the cluster membership.

{noformat}
      while (loc_entry != query_locations_.end()) {
        if (current_membership.find(loc_entry->first) == current_membership.end()) {
          unordered_set<TUniqueId>::const_iterator query_id = loc_entry->second.begin();
          // Add failed backend locations to all queries that ran on that backend.
          for(; query_id != loc_entry->second.end(); ++query_id) {
            vector<TNetworkAddress>& failed_hosts = queries_to_cancel[*query_id];
            failed_hosts.push_back(loc_entry->first);
          }
          exec_env_->impalad_client_cache()->CloseConnections(loc_entry->first); <<<-----
{noformat}

However, it relies on checking against {{query_locations_}}, which is 
populated only when the Impalad node acts as a coordinator. With support for 
executor-only Impalad nodes, {{query_locations_}} is always empty on such 
nodes, so {{ImpalaServer::MembershipCallback()}} never removes stale 
connections to hosts that were removed from the cluster. Stale connections may 
therefore stay in the connection cache for an extended period of time, leading 
to query failures when they are reused after the removed hosts rejoin the 
cluster.
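
One possible direction, shown only as a minimal sketch and not as the actual 
fix, is to decide which connections to close by comparing membership snapshots 
instead of consulting {{query_locations_}}. All names below ({{Address}}, 
{{ClientCache}}, {{CloseStaleConnections}}, {{previous_membership}}) are 
hypothetical stand-ins for illustration and do not exist in Impala; only the 
{{CloseConnections()}} call mirrors the snippet above.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for TNetworkAddress, written as "host:port".
using Address = std::string;

// Hypothetical stand-in for the impalad client cache. It only records the
// hosts whose cached connections were asked to be closed.
class ClientCache {
 public:
  void CloseConnections(const Address& addr) { closed_.push_back(addr); }
  const std::vector<Address>& closed() const { return closed_; }

 private:
  std::vector<Address> closed_;
};

// Closes cached connections to every backend that appears in the previous
// membership snapshot but not in the current one. The decision does not depend
// on any per-query bookkeeping, so it behaves the same on coordinators and on
// executor-only nodes. Returns the hosts that were treated as removed.
std::vector<Address> CloseStaleConnections(
    const std::set<Address>& previous_membership,
    const std::set<Address>& current_membership,
    ClientCache* cache) {
  std::vector<Address> removed;
  for (const Address& addr : previous_membership) {
    if (current_membership.find(addr) == current_membership.end()) {
      cache->CloseConnections(addr);
      removed.push_back(addr);
    }
  }
  return removed;
}
```

In this sketch the caller would keep the last membership snapshot it saw and 
pass it in on the next callback. Whether the real callback has such a snapshot 
available is an assumption here.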



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
