[ https://issues.apache.org/jira/browse/DRILL-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15736860#comment-15736860 ]

ASF GitHub Bot commented on DRILL-5098:
---------------------------------------

Github user sohami commented on a diff in the pull request:

    https://github.com/apache/drill/pull/679#discussion_r91821466
  
    --- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/client/ConnectTriesPropertyTestClusterBits.java ---
    @@ -0,0 +1,244 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + * <p>
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * <p>
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.drill.exec.client;
    +
    +import java.util.ArrayList;
    +import java.util.List;
    +import java.util.Properties;
    +import java.util.concurrent.ExecutionException;
    +
    +import org.apache.drill.common.config.DrillConfig;
    +import org.apache.drill.exec.ZookeeperHelper;
    +import org.apache.drill.exec.coord.ClusterCoordinator;
    +import org.apache.drill.exec.exception.DrillbitStartupException;
    +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
    +import org.apache.drill.exec.rpc.InvalidConnectionInfoException;
    +import org.apache.drill.exec.rpc.RpcException;
    +import org.apache.drill.exec.server.Drillbit;
    +
    +import org.apache.drill.exec.server.RemoteServiceSet;
    +
    +import org.junit.AfterClass;
    +import org.junit.BeforeClass;
    +import org.junit.Test;
    +
    +import static junit.framework.TestCase.assertTrue;
    +import static junit.framework.TestCase.fail;
    +
    +public class ConnectTriesPropertyTestClusterBits {
    +
    +  public static StringBuilder bitInfo;
    +  public static final String fakeBitsInfo = "127.0.0.1:5000,127.0.0.1:5001";
    +  public static List<Drillbit> drillbits;
    +  public static final int drillBitCount = 1;
    +  public static ZookeeperHelper zkHelper;
    +  public static RemoteServiceSet remoteServiceSet;
    +  public static DrillConfig drillConfig;
    +
    +  @BeforeClass
    +  public static void testSetUp() throws Exception {
    +    remoteServiceSet = RemoteServiceSet.getLocalServiceSet();
    +    zkHelper = new ZookeeperHelper();
    +    zkHelper.startZookeeper(1);
    +
    +    // Creating Drillbits
    +    drillConfig = zkHelper.getConfig();
    +    try {
    +      int drillBitStarted = 0;
    +      drillbits = new ArrayList<>();
    +      while(drillBitStarted < drillBitCount){
    +        drillbits.add(Drillbit.start(drillConfig, remoteServiceSet));
    +        ++drillBitStarted;
    +      }
    +    } catch (DrillbitStartupException e) {
    +      throw new RuntimeException("Failed to start drillbits.", e);
    +    }
    +    bitInfo = new StringBuilder();
    +
    +    for (int i = 0; i < drillBitCount; ++i) {
    +      final DrillbitEndpoint currentEndPoint = drillbits.get(i).getContext().getEndpoint();
    +      final String currentBitIp = currentEndPoint.getAddress();
    +      final int currentBitPort = currentEndPoint.getUserPort();
    +      bitInfo.append(",");
    +      bitInfo.append(currentBitIp);
    +      bitInfo.append(":");
    +      bitInfo.append(currentBitPort);
    +    }
    +  }
    +
    +  @AfterClass
    +  public static void testCleanUp(){
    +    for(int i=0; i < drillBitCount; ++i){
    +      drillbits.get(i).close();
    --- End diff --
    
    Fixed


> Improving fault tolerance for connection between client and foreman node.
> -------------------------------------------------------------------------
>
>                 Key: DRILL-5098
>                 URL: https://issues.apache.org/jira/browse/DRILL-5098
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Client - JDBC
>            Reporter: Sorabh Hamirwasia
>            Assignee: Sorabh Hamirwasia
>              Labels: doc-impacting, ready-to-commit
>             Fix For: 1.10
>
>
> With DRILL-5015 we added support for specifying multiple Drillbits in the 
> connection string and randomly choosing one of them. Over time, some of the 
> Drillbits specified in the connection string may die, and the client can fail 
> to connect to the Foreman node if the random selection picks a dead Drillbit.
> Even if ZooKeeper is used to select a random Drillbit from the registered 
> ones, there is a small window in which the client selects a Drillbit and that 
> Drillbit then goes down. The client will fail to connect to this Drillbit and 
> error out. 
> If we instead try multiple Drillbits (with a tries count configurable through 
> the connection string), the probability of hitting this error window is 
> reduced in both cases, improving fault tolerance. During further 
> investigation it was also found that an authentication failure is thrown as 
> a generic RpcException. We need to improve that as well and capture this case 
> explicitly, since on an authentication failure we don't want to try multiple 
> Drillbits.
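
The retry behaviour described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the class ConnectTriesSketch, the Connector interface, both exception types, and the `tries` parameter name are hypothetical stand-ins, and capping the attempts at the number of listed endpoints is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; none of these names come from the Drill code base.
final class ConnectTriesSketch {

  /** Stand-in for a generic connection failure (for example, a dead Drillbit). */
  static class ConnectFailedException extends Exception {
    ConnectFailedException(String message) { super(message); }
  }

  /** Stand-in for an authentication failure, which must not be retried. */
  static class AuthFailedException extends Exception {
    AuthFailedException(String message) { super(message); }
  }

  /** Stand-in for whatever actually opens the connection to one endpoint. */
  interface Connector {
    void connect(String endpoint) throws ConnectFailedException, AuthFailedException;
  }

  /**
   * Tries up to `tries` endpoints in random order and returns the first one
   * that accepts the connection. An AuthFailedException is not caught, so it
   * propagates immediately and no further endpoint is attempted.
   */
  static String connectWithTries(List<String> endpoints, int tries, Connector connector)
      throws ConnectFailedException, AuthFailedException {
    List<String> candidates = new ArrayList<>(endpoints);
    Collections.shuffle(candidates); // random selection, as in the issue text
    // Assumption for this sketch: each endpoint is tried at most once.
    int attempts = Math.min(tries, candidates.size());
    ConnectFailedException lastFailure = null;
    for (int i = 0; i < attempts; i++) {
      String endpoint = candidates.get(i);
      try {
        connector.connect(endpoint);
        return endpoint;
      } catch (ConnectFailedException e) {
        lastFailure = e; // remember the failure and move on to the next endpoint
      }
    }
    if (lastFailure != null) {
      throw lastFailure;
    }
    throw new ConnectFailedException("No endpoints were tried");
  }
}
```

Keeping the authentication failure as a separate exception type is what lets the loop stop early: retrying the same credentials against another Drillbit would presumably fail in the same way.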



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
