[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355703#comment-15355703
 ] 

Edward Ribeiro commented on ZOOKEEPER-2447:
-------------------------------------------

Hi [[email protected]], my concern is not the feature itself -- it is 
cool -- but the use of {{InetAddress#isReachable()}}, because it seems to have 
caused problems in the past, namely always returning false or true 
regardless of the real reachability, as seen below:

http://stackoverflow.com/questions/4779367/problem-with-isreachable-in-inetaddress-class

Of course, the SO threads that report similar problems are usually old (five to 
ten years old!), so this may no longer be a problem, but we should 
acknowledge this possible limitation. An alternative would be to roll our 
own equivalent of {{InetAddress#isReachable()}}, as I scribbled below. Wdyt?

{code}
    // Requires java.io.IOException, java.net.InetAddress,
    // java.net.InetSocketAddress and java.net.Socket.
    private boolean isReachable(InetAddress address, int port, int timeout) {
        if (timeout < 0) {
            throw new IllegalArgumentException("Timeout cannot be less than zero");
        }

        Socket socket = new Socket();
        try {
            // A TCP connect that succeeds within the timeout counts as reachable.
            socket.connect(new InetSocketAddress(address, port), timeout);
        } catch (IOException e) {
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException e) {
                // Ignore any errors
            }
        }
        return true;
    }
{code}
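
To make the idea a bit more concrete, here is a minimal, self-contained sketch of how such a TCP-based probe might be used to filter a list of candidate servers before the list is shuffled. The class name {{TcpReachability}}, the method {{filterReachable}}, and the choice to fall back to the full list when nothing answers are all assumptions made for illustration; none of this is existing ZooKeeper code.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of ZooKeeper.
class TcpReachability {

    // Returns true if a TCP connection to the address succeeds within timeoutMs.
    // Note: Socket#connect treats a timeout of 0 as "wait indefinitely".
    static boolean isReachable(InetSocketAddress address, int timeoutMs) {
        if (timeoutMs < 0) {
            throw new IllegalArgumentException("Timeout cannot be less than zero");
        }
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, unresolved host, etc.
            return false;
        }
    }

    // Keeps only the candidates that accept a TCP connection.
    static List<InetSocketAddress> filterReachable(List<InetSocketAddress> candidates,
                                                   int timeoutMs) {
        List<InetSocketAddress> reachable = new ArrayList<>();
        for (InetSocketAddress candidate : candidates) {
            if (isReachable(candidate, timeoutMs)) {
                reachable.add(candidate);
            }
        }
        // Assumption: if no host answers, hand back the full list so the
        // caller can keep retrying instead of ending up with no servers.
        return reachable.isEmpty() ? new ArrayList<>(candidates) : reachable;
    }

    public static void main(String[] args) throws IOException {
        // Probe a listener on the loopback interface as a quick demonstration.
        try (ServerSocket server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            InetSocketAddress open =
                    new InetSocketAddress(server.getInetAddress(), server.getLocalPort());
            System.out.println(open + " reachable: " + isReachable(open, 1000));
        }
    }
}
```

One trade-off to keep in mind: the probes run sequentially, so in the worst case the filtering step itself takes (number of unreachable hosts x timeout), which means the timeout would need to be small for this to actually reduce the connection delay.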


> Zookeeper adds  good delay when one of the quorum host is not reachable
> -----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2447
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2447
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.4.6, 3.5.0
>            Reporter: Vishal Khandelwal
>            Assignee: Vishal Khandelwal
>             Fix For: 3.5.3, 3.6.0
>
>         Attachments: ZOOKEEPER-2447.3.5.patch, withfix.txt, withoutFix.txt
>
>
> StaticHostProvider --> resolveAndShuffle method adds all of the addresses which 
> are valid in the quorum to the list, shuffles them, and sends them back to the 
> client connection class. If, after shuffling, the first node happens to be one 
> which is not reachable, ClientCnxn.SendThread.run will keep trying to connect 
> to the failed node until a timeout and then move to a different node. This adds 
> a random delay to the ZooKeeper connection when a host is down. Instead, we 
> could check whether the host is reachable in StaticHostProvider and ignore it 
> if isReachable is false, the same as we do for UnknownHostException.
> This can be tested using the following test code by providing a valid host 
> which is not reachable. For a quick test, comment out 
> Collections.shuffle(tmpList, sourceOfRandomness); in 
> StaticHostProvider.resolveAndShuffle
> {code}
>  @Test
>   public void test() throws Exception {
>     EventsWatcher watcher = new EventsWatcher();
>     QuorumUtil qu = new QuorumUtil(1);
>     qu.startAll();
>     
>     ZooKeeper zk =
>         new ZooKeeper("<hostname>:2181," + qu.getConnString(), 180 * 1000, watcher);
>     
>     watcher.waitForConnected(CONNECTION_TIMEOUT * 5);
>     Assert.assertTrue("connection Established", watcher.isConnected());
>     zk.close();    
>   }
> {code}
> Following fix can be added to StaticHostProvider.resolveAndShuffle
> {code}
>  if (taddr.isReachable(4000)) { // timeout in ms; can be some other value
>      tmpList.add(new InetSocketAddress(taddr, address.getPort()));
>  }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
