[jira] [Commented] (HBASE-23764) Flaky tests due to ZK client name resolution delays

Hudson (Jira) Sun, 02 Feb 2020 16:40:25 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-23764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028607#comment-17028607
 ]


Hudson commented on HBASE-23764:
--------------------------------

Results for branch branch-2
        [build #2447 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447/]: 
(x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Flaky tests due to ZK client name resolution delays
> ---------------------------------------------------
>
>                 Key: HBASE-23764
>                 URL: https://issues.apache.org/jira/browse/HBASE-23764
>             Project: HBase
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0
>            Reporter: Bharath Vissapragada
>            Assignee: Bharath Vissapragada
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>         Attachments: sample-jstacks.zip
>
>
> [~ndimiduk] and I ran into this issue (separately) and noticed performance 
> problems with name resolution in the ZooKeeper client. Since we use ZK 
> heavily in the unit tests, this often manifests as the following issues:
> 1. Test timeouts while starting the mini cluster (Master failed to start....)
> 2. InterruptedException (because the tests time out)
> 3. Flaky tests because a subset of the cluster fails to start for whatever 
> reason (replication tests especially, because they spawn multiple clusters).
> 4. ConnectionLoss to znode /hbase/xyzz.. JVM pause?
> I have a strong feeling that this is a possible cause of many of our flaky 
> tests in Jenkins. Luckily, it looks like the following workaround, switching 
> from a hostname to an IP address, makes it much quicker. There are some 
> related discussions in the ZK community (ZOOKEEPER-1666 and related jiras).
> {code:java}
> --- a/hbase-common/src/main/resources/hbase-default.xml
> +++ b/hbase-common/src/main/resources/hbase-default.xml
> @@ -72,7 +72,7 @@ possible configurations would overwhelm and obscure the important.
>    </property>
>    <property>
>      <name>hbase.zookeeper.quorum</name>
> -    <value>localhost</value>
> +    <value>127.0.0.1</value>
>      <description>Comma separated list of servers in the ZooKeeper ensemble
>      (This config. should have been named hbase.zookeeper.ensemble).
>      For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
> {code}
> Until we identify the actual root cause and upgrade the dependency (if 
> needed), we should consider making this hostname-to-IP switch for more stable 
> builds.
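
To check whether hostname resolution is slow on a given build machine, a rough timing sketch like the one below may help. It is not taken from the issue or from the ZooKeeper client: the class name ResolutionTiming and the method averageResolveNanos are hypothetical, and it only times the JDK's InetAddress.getByName, on the assumption that slow lookups there would also affect a client that resolves "localhost". For an IP literal such as 127.0.0.1, getByName only validates the address format and performs no lookup.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper: times JDK name resolution for a host string. */
public class ResolutionTiming {

    /**
     * Returns the average wall-clock time, in nanoseconds, of
     * InetAddress.getByName(host) over the given number of attempts.
     * Note: the JVM caches successful lookups, so the first attempt is
     * usually the slowest and dominates the average for small counts.
     */
    static long averageResolveNanos(String host, int attempts) {
        if (attempts <= 0) {
            throw new IllegalArgumentException("attempts must be positive");
        }
        long total = 0;
        for (int i = 0; i < attempts; i++) {
            long start = System.nanoTime();
            try {
                InetAddress.getByName(host);
            } catch (UnknownHostException e) {
                // UnknownHostException is an IOException; rethrow unchecked.
                throw new UncheckedIOException(e);
            }
            total += System.nanoTime() - start;
        }
        return total / attempts;
    }

    public static void main(String[] args) {
        // Compare the hostname from the old default with the IP literal
        // from the proposed workaround.
        for (String host : new String[] {"localhost", "127.0.0.1"}) {
            long micros = averageResolveNanos(host, 5) / 1_000;
            System.out.printf("%-10s avg %d us%n", host, micros);
        }
    }
}
```

A large gap between the two averages on a Jenkins worker would be consistent with the delays described above, though it would not by itself confirm the root cause.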



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
