[
https://issues.apache.org/jira/browse/HBASE-23764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17028607#comment-17028607
]
Hudson commented on HBASE-23764:
--------------------------------
Results for branch branch-2
[build #2447 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447/]:
(x) *{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447//JDK8_Nightly_Build_Report_(Hadoop2)/]
(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2447//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> Flaky tests due to ZK client name resolution delays
> ---------------------------------------------------
>
> Key: HBASE-23764
> URL: https://issues.apache.org/jira/browse/HBASE-23764
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 3.0.0
> Reporter: Bharath Vissapragada
> Assignee: Bharath Vissapragada
> Priority: Major
> Fix For: 3.0.0, 2.3.0
>
> Attachments: sample-jstacks.zip
>
>
> [~ndimiduk] and I ran into this issue (separately) and we noticed that there
> are some performance issues with name resolution in the ZooKeeper client.
> Since we use ZK heavily in the unit tests, this often manifests as the
> following issues:
> 1. Test timeouts when starting the mini cluster (Master failed to start....)
> 2. InterruptedException (because the tests time out)
> 3. Flaky tests because a subset of the cluster fails to start for whatever
> reason (replication tests especially because they spawn multiple clusters).
> 4. ConnectionLoss to znode /hbase/xyzz.. JVM pause?
> I have a strong feeling that this is a possible cause for many of our flaky
> tests in Jenkins. Luckily, it looks like the following workaround, switching
> to an IP address instead of a hostname, makes it much quicker. There are some
> related discussions in the ZK community (ZOOKEEPER-1666 and related jiras).
> {code:java}
> --- a/hbase-common/src/main/resources/hbase-default.xml
> +++ b/hbase-common/src/main/resources/hbase-default.xml
> @@ -72,7 +72,7 @@ possible configurations would overwhelm and obscure the important.
> </property>
> <property>
> <name>hbase.zookeeper.quorum</name>
> - <value>localhost</value>
> + <value>127.0.0.1</value>
> <description>Comma separated list of servers in the ZooKeeper ensemble
> (This config. should have been named hbase.zookeeper.ensemble).
> For example, "host1.mydomain.com,host2.mydomain.com,host3.mydomain.com".
> {code}
> Until we figure out the actual root cause and upgrade the dependency (if
> needed), we should consider making this hostname-to-IP switch for more stable
> builds.
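
To check whether name resolution is actually the slow step on a given build host, a rough timing sketch like the one below may help. It is not part of the patch: the class and method names (`ResolveTiming`, `timeResolveNanos`) are made up for illustration, and it only times `InetAddress.getByName`, which may not match the exact code path the ZooKeeper client takes.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not HBase or ZooKeeper code.
class ResolveTiming {

    /** Resolves the given host once and returns the elapsed time in nanoseconds. */
    static long timeResolveNanos(String host) {
        long start = System.nanoTime();
        try {
            // For an IP literal such as "127.0.0.1" this only validates the
            // address format; for a hostname it goes through the resolver.
            InetAddress.getByName(host);
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // The JVM caches successful lookups, so only the first call per host
        // reflects the real resolution cost.
        for (String host : new String[] {"localhost", "127.0.0.1"}) {
            long nanos = timeResolveNanos(host);
            System.out.printf("%-10s resolved in %d us%n", host, nanos / 1_000);
        }
    }
}
```

If `localhost` turns out to be noticeably slower than `127.0.0.1` on the affected hosts, that would be consistent with the workaround above; if both are fast, the delay is probably somewhere else.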