[ https://issues.apache.org/jira/browse/HBASE-19220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16245261#comment-16245261 ]
Duo Zhang commented on HBASE-19220:
-----------------------------------

Is it a good idea to change a config in normal code path which only aims to make a UT pass?

Anyway, I think we can commit this first to make our UTs more stable. And open another issue to tune the retry configs.

Thanks.

> Async tests time out talking to zk; 'clusterid came back null'
> --------------------------------------------------------------
>
>                 Key: HBASE-19220
>                 URL: https://issues.apache.org/jira/browse/HBASE-19220
>             Project: HBase
>          Issue Type: Sub-task
>          Components: test
>            Reporter: stack
>            Assignee: stack
>             Fix For: 2.0.0-beta-1
>
>         Attachments: 19220.patch
>
>
> I see this in test runs on a dedicated machine:
> [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 652.514 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAsyncAdminBuilder
> [ERROR] testRpcTimeout[0](org.apache.hadoop.hbase.client.TestAsyncAdminBuilder) Time elapsed: 213.618 s <<< ERROR!
> java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
> at org.apache.hadoop.hbase.client.TestAsyncAdminBuilder.testRpcTimeout(TestAsyncAdminBuilder.java:105)
> Caused by: java.io.IOException: clusterid came back null
> [ERROR] org.apache.hadoop.hbase.client.TestAsyncTableScanMetrics Time elapsed: 0.007 s <<< ERROR!
> java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
> at org.apache.hadoop.hbase.client.TestAsyncTableScanMetrics.setUp(TestAsyncTableScanMetrics.java:97)
> Caused by: java.io.IOException: clusterid came back null
> [ERROR] org.apache.hadoop.hbase.client.TestRawAsyncScanCursor Time elapsed: 0.005 s <<< ERROR!
> java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
> at org.apache.hadoop.hbase.client.TestRawAsyncScanCursor.setUpBeforeClass(TestRawAsyncScanCursor.java:42)
> Caused by: java.io.IOException: clusterid came back null
> [ERROR] org.apache.hadoop.hbase.client.TestAsyncNamespaceAdminApi Time elapsed: 0.005 s <<< ERROR!
> java.util.concurrent.ExecutionException: java.io.IOException: clusterid came back null
> at org.apache.hadoop.hbase.client.TestAsyncNamespaceAdminApi.setUpBeforeClass(TestAsyncNamespaceAdminApi.java:66)
> Caused by: java.io.IOException: clusterid came back null
> If I up the retries, they go away.
> At least on this machine, I notice that zk connections can take a while...
> see HBASE-19102 where we add a wait on the Connection to come up before progressing.
> Suggest that I up the retries. No harm in trying more. It is currently set to 3 retries at a one second interval.
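
For illustration only, here is a minimal sketch of why a larger retry count hides the failure when the ZooKeeper connection is slow to come up. It is not the HBase client code or the attached patch: `RetrySketch`, `callWithRetries`, and the simulated lookup are hypothetical names, and the fixed-interval policy is an assumption based on the "3 retries at a one second interval" description above.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of HBase: retries a lookup at a fixed interval.
final class RetrySketch {

    // Calls op once, then up to maxRetries more times, sleeping intervalMs
    // between attempts. A null result is treated as "not ready yet".
    static <T> T callWithRetries(Callable<T> op, int maxRetries, long intervalMs)
            throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                T result = op.call();
                if (result != null) {
                    return result;
                }
                // Mirrors the error text seen in the test output above.
                last = new IOException("clusterid came back null");
            } catch (Exception e) {
                last = e;
            }
            if (attempt < maxRetries) {
                Thread.sleep(intervalMs);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated slow ZooKeeper: the cluster id is only readable on the 5th call.
        AtomicInteger calls = new AtomicInteger();
        Callable<String> slowLookup =
                () -> calls.incrementAndGet() < 5 ? null : "cluster-id-123";

        try {
            // 3 retries means 4 attempts in total, which is not enough here.
            RetrySketch.<String>callWithRetries(slowLookup, 3, 10L);
        } catch (IOException e) {
            System.out.println("3 retries: " + e.getMessage());
        }

        calls.set(0);
        // With more retries the same lookup succeeds.
        String id = RetrySketch.<String>callWithRetries(slowLookup, 10, 10L);
        System.out.println("10 retries: " + id);
    }
}
```

The demo uses a 10 ms interval to stay fast; with the one-second interval described in the issue, three retries allow only about three seconds for the connection to become usable.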