[
https://issues.apache.org/jira/browse/HBASE-10670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921058#comment-13921058
]
Ted Yu commented on HBASE-10670:
--------------------------------
Each doFsck() call in TestHBaseFsck simulates a new invocation of hbck.
In production, each invocation would establish a new connection to the cluster.
My patch reflects this scenario.
Reusing the connection left over from a previous doFsck() call makes the test
sensitive to variations in timeout parameters, and thus prone to failure.
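
To illustrate the reasoning, here is a minimal, self-contained sketch. It is a toy model and not HBase code: ManualClock, ToyConnection and ToyConnectionCache are hypothetical names, and the idle-timeout rule is an assumption chosen only to show why a cached connection can go stale once a test advances an injected clock (as the edge.incrementTime() call quoted below does), while a newly created connection is unaffected.

```java
import java.util.HashMap;
import java.util.Map;

public class ConnectionReuseSketch {

    /** Hypothetical manually advanced clock, standing in for a test's injected time source. */
    static final class ManualClock {
        private long nowMs = 0;
        long now() { return nowMs; }
        void incrementTime(long deltaMs) { nowMs += deltaMs; }
    }

    /** Hypothetical connection that becomes unusable once idle longer than idleTimeoutMs. */
    static final class ToyConnection {
        private final ManualClock clock;
        private final long idleTimeoutMs;
        private long lastUsedMs;

        ToyConnection(ManualClock clock, long idleTimeoutMs) {
            this.clock = clock;
            this.idleTimeoutMs = idleTimeoutMs;
            this.lastUsedMs = clock.now();
        }

        /** Returns true if the call succeeds, false if the connection had already idled out. */
        boolean call() {
            if (clock.now() - lastUsedMs > idleTimeoutMs) {
                return false;
            }
            lastUsedMs = clock.now();
            return true;
        }
    }

    /** Hypothetical cache that hands back the same connection for the same key. */
    static final class ToyConnectionCache {
        private final Map<String, ToyConnection> cache = new HashMap<>();
        private final ManualClock clock;
        private final long idleTimeoutMs;

        ToyConnectionCache(ManualClock clock, long idleTimeoutMs) {
            this.clock = clock;
            this.idleTimeoutMs = idleTimeoutMs;
        }

        /** Reuses whatever connection an earlier caller left behind. */
        ToyConnection getCached(String key) {
            return cache.computeIfAbsent(key, k -> new ToyConnection(clock, idleTimeoutMs));
        }

        /** Always builds a new connection, as a separate tool invocation would. */
        ToyConnection createNew() {
            return new ToyConnection(clock, idleTimeoutMs);
        }
    }

    public static void main(String[] args) {
        ManualClock clock = new ManualClock();
        ToyConnectionCache cache = new ToyConnectionCache(clock, 1_000);

        // First "invocation": the cached connection was just created, so the call succeeds.
        System.out.println(cache.getCached("cluster").call()); // true

        // The test jumps the clock far ahead, e.g. to let a lock expire.
        clock.incrementTime(600_000);

        // Second "invocation" reusing the cached connection: it idled out in the meantime.
        System.out.println(cache.getCached("cluster").call()); // false

        // Second "invocation" with a new connection: unaffected by the clock jump.
        System.out.println(cache.createNew().call()); // true
    }
}
```

In this toy model the outcome of the second call depends on how the clock jump compares to the timeout, which is the kind of sensitivity a new connection per invocation avoids.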
> HBaseFsck#connect() should use new connection
> ---------------------------------------------
>
> Key: HBASE-10670
> URL: https://issues.apache.org/jira/browse/HBASE-10670
> Project: HBase
> Issue Type: Task
> Reporter: Ted Yu
> Assignee: Ted Yu
> Attachments: 10670-TestHBaseFsck.testCheckTableLocks.html,
> 10670-v1.txt
>
>
> When investigating the TestHBaseFsck#testCheckTableLocks failure, I noticed
> the following:
> {code}
> 2014-03-03 04:26:04,981 WARN [Thread-1180] client.ConnectionManager$HConnectionImplementation(1998): Checking master connection
> com.google.protobuf.ServiceException: java.io.IOException: Call to c59-s15.cs1cloud.internal/172.18.145.15:52272 failed on local exception: org.apache.hadoop.hbase.ipc.RpcClient$CallTimeoutException: Call id=1282, waitTime=1, rpcTimeout=0
>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1699)
>   at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1740)
>   at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:40216)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceState.isMasterRunning(ConnectionManager.java:1358)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.isKeepAliveMasterConnectedAndRunning(ConnectionManager.java:1991)
>   at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1710)
>   at org.apache.hadoop.hbase.client.HBaseAdmin$MasterCallable.prepare(HBaseAdmin.java:3199)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:120)
>   at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:97)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3226)
>   at org.apache.hadoop.hbase.client.HBaseAdmin.getClusterStatus(HBaseAdmin.java:2158)
>   at org.apache.hadoop.hbase.util.HBaseFsck.connect(HBaseFsck.java:308)
>   at org.apache.hadoop.hbase.util.hbck.HbckTestingUtil.doFsck(HbckTestingUtil.java:52)
>   at org.apache.hadoop.hbase.util.hbck.HbckTestingUtil.doFsck(HbckTestingUtil.java:43)
>   at org.apache.hadoop.hbase.util.hbck.HbckTestingUtil.doFsck(HbckTestingUtil.java:38)
>   at org.apache.hadoop.hbase.util.TestHBaseFsck.testCheckTableLocks(TestHBaseFsck.java:2100)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>   at java.lang.reflect.Method.invoke(Method.java:597)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> Caused by: java.io.IOException: Call to c59-s15.cs1cloud.internal/172.18.145.15:52272 failed on local exception: org.apache.hadoop.hbase.ipc.RpcClient$CallTimeoutException: Call id=1282, waitTime=1, rpcTimeout=0
>   at org.apache.hadoop.hbase.ipc.RpcClient.wrapException(RpcClient.java:1516)
>   at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1486)
>   at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1684)
>   ... 24 more
> Caused by: org.apache.hadoop.hbase.ipc.RpcClient$CallTimeoutException: Call id=1282, waitTime=1, rpcTimeout=0
>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection.cleanupCalls(RpcClient.java:1214)
>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection.cleanupCalls(RpcClient.java:1205)
>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection.close(RpcClient.java:1006)
>   at org.apache.hadoop.hbase.ipc.RpcClient$Connection.run(RpcClient.java:749)
> {code}
> This ctor was used in HBaseFsck#connect():
> {code}
> public HBaseAdmin(Configuration c)
>     throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
>   // Will not leak connections, as the new implementation of the constructor
>   // does not throw exceptions anymore.
>   this(ConnectionManager.getConnectionInternal(new Configuration(c)));
> {code}
> The connection retrieved this way would already have timed out because of
> the edge.incrementTime() call:
> {code}
> // let table lock expire
> edge.incrementTime(conf.getLong(TableLockManager.TABLE_LOCK_EXPIRE_TIMEOUT,
>     TableLockManager.DEFAULT_TABLE_LOCK_EXPIRE_TIMEOUT_MS));
> {code}
> A new connection should be used in HBaseFsck#connect().