[ 
https://issues.apache.org/jira/browse/HBASE-7989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596393#comment-13596393
 ] 

Jean-Daniel Cryans commented on HBASE-7989:
-------------------------------------------

I think this is something we saw yesterday.

First we saw tons of those a minute after the server died:

{noformat}
2013-03-07 01:27:57,065 WARN 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
Failed all from region=someregion, hostname=sv4r20s13, port=10304
java.util.concurrent.ExecutionException: java.net.SocketTimeoutException: Call 
to sv4r20s13/10.4.20.13:10304 failed on socket timeout exception: 
java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel 
to be ready for read. ch : java.nio.channels.SocketChannel[connected 
local=/10.4.17.37:46591 remote=sv4r20s13/10.4.20.13:10304]
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1525)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1377)
        at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:702)
        at 
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.parallelGet(ThriftServerRunner.java:1410)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:65)
        at $Proxy5.parallelGet(Unknown Source)
        at 
org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$parallelGet.getResult(Hbase.java:4930)
        at 
org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$parallelGet.getResult(Hbase.java:4918)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
        at 
org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:287)
        at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:62)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketTimeoutException: Call to sv4r20s13/10.4.20.13:10304 
failed on socket timeout exception: java.net.SocketTimeoutException: 60000 
millis timeout while waiting for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=/10.4.17.37:46591 
remote=sv4r20s13/10.4.20.13:10304]
        at 
org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:1052)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1025)
        at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
        at $Proxy6.multi(Unknown Source)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1354)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1352)
        at 
org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1361)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1349)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        ... 3 more
Caused by: java.net.SocketTimeoutException: 60000 millis timeout while waiting 
for channel to be ready for read. ch : 
java.nio.channels.SocketChannel[connected local=/10.4.17.37:46591 
remote=sv4r20s13/10.4.20.13:10304]
        at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:399)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:672)
        at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:606)
{noformat}

Followed by regular appearances of that 20-second timeout:

{noformat}
2013-03-07 01:32:17,205 WARN 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: 
Failed all from region=someregion, hostname=sv4r20s13, port=10304
java.util.concurrent.ExecutionException: java.net.SocketTimeoutException: 20000 
millis timeout while waiting for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending 
remote=sv4r20s13/10.4.20.13:10304]
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1525)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1377)
        at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:702)
        at 
org.apache.hadoop.hbase.thrift.ThriftServerRunner$HBaseHandler.parallelGet(ThriftServerRunner.java:1410)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.hbase.thrift.HbaseHandlerMetricsProxy.invoke(HbaseHandlerMetricsProxy.java:65)
        at $Proxy5.parallelGet(Unknown Source)
        at 
org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$parallelGet.getResult(Hbase.java:4930)
        at 
org.apache.hadoop.hbase.thrift.generated.Hbase$Processor$parallelGet.getResult(Hbase.java:4918)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
        at 
org.apache.hadoop.hbase.thrift.TBoundedThreadPoolServer$ClientConnnection.run(TBoundedThreadPoolServer.java:287)
        at org.apache.hadoop.hbase.thrift.CallQueue$Call.run(CallQueue.java:62)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketTimeoutException: 20000 millis timeout while waiting 
for channel to be ready for connect. ch : 
java.nio.channels.SocketChannel[connection-pending 
remote=sv4r20s13/10.4.20.13:10304]
        at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:213)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:519)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:484)
        at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupConnection(HBaseClient.java:416)
        at 
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.setupIOstreams(HBaseClient.java:462)
        at 
org.apache.hadoop.hbase.ipc.HBaseClient.getConnection(HBaseClient.java:1150)
        at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:1000)
        at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
        at $Proxy6.multi(Unknown Source)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1354)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1352)
        at 
org.apache.hadoop.hbase.client.ServerCallable.withoutRetries(ServerCallable.java:210)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1361)
        at 
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1349)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)

{noformat}
                
> Client with a cache info on a dead server will wait for 20s before trying 
> another one.
> --------------------------------------------------------------------------------------
>
>                 Key: HBASE-7989
>                 URL: https://issues.apache.org/jira/browse/HBASE-7989
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 0.95.0, 0.98.0
>            Reporter: nkeywal
>
> Scenario is:
> - fetch the cache in the client
> - a server dies
> - try to use a region that is on the dead server
> This will lead to a 20-second connect timeout. We don't see this in unit 
> tests because it happens only if the remote box does not answer. In the unit 
> tests we immediately get a connection refused from the OS.
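
To illustrate the last point of the description, here is a minimal sketch using only plain java.net sockets, not the HBase client. The class and method names (ConnectFailureDemo, probe, closedLocalPort) are made up for this example, and the 20000 ms value is simply copied from the log above. It connects to a local port with no listener, which is roughly what a unit test does when a region server is killed on the same machine: the OS answers right away with a refusal, so the connect timeout is never reached. Pointing probe at a box that silently drops packets would instead be expected to block for the full timeout and end in a SocketTimeoutException like the one in the second trace.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ConnectFailureDemo {

    /** Tries to connect and reports how the attempt ended. */
    public static String probe(InetSocketAddress address, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMillis);
            return "connected";
        } catch (SocketTimeoutException e) {
            // Nothing answered before the timeout expired (the dead-box case).
            return "timeout";
        } catch (ConnectException e) {
            // The OS rejected the attempt; on a local port with no listener
            // this is normally "Connection refused".
            return "refused";
        } catch (IOException e) {
            return "other: " + e;
        }
    }

    /** Returns a local port that very likely has no listener: bind to a free port, then close it. */
    public static int closedLocalPort() {
        try (ServerSocket server = new ServerSocket(0)) {
            return server.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress address = new InetSocketAddress("127.0.0.1", closedLocalPort());
        long start = System.nanoTime();
        String outcome = probe(address, 20000); // same value as the connect timeout in the log
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(outcome + " after " + elapsedMillis + " ms");
    }
}
```

A test that wants to reproduce the 20-second wait would therefore need a target that does not answer at all, rather than a stopped process on a live host.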

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
