This means the servers aren't responding to the clients within 60 seconds. I believe this timeout is new since 0.90, so it could be that you were used to having long-running requests.
If not, check what's going on with those servers at the address given in the exception message.

J-D

On Thu, May 17, 2012 at 2:35 PM, Viral Bajaria <[email protected]> wrote:
> Hello,
>
> I just upgraded our production cluster from hbase 0.89 (chd3b2, yeah!!) to
> 0.92.1, ever since the upgrade I see a lot of issues with timeouts on my
> clients.
>
> Below are the log dumps from the client and the regionserver that it was
> requesting the data from. I can overcome this exception by increasing
> hbase.rpc.timeout but I doubt that's the right way of solving this issue.
>
> Has anyone faced this issue in hbase 0.92, if yes how did you go about
> solving it ? If not, any pointers on how to start debugging this ?
>
> Thanks,
> Viral
>
> *Client Logs*
> 2012-05-17T19:44:19.191Z Processed 277000, written 277000, writing row for
> 40036669 (this log line is output for every 1000 rows)
> 12/05/17 19:45:19 WARN client.HConnectionManager$HConnectionImplementation:
> Failed all from
> region=platform,40032999,1323868834966.e9f3d644fa843340129355bd9e005903.,
> hostname=elshadoop-c01, port=60020
> java.util.concurrent.ExecutionException: java.net.SocketTimeoutException:
> Call to elshadoop-c01/10.16.80.69:60020 failed on socket timeout exception:
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for
> channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected
> local=/10.16.80.30:57245remote=elshadoop-c01/
> 10.16.80.69:60020]
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1557)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1409)
>     at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:943)
>     at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:816)
>     at org.apache.hadoop.hbase.client.HTable.put(HTable.java:803)
>     at com.x.aggregation.hbase.util.BulkTableWriter.commit(BulkTableWriter.java:191)
>     at com.x.metrics.ingestmetadata.DataProvider.writePlatformToShowId(DataProvider.java:1799)
>     at com.x.metrics.ingestmetadata.Program$16.run(Program.java:163)
>     at com.x.metrics.ingestmetadata.Program.runAction(Program.java:24)
>     at com.x.metrics.ingestmetadata.Program.main(Program.java:218)
> Caused by: java.net.SocketTimeoutException: Call to elshadoop-c01/
> 10.16.80.69:60020 failed on socket timeout exception:
> java.net.SocketTimeoutException: 60000 millis timeout while waiting for
> channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected
> local=/10.16.80.30:57245remote=elshadoop-c01/
> 10.16.80.69:60020]
>     at org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:949)
>     at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:922)
>     at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:150)
>     at $Proxy4.multi(Unknown Source)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1386)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3$1.call(HConnectionManager.java:1384)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithoutRetries(HConnectionManager.java:1365)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1383)
>     at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$3.call(HConnectionManager.java:1381)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.net.SocketTimeoutException: 60000 millis timeout while
> waiting for channel to be ready for read. ch :
> java.nio.channels.SocketChannel[connected
> local=/10.16.80.30:57245remote=elshadoop-c01/
> 10.16.80.69:60020]
>     at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
>     at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
>     at java.io.FilterInputStream.read(FilterInputStream.java:116)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection$PingInputStream.read(HBaseClient.java:311)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
>     at java.io.DataInputStream.readInt(DataInputStream.java:370)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:571)
>     at org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:505)
> *RegionServer Logs*
> 2012-05-17 19:45:36,287 WARN org.apache.hadoop.ipc.HBaseServer:
> (responseTooSlow):
> {"processingtimems":76888,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@255b7d7e),
> rpc version=1, client version=29, methodsFingerPrint=54742778","client":"
> 10.16.80.30:57245
> ","starttimems":1337283859381,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"multi"}
> 2012-05-17 19:45:36,289 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call multi(org.apache.hadoop.hbase.client.MultiAction@255b7d7e),
> rpc version=1, client version=29, methodsFingerPrint=54742778 from
> 10.16.80.30:57245: output error
> 2012-05-17 19:45:36,290 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 135 on 60020 caught: java.nio.channels.ClosedChannelException
>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:133)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelIO(HBaseServer.java:1710)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1653)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:924)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:1003)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Call.sendResponseIfReady(HBaseServer.java:409)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1346)
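
For what it's worth, the two quoted logs appear to line up with the timeout explanation. Below is a minimal Python sketch of that arithmetic, assuming the (responseTooSlow) payload is plain JSON and that the client's timeout is the 60000 ms shown in the exception. The function name `summarize` and the trimmed payload are illustrative only and are not part of HBase.

```python
import json
from datetime import datetime, timedelta, timezone

# Subset of the (responseTooSlow) payload from the RegionServer log above,
# trimmed to the fields used here (assumption: the payload is valid JSON).
RESPONSE_TOO_SLOW = (
    '{"processingtimems":76888,"starttimems":1337283859381,'
    '"queuetimems":0,"method":"multi"}'
)

# Timeout reported by the client ("60000 millis timeout while waiting ...").
CLIENT_TIMEOUT_MS = 60000

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def summarize(payload: str, timeout_ms: int) -> dict:
    """Compare the server-side processing time with the client timeout."""
    call = json.loads(payload)
    start = EPOCH + timedelta(milliseconds=call["starttimems"])
    processing_ms = call["processingtimems"]
    return {
        "method": call["method"],
        "queue_ms": call["queuetimems"],
        "start": start,
        # When a client with this timeout would stop waiting for the reply.
        "client_gave_up_at": start + timedelta(milliseconds=timeout_ms),
        # When the server finished processing and tried to respond.
        "server_finished_at": start + timedelta(milliseconds=processing_ms),
        "overrun_ms": processing_ms - timeout_ms,
        "timed_out": processing_ms > timeout_ms,
    }


if __name__ == "__main__":
    for key, value in summarize(RESPONSE_TOO_SLOW, CLIENT_TIMEOUT_MS).items():
        print(f"{key}: {value}")
```

Under these assumptions, the `multi` call spent about 76.9 s in processing with `queuetimems` at 0, so the time seems to have gone into the handler rather than the call queue. The client would have stopped waiting roughly 17 s before the server finished, which would be consistent with the ClosedChannelException logged when the server tried to write the response.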
