I've implemented my own coprocessor client, protocol and implementation that returns to the user a List of KeyValues whose values match some criteria. I've tested this on a small table with just a few regions and it works fine. When I execute my code on a table with 200 regions, though, I run into problems, and I'm not really sure how to resolve them. I'm getting the SocketTimeoutException shown below.
I'm able to run the AggregationClient coprocessor without seeing these issues. It might be something I'm doing in my code, but if anybody has any ideas about why the request seems to be timing out or what I can do about it, I'd appreciate it.

I'm running the latest revision of hbase-0.92 and hadoop-0.20-append. My cluster has 15 regionservers, running on RHEL 5.5, 64-bit.

Some highlights of the errors are below. I've put the full thing in pastebin here: http://pastebin.com/rapYiNp3

11/06/29 17:35:12 INFO ipc.HBaseRPC: Using org.apache.hadoop.hbase.ipc.WritableRpcEngine for org.apache.hadoop.hbase.ipc.HRegionInterface
11/06/29 17:48:12 WARN client.HConnectionManager$HConnectionImplementation: Error executing for row 00223199610B220970111:2:0:7524::
java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.client.RetriesExhaustedException: Trying to contact region server mysite.com/myip:62091 for region info_test,00223199610B220970111:2:0:7524::,1309363489443.c3192674341c8d10d84966e8e663a644., row '00223199610B220970111:2:0:7524::', but failed after 10 attempts.
Exceptions:
java.net.SocketTimeoutException: Call to mysite.com/myip:62091 failed on socket timeout exception: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/myip:14738 remote= mysite.com/myip:62091]
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1215)
        at org.apache.hadoop.hbase.ipc.ExecRPCInvoker.invoke(ExecRPCInvoker.java:79)
        at $Proxy1.getList(Unknown Source)
        at org.apache.hadoop.hbase.client.coprocessor.GetListClient$1.call(GetListClient.java:108)
        at org.apache.hadoop.hbase.client.coprocessor.GetListClient$1.call(GetListClient.java:105)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation$4.call(HConnectionManager.java:1325)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
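
As a rough sanity check on the timing, here is a small sketch that compares the gap between the two log timestamps above with the time that 10 attempts at a 60000 ms socket timeout would take on their own. The class and method names (TimeoutBudget, elapsed, timeoutOnly) are made up for illustration, and it assumes the 17:35:12 INFO line is close to when the call actually started, which the log does not confirm. It needs Java 8 or later for java.time.

```java
import java.time.Duration;
import java.time.LocalTime;

public class TimeoutBudget {
    // Elapsed time between two HH:mm:ss log timestamps on the same day.
    static Duration elapsed(String start, String end) {
        return Duration.between(LocalTime.parse(start), LocalTime.parse(end));
    }

    // Time spent waiting on socket timeouts alone, ignoring any pause between retries.
    static Duration timeoutOnly(int attempts, long timeoutMillis) {
        return Duration.ofMillis(attempts * timeoutMillis);
    }

    public static void main(String[] args) {
        // Timestamps, attempt count and timeout are taken from the log excerpt.
        Duration observed = elapsed("17:35:12", "17:48:12");
        Duration budget = timeoutOnly(10, 60000L);
        System.out.println("observed seconds: " + observed.getSeconds());
        System.out.println("timeout-only seconds: " + budget.getSeconds());
        System.out.println("unaccounted seconds: " + observed.minus(budget).getSeconds());
    }
}
```

If that assumption holds, most of the 13 minutes would be explained by every one of the 10 attempts running into the full 60-second read timeout, with the remainder presumably spent between retries or before the first call.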
