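Your aggregation coprocessor seems to be global, meaning that it uses the Table scanner (HTableWrapper.getScanner in your thread dump), which can go to other regions on other region servers. Do you have to do that? The coprocessor is hosted on a region, so you can just get a local region scanner and do a local scan. Would that not work for you? It would save you all the ZK connections, because a local scan never has to look up region locations, and that lookup (MetaTableLocator in your dump) is where your handler threads are stuck. Also, coprocessors that go global can cause deadlocks if you are not careful.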
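To make the local scan concrete, here is a rough sketch of what I mean, not your actual endpoint. The class name, the PartialAverage holder, and the assumption that every cell value is an 8-byte double written with Bytes.toBytes(double) are all made up for illustration; only the HBase calls themselves (getRegion, getScanner, RegionScanner.next) are real 1.1 API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.RegionScanner;
import org.apache.hadoop.hbase.util.Bytes;

/** Hypothetical helper: aggregates only over the region hosting the coprocessor. */
public final class LocalRegionAverage {

    /** Hypothetical holder for a per-region partial result. */
    public static final class PartialAverage {
        public final double sum;
        public final long count;

        PartialAverage(double sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    private LocalRegionAverage() {
    }

    public static PartialAverage scanLocalRegion(RegionCoprocessorEnvironment env, Scan scan)
            throws IOException {
        double sum = 0.0;
        long count = 0;
        List<Cell> cells = new ArrayList<Cell>();

        // The scanner comes from the hosting region itself, so no client
        // connection, no hbase:meta lookup and no ZooKeeper session is involved.
        RegionScanner scanner = env.getRegion().getScanner(scan);
        try {
            boolean hasMore;
            do {
                // next() fills 'cells' with one row and reports whether more rows remain.
                hasMore = scanner.next(cells);
                for (Cell cell : cells) {
                    // ASSUMPTION: values were written with Bytes.toBytes(double).
                    sum += Bytes.toDouble(CellUtil.cloneValue(cell));
                    count++;
                }
                cells.clear();
            } while (hasMore);
        } finally {
            scanner.close();
        }
        return new PartialAverage(sum, count);
    }
}
```

Each region would return its own sum and count, and the client would combine them after calling the endpoint on every region. Returning sum and count rather than a per-region average matters, because averaging the averages gives the wrong answer when regions hold different numbers of cells.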
Jerry

On Sat, Jun 4, 2016 at 5:05 PM, Billy Watson <[email protected]> wrote:

> Did you try upping the connections to 100 or so? We turned the max client
> connections off and haven't really noticed any detriments to that yet and
> we've been running some really big jobs with only 3 zk nodes...
>
> William Watson
> Lead Software Engineer
>
> On Thu, Mar 3, 2016 at 12:18 PM, Arul <[email protected]> wrote:
>
> > Hi,
> >
> > I am using HBase version 1.1.2 and implemented a coprocessor to do
> > aggregation. Today I faced issues where requests from the application
> > were getting stuck, and in the ZooKeeper logs I found an error saying
> > 60 connections were exceeded. I took a thread dump of the region server
> > and found that all threads are stuck while executing the coprocessor.
> > Can you please point me to what the issue could be? Please find part of
> > the thread dump below. Thanks in advance.
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-EventThread" daemon prio=10 tid=0x00007f904c32f000 nid=0x73c4 waiting on condition [0x00007f9024cda000]
> >    java.lang.Thread.State: WAITING (parking)
> >         at sun.misc.Unsafe.park(Native Method)
> >         - parking to wait for <0x00000000de476550> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-SendThread(hbasemasterbkup:2181)" daemon prio=10 tid=0x00007f904c32e000 nid=0x73c3 waiting on condition [0x00007f9024ad8000]
> >    java.lang.Thread.State: TIMED_WAITING (sleeping)
> >         at java.lang.Thread.sleep(Native Method)
> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:994)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020" daemon prio=10 tid=0x00007f9055186000 nid=0x46be waiting on condition [0x00007f903d35e000]
> >    java.lang.Thread.State: TIMED_WAITING (sleeping)
> >         at java.lang.Thread.sleep(Native Method)
> >         at java.lang.Thread.sleep(Thread.java:340)
> >         at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:360)
> >         at org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:158)
> >         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:373)
> >         at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:745)
> >         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionState(MetaTableLocator.java:482)
> >         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.getMetaRegionLocation(MetaTableLocator.java:168)
> >         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:600)
> >         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:580)
> >         at org.apache.hadoop.hbase.zookeeper.MetaTableLocator.blockUntilAvailable(MetaTableLocator.java:559)
> >         at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:61)
> >         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateMeta(ConnectionManager.java:1185)
> >         - locked <0x00000000de477fd0> (a java.lang.Object)
> >         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1152)
> >         at org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
> >         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
> >         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:151)
> >         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
> >         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> >         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.loadCache(ClientSmallReversedScanner.java:211)
> >         at org.apache.hadoop.hbase.client.ClientSmallReversedScanner.next(ClientSmallReversedScanner.java:185)
> >         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegionInMeta(ConnectionManager.java:1249)
> >         at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1155)
> >         at org.apache.hadoop.hbase.client.CoprocessorHConnection.locateRegion(CoprocessorHConnection.java:41)
> >         at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
> >         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:151)
> >         at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
> >         at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> >         at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
> >         at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
> >         at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
> >         at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
> >         at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
> >         at org.apache.hadoop.hbase.client.HTableWrapper.getScanner(HTableWrapper.java:215)
> >         at com.test.ceh.management.coprocessor.enpoint.averagemetrics.MetricsAvgEndPoint.getMetricsAvgMap(MetricsAvgEndPoint.java:117)
> >         at com.test.ceh.management.coprocessor.enpoint.averagemetrics.MetircsAverageOfListOfInstances$AvgService.callMethod(MetircsAverageOfListOfInstances.java:3685)
> >         at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
> >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
> >         at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
> >         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
> >         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
> >         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> >         at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> >         at java.lang.Thread.run(Thread.java:745)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-EventThread" daemon prio=10 tid=0x00007f904d3b3000 nid=0x6f55 waiting on condition [0x00007f90325b2000]
> >    java.lang.Thread.State: WAITING (parking)
> >         at sun.misc.Unsafe.park(Native Method)
> >         - parking to wait for <0x00000000ddf326f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-SendThread(zookeeper1:2181)" daemon prio=10 tid=0x00007f904d3b2000 nid=0x6f54 runnable [0x00007f90233c1000]
> >    java.lang.Thread.State: RUNNABLE
> >         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> >         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> >         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> >         - locked <0x00000000ddf32b28> (a sun.nio.ch.Util$2)
> >         - locked <0x00000000ddf32b10> (a java.util.Collections$UnmodifiableSet)
> >         - locked <0x00000000e907d7d8> (a sun.nio.ch.EPollSelectorImpl)
> >         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> >         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-EventThread" daemon prio=10 tid=0x00007f904d3b3000 nid=0x6f55 waiting on condition [0x00007f90325b2000]
> >    java.lang.Thread.State: WAITING (parking)
> >         at sun.misc.Unsafe.park(Native Method)
> >         - parking to wait for <0x00000000ddf326f0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-EventThread" daemon prio=10 tid=0x00007f904cc11800 nid=0x6c84 waiting on condition [0x00007f9020c9a000]
> >    java.lang.Thread.State: WAITING (parking)
> >         at sun.misc.Unsafe.park(Native Method)
> >         - parking to wait for <0x00000000dded5cc0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> >         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> >         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
> >         at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
> >         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:494)
> >
> > "B.defaultRpcServer.handler=29,queue=2,port=16020-SendThread(zookeeper1:2181)" daemon prio=10 tid=0x00007f904c819800 nid=0x6c83 runnable [0x00007f9020d9b000]
> >    java.lang.Thread.State: RUNNABLE
> >         at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
> >         at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
> >         at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
> >         at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> >         - locked <0x00000000dded60f8> (a sun.nio.ch.Util$2)
> >         - locked <0x00000000dded60e0> (a java.util.Collections$UnmodifiableSet)
> >         - locked <0x00000000ea85fe08> (a sun.nio.ch.EPollSelectorImpl)
> >         at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> >         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
> >         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
> >
> > --
> > View this message in context: http://apache-hbase.679495.n3.nabble.com/Zookeeper-too-many-connections-when-using-co-processor-tp4078265.html
> > Sent from the HBase User mailing list archive at Nabble.com.
