HBASE_MANAGES_ZK was not set in hbase-env.sh. The errors now seem to be gone.
Thanks for your prompt attention after my cry for help.

On Mon, May 16, 2011 at 9:01 AM, Ted Yu <yuzhih...@gmail.com> wrote:

> From hbase-default.xml:
>
> If HBASE_MANAGES_ZK is set in hbase-env.sh
> this is the list of servers which we will start/stop ZooKeeper on.
>
> Normally I would let client use the same hbase-site.xml as what server
> uses.
>
> After increasing maxClientCnxns, do you observe the same problem?
>
> Cheers
>
> On Mon, May 16, 2011 at 6:25 AM, Barney Frank <barneyfran...@gmail.com> wrote:
>
> > OK, I must be doing something wrong. This will be the death of me if I
> > don't pass my scalability testing on Wednesday for my project to get
> > approved.
> >
> > Running on version 0.90.1-cdh3u0 using the pseudo-distributed mode
> > for Hadoop and Hbase. ZK mode is standalone.
> >
> > How can I tell if Hbase is managing ZK? I looked in the hbase-site.xml for
> > hbase server, distributed was set to true, xceivers set, and rootdir. I
> > could add the hbase.zookeeper.property.maxClientCnxns here, correct? Would
> > I need to set it on the client hbase-site.xml too?
> >
> > Otherwise I did set the maxClientCnxns within zoo.cfg to be very large.
> >
> > Do I need to restart any of the servers? I have been restarting the client
> > and the hbase master and rs when changing their hbase-site.xml.
> >
> > FYI I am also doing a lot of (like 10/request) mytable.incrementColumnValue
> > and probably the same amount of puts. Is there a way to do an
> > incrementColumnValue using puts? Maybe that would help my performance?
> >
> > On Sun, May 15, 2011 at 10:00 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >
> > > If you don't let hbase manage zookeeper, yes.
> > > Otherwise you need to set hbase.zookeeper.property.maxClientCnxns in
> > > hbase-site.xml
> > >
> > > Next hbase major release (with HBASE-3777) would behave much better.
> > >
> > > On Sun, May 15, 2011 at 7:33 PM, Barney Frank <barneyfran...@gmail.com> wrote:
> > >
> > > > Will do!
> > > >
> > > > Just in the zoo.cfg and not set it in the hbase-site.xml, correct?
> > > >
> > > > On Sun, May 15, 2011 at 9:20 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> > > >
> > > > > Please increase the max client connections, dramatically.
> > > > >
> > > > > On May 15, 2011, at 6:15 PM, Barney Frank <barneyfran...@gmail.com> wrote:
> > > > >
> > > > > > I am looking for some advice on any changes to minimize these errors.
> > > > > >
> > > > > > Running Hbase standalone on version cdh3u0 and set zoo.cfg to 300 max
> > > > > > client connections. I use only the java api and use new HTable() for each
> > > > > > request (no pooling).
> > > > > >
> > > > > > Running at lower volumes (50 requests/sec), I do not have any performance
> > > > > > issues. At high volumes of read/write requests (~200 requests/sec) via the
> > > > > > java API, I see the following exceptions in my client (JBOSS) logs:
> > > > > >
> > > > > > FYI, once the volumes decrease, everything seems to recover nicely.
> > > > > >
> > > > > > 2011-05-16 00:40:07,344 INFO [org.apache.zookeeper.ClientCnxn] (http-0.0.0.0-8080-43-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 32852ms for sessionid 0x12fd6beb2180378, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,344 INFO [org.apache.zookeeper.ClientCnxn] (http-0.0.0.0-8443-5-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 32599ms for sessionid 0x12fd6beb2180379, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,345 INFO [org.apache.zookeeper.ClientCnxn] (Interaction Logger Wrapup-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 32849ms for sessionid 0x12fd6beb2180377, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,345 INFO [org.apache.zookeeper.ClientCnxn] (Contact History-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 32850ms for sessionid 0x12fd6beb2180376, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,345 INFO [org.apache.zookeeper.ClientCnxn] (Timer-0-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 32850ms for sessionid 0x12fd6beb2180371, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,369 INFO [org.apache.zookeeper.ClientCnxn] (Timer-0-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 42353ms for sessionid 0x12fd6beb2180372, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,369 INFO [org.apache.zookeeper.ClientCnxn] (http-0.0.0.0-8080-1-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 42353ms for sessionid 0x12fd6beb2180375, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,370 INFO [org.apache.zookeeper.ClientCnxn] (Timer-0-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 42386ms for sessionid 0x12fd6beb2180373, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,369 INFO [org.apache.zookeeper.ClientCnxn] (Timer-0-SendThread(ip-10-46-181-169.ec2.internal:2181)) Client session timed out, have not heard from server in 42368ms for sessionid 0x12fd6beb2180374, closing socket connection and attempting reconnect
> > > > > > 2011-05-16 00:40:07,445 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (http-0.0.0.0-8080-43-EventThread) hconnection-0x12fd6beb2180378 Received ZooKeeper Event, type=None, state=Disconnected, path=null
> > > > > > 2011-05-16 00:40:07,445 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (http-0.0.0.0-8080-43-EventThread) hconnection-0x12fd6beb2180378 Received Disconnected from ZooKeeper, ignoring
> > > > > > 2011-05-16 00:40:07,445 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Interaction Logger Wrapup-EventThread) hconnection-0x12fd6beb2180377 Received ZooKeeper Event, type=None, state=Disconnected, path=null
> > > > > > 2011-05-16 00:40:07,445 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Interaction Logger Wrapup-EventThread) hconnection-0x12fd6beb2180377 Received Disconnected from ZooKeeper, ignoring
> > > > > > 2011-05-16 00:40:07,445 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Contact History-EventThread) hconnection-0x12fd6beb2180376 Received ZooKeeper Event, type=None, state=Disconnected, path=null
> > > > > > 2011-05-16 00:40:07,445 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Timer-0-EventThread) hconnection-0x12fd6beb2180371 Received ZooKeeper Event, type=None, state=Disconnected, path=null
> > > > > > 2011-05-16 00:40:07,446 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Contact History-EventThread) hconnection-0x12fd6beb2180376 Received Disconnected from ZooKeeper, ignoring
> > > > > > 2011-05-16 00:40:07,446 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Timer-0-EventThread) hconnection-0x12fd6beb2180371 Received Disconnected from ZooKeeper, ignoring
> > > > > > 2011-05-16 00:40:07,454 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (http-0.0.0.0-8443-5-EventThread) hconnection-0x12fd6beb2180379 Received ZooKeeper Event, type=None, state=Disconnected, path=null
> > > > > > 2011-05-16 00:40:07,454 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (http-0.0.0.0-8443-5-EventThread) hconnection-0x12fd6beb2180379 Received Disconnected from ZooKeeper, ignoring
> > > > > > 2011-05-16 00:40:07,447 ERROR [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (http-0.0.0.0-8443-5) hconnection-0x12fd6beb2180379 Unexpected KeeperException creating base node:
> > > > > > org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/unassigned
> > > > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:902) [:]
> > > > > >     at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:148) [:]
> > > > > >     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506) [:3.3.3-cdh3u0--1]
> > > > > > 2011-05-16 00:40:19,728 INFO [org.apache.zookeeper.ClientCnxn] (http-0.0.0.0-8443-5-EventThread) EventThread shut down
> > > > > > 2011-05-16 00:40:19,729 INFO [org.apache.zookeeper.ClientCnxn] (http-0.0.0.0-8443-5-SendThread(ip-10-46-181-169.ec2.internal:2181)) Unable to reconnect to ZooKeeper service, session 0x12fd6beb2180379 has expired, closing socket connection
> > > > > > 2011-05-16 00:40:19,730 DEBUG [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Timer-0-EventThread) hconnection-0x12fd6beb2180374 Received ZooKeeper Event, type=None, state=Expired, path=null
> > > > > > 2011-05-16 00:40:19,730 INFO [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] (Timer-0-EventThread) This client just lost it's session with ZooKeeper, trying to reconnect.
> > > > > > 2011-05-16 00:40:19,730 INFO [org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation] (Timer-0-EventThread) Trying to reconnect to zookeeper
> > > > > > 2011-05-16 00:40:19,731 DEBUG [org.apache.hadoop.hbase.zookeeper.ZKUtil] (Timer-0-EventThread) hconnection opening connection to ZooKeeper with ensemble (10.46.181.169:2181)
> > > > > > 2011-05-16 00:40:19,731 INFO [org.apache.zookeeper.ZooKeeper] (Timer-0-EventThread) Initiating client connection, connectString=10.46.181.169:2181 sessionTimeout=180000 watcher=hconnection
> > > > > > 2011-05-16 00:40:19,732 INFO [org.apache.zookeeper.ClientCnxn] (Timer-0-SendThread(ip-10-46-181-169.ec2.internal:2181)) Unable to reconnect to ZooKeeper service, session 0x12fd6beb2180374 has expired, closing socket connection
> > > > > >
> > > > > > *** I get a bunch of these ***
> > > > > > 2011-05-16 00:40:19,847 WARN [org.apache.hadoop.hbase.zookeeper.ZKUtil] (Interaction Logger Wrapup) hconnection-0x12fd6beb2180377 Unable to get children of node /hbase/rs
> > > > > >
> > > > > > *** Then a bunch of these ***
> > > > > > 2011-05-16 00:40:19,881 ERROR [org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher] (Interaction Logger Wrapup) hconnection-0x12fd6beb2180377 Received unexpected KeeperException, re-throwing exception:
> > > > > > org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs
> > > > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getNumberOfChildren(ZKUtil.java:495) [:]
> > > > > >     at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:207) [:]
> > > > > >     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:182) [:]
> > > > > >     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:145) [:]
> > > > > >     at InteractionLogger.run(InteractionLogger.java:139) [:]
> > > > > >     at java.lang.Thread.run(Thread.java:662) [:1.6.0_24]
> > > > > >
> > > > > > *** Then a lot of these ***
> > > > > > 2011-05-16 00:42:13,789 WARN [InteractionLogger] (Interaction Logger Wrapup) java.io.IOException: Unexpected ZooKeeper exception
> > > > > >     at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:210) [:]
> > > > > >     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:182) [:]
> > > > > >     at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:145) [:]
> > > > > >     at stateful.session.InteractionLogger.run(InteractionLogger.java:139) [:]
> > > > > >     at java.lang.Thread.run(Thread.java:662) [:1.6.0_24]
> > > > > > Caused by: org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/rs
> > > > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:809) [:3.3.3-cdh3u0--1]
> > > > > >     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getNumberOfChildren(ZKUtil.java:495) [:]
> > > > > >     at org.apache.hadoop.hbase.client.HTable.getCurrentNrHRS(HTable.java:207) [:]
> > > > > >     ... 4 more
> > > > > >
> > > > > > Any help would be greatly appreciated.
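
A side note on the "new HTable() for each request (no pooling)" pattern in the original message. The sketch below is only an illustration of why per-request handles can run into a connection cap such as the 300 set in zoo.cfg. It does not use the HBase or ZooKeeper API: FakeConnection, perRequest and shared are hypothetical names, and whether a given new HTable() really opens a fresh ZooKeeper connection depends on the HBase version and on how the Configuration object is created, which the thread does not show.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive client handle. This is NOT the
// HBase API; it only counts how many handles were opened.
final class FakeConnection {
    static final AtomicInteger OPENED = new AtomicInteger();

    FakeConnection() {
        OPENED.incrementAndGet();
    }

    void handle(int requestId) {
        // Pretend to serve one read or write request.
    }
}

class ConnectionReuseSketch {
    // Opens a new handle for every request, like "no pooling" in the thread.
    static int perRequest(int requests) {
        FakeConnection.OPENED.set(0);
        for (int i = 0; i < requests; i++) {
            new FakeConnection().handle(i);
        }
        return FakeConnection.OPENED.get();
    }

    // Opens one handle up front and reuses it for every request.
    static int shared(int requests) {
        FakeConnection.OPENED.set(0);
        FakeConnection conn = new FakeConnection();
        for (int i = 0; i < requests; i++) {
            conn.handle(i);
        }
        return FakeConnection.OPENED.get();
    }

    public static void main(String[] args) {
        // 200 matches the ~200 requests/sec figure from the original message.
        System.out.println("handles opened per request: " + perRequest(200));
        System.out.println("handles opened when shared: " + shared(200));
    }
}
```

If the real client behaves like the per-request case, the number of open connections grows with the request rate, so raising maxClientCnxns treats the symptom while sharing or pooling the handle reduces the load itself.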