Hello,

I'm running Nutch 2.2.1 and HBase 0.90.6 in standalone mode, and my HBase instance keeps running out of connections. I believe the default limit is 30; I raised it to 300, but I still eventually run out of open connections. It just takes a little longer with the higher number.
Is this just an issue in standalone mode? I'm not sure why Nutch wouldn't be reusing open connections. I'm hoping some simple configuration setting is responsible. Has anyone had a similar issue who could point me in the right direction?

The error I'm seeing is the following:

2014-01-09 15:02:12,763 ERROR crawl.GeneratorJob - GeneratorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
    at org.apache.nutch.storage.StorageUtils.initMapperJob(StorageUtils.java:119)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:196)
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:223)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:279)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:287)
Caused by: java.lang.RuntimeException: org.apache.hadoop.hbase.ZooKeeperConnectionException: HBase is able to connect to ZooKeeper but the connection closes immediately. This could be a sign that the server has too many connections (30 is the default). Consider inspecting your ZK server logs for that error and then make sure you are reusing HBaseConfiguration as often as you can. See HTable's javadoc for more information.
    at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:127)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    ... 8 more

--
Ward Loving
Technical Architect
Appirio, Inc.

