I would suggest giving each mapper its own HTable, rather than keeping a static HTable in the outer class. Create and configure it in the setup method of the mapper.
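
To illustrate the suggestion, here is a minimal sketch of the lifecycle. It does not use the HBase API. StubTable is a hypothetical stand-in for HTable that only counts how many handles get opened, and the two mapper classes and the run helpers are likewise invented for illustration. In a real job the handle would be opened in the mapper's setup(), reused by every map() call, and closed in cleanup().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerMapperTableSketch {

    /** Hypothetical stand-in for an HBase table handle; counts how many are opened. */
    static class StubTable implements AutoCloseable {
        static int opened = 0;
        final List<String> puts = new ArrayList<>();
        boolean closed = false;

        StubTable(String name) { opened++; }        // real code would connect here
        void put(String row) { puts.add(row); }
        @Override public void close() { closed = true; }
    }

    /** Pattern to avoid: a fresh handle (and connection) for every record. */
    static class PerRecordMapper {
        void map(String row) {
            try (StubTable table = new StubTable("target")) {
                table.put(row);
            }
        }
    }

    /** Suggested pattern: one handle per mapper instance. */
    static class PerMapperMapper {
        private StubTable table;

        void setup() { table = new StubTable("target"); }  // once per mapper
        void map(String row) { table.put(row); }           // reuses the handle
        void cleanup() { table.close(); }                  // once per mapper
    }

    /** Feeds every row through the per-record mapper; returns handles opened. */
    static int runPerRecord(List<String> rows) {
        StubTable.opened = 0;
        PerRecordMapper mapper = new PerRecordMapper();
        for (String row : rows) {
            mapper.map(row);
        }
        return StubTable.opened;
    }

    /** Feeds every row through the per-mapper variant; returns handles opened. */
    static int runPerMapper(List<String> rows) {
        StubTable.opened = 0;
        PerMapperMapper mapper = new PerMapperMapper();
        mapper.setup();
        try {
            for (String row : rows) {
                mapper.map(row);
            }
        } finally {
            mapper.cleanup();
        }
        return StubTable.opened;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row1", "row2", "row3", "row4");
        System.out.println("handles opened, one per record: " + runPerRecord(rows));
        System.out.println("handles opened, one per mapper: " + runPerMapper(rows));
    }
}
```

With four rows the first variant opens four handles and the second opens one. The number of connection attempts therefore grows with the row count in the first case and stays fixed per mapper in the second.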
Hmm... I am not exactly sure how the configuration from your HTable is passed to the mapper in the first place. You are configuring it in DomainTableTransform, which is not run when the individual mappers are instantiated (hence it is a public static class). I don't think your code actually works at all, even for a little bit.

Dave

-----Original Message-----
From: Jonathan Bender [mailto:[email protected]]
Sent: Friday, March 25, 2011 11:16 AM
To: [email protected]
Subject: Zookeeper connection error on mapreduce HBase writes

Hello all,

I wrote a routine that scans an HBase table, and writes to another table from within the map function using HTable.put(). When I run the job, it works fine for the first few rows but ZooKeeper starts having issues opening up a connection after a while. Am I just overloading the ZK server by opening up a new HTable connection during each map() task, or is there something else wrong with my configuration? Any other suggestions for reading and writing tables directly via MapReduce?

Here's the syslog from my tasktracker node: http://pastebin.com/VzUs3TJP

And the error log: http://pastebin.com/9gBRyR4e

And here's the MapReduce code I am running: http://pastebin.com/ir3yWaR1

Excerpt of the error:

org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:988)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.setupZookeeperTrackers(HConnectionManager.java:301)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.<init>(HConnectionManager.java:292)
        at org.apache.hadoop.hbase.client.HConnectionManager.getConnection(HConnectionManager.java:155)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:167)
        at org.apache.hadoop.hbase.client.HTable.<init>(HTable.java:145)
        at com.mydomain.emir.mapreduce.DomainTableTransform$myMapper.map(DomainTableTransform.java:82)
        at com.mydomain.emir.mapreduce.DomainTableTransform$myMapper.map(DomainTableTransform.java:1)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:646)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.Child.main(Child.java:234)
Caused by: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:147)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getZooKeeperWatcher(HConnectionManager.java:986)
        ... 15 more
