You may colocate your ZK with the HBase Master, as it's not very heavy. Depending on your cluster size, 1-3 ZK servers may be enough, and you can spread them across the HBM, SNN and perhaps NN/JT machines.
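
If it helps, here is a rough Python sketch of the heap check Chris describes further down the thread: add up the maximum heap of every JVM on a box and compare that with the physical memory. Every name and number in it is made up for illustration (the 8 GB machine, the 1 GB kept back for the OS, the per-daemon heaps, the 8 task slots), so substitute the values you actually run with. Heap is also only a lower bound on what a JVM really uses, so treat a pass as necessary rather than sufficient.

```python
def check_heap_budget(heaps_mb, physical_mb, reserve_mb=1024):
    """Compare the summed JVM max heaps on one machine with its RAM.

    heaps_mb    -- dict mapping a service name to its max heap in MB
    physical_mb -- physical memory of the machine in MB
    reserve_mb  -- memory held back for the OS and page cache (assumed value)

    Returns (total_heap_mb, headroom_mb, fits).
    """
    total = sum(heaps_mb.values())
    headroom = physical_mb - reserve_mb - total
    return total, headroom, headroom >= 0


# Hypothetical worker node: ZooKeeper, HRegionServer and TaskTracker together,
# plus the child JVMs the TaskTracker may launch at the same time.
node = {
    "ZooKeeper": 1024,
    "HRegionServer": 4096,
    "TaskTracker": 1024,
    "MR child tasks (8 slots x 512 MB)": 8 * 512,
}

total, headroom, fits = check_heap_budget(node, physical_mb=8192)
print("total heap: %d MB, headroom: %d MB, fits: %s" % (total, headroom, fits))
```

If the headroom comes out negative, the box can start swapping under load, and a swapping ZooKeeper tends to miss its session timeouts. Lowering the number of task slots or the child heap on the nodes that also run ZK is usually the cheapest knob to turn.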

On Wed, May 30, 2012 at 2:54 AM, Something Something <[email protected]> wrote:
> Hmm.. due to budget constraints, I am forced to install ZooKeeper on the
> same machine that runs TaskTracker. When a big MR job starts it fires up
> over 40 tasks, so as you implied this could definitely be related to memory.
>
> Should ZooKeepers be started on their own machines? Right now I have
> ZooKeeper, HRegionServer & TaskTracker running on the same machine. This
> is a bad idea, right? Is there any way to get ZooKeeper working under
> these restrictions?
>
> By the way, the ZooKeeper log shows this:
>
> 2012-05-29 13:56:54,842 - ERROR [CommitProcessor:2:NIOServerCnxn@445] - Unexpected Exception:
> java.nio.channels.CancelledKeyException
>     at sun.nio.ch.SelectionKeyImpl.ensureValid(SelectionKeyImpl.java:55)
>     at sun.nio.ch.SelectionKeyImpl.interestOps(SelectionKeyImpl.java:59)
>     at org.apache.zookeeper.server.NIOServerCnxn.sendBuffer(NIOServerCnxn.java:418)
>     at org.apache.zookeeper.server.NIOServerCnxn.sendResponse(NIOServerCnxn.java:1509)
>     at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:367)
>     at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
>
>
> On Sat, May 26, 2012 at 2:28 AM, Christian Schäfer <[email protected]> wrote:
>
>> Hi,
>>
>> I got exactly the same behaviour and exceptions that you mention on a
>> local cluster.
>>
>> In my case the sum of all services' heap space was higher than the actual
>> memory of the machine.
>> First, sum the heap sizes of the services on your master machine, which is
>> likely running NameNode, HMaster, ZooKeeper, and maybe also RegionServer
>> and DataNode. Then check that this sum is less than your master machine's
>> memory.
>>
>> Good luck.
>> Chris
>>
>> From: Something Something <[email protected]>
>> To: [email protected]; [email protected]
>> Sent: 3:22 Saturday, 26 May 2012
>> Subject: HBase dies after some time
>>
>> Hello,
>>
>> I recently installed ZooKeeper & HBase on our dedicated Hadoop cluster on
>> EC2. HBase stays active for some time, but after a while it dies with
>> error messages similar to these:
>>
>> 2012-05-25 12:09:27,514 ERROR org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
>> master:60000-0x5378489312c0004-0x5378489312c0004 Received unexpected
>> KeeperException, re-throwing exception
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /hbase/master
>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
>>     at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
>>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
>> 2012-05-25 12:09:27,514 ERROR org.apache.hadoop.hbase.master.ActiveMasterManager:
>> master:60000-0x5378489312c0004-0x5378489312c0004 Error deleting our own
>> master address node
>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /hbase/master
>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
>>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:927)
>>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
>>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
>>     at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:197)
>>     at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:310)
>>
>>
>> This kills the HMaster as well as all HRegionServers. Could it be that my
>> ZooKeeper setup is incorrect? Please help. Thanks.
>>

--
Harsh J
