Here is the relevant graph. I've heard 7 recommended as well, but judging from it, 5 or 7 seem to be the best options: http://hadoop.apache.org/zookeeper/docs/current/zookeeperOver.html#Performance
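For a 5-peer ensemble, each server's zoo.cfg would look something like the sketch below. The hostnames and paths are placeholders, and each peer also needs a myid file under dataDir whose number matches its server.N line:

  # zoo.cfg (same on every peer; hostnames are hypothetical)
  tickTime=2000
  initLimit=10
  syncLimit=5
  dataDir=/var/zookeeper
  clientPort=2181
  server.1=zk1.example.com:2888:3888
  server.2=zk2.example.com:2888:3888
  server.3=zk3.example.com:2888:3888
  server.4=zk4.example.com:2888:3888
  server.5=zk5.example.com:2888:3888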
Michael

On Fri, Jul 17, 2009 at 3:04 PM, Andrew Purtell <[email protected]> wrote:
> Hmm... Is that private communication or up on a Wiki somewhere? Or
> maybe in a mailing list archive? We should collect these tidbits into
> our wiki.
>
>   - Andy
>
> ________________________________
> From: Ryan Rawson <[email protected]>
> To: [email protected]
> Sent: Friday, July 17, 2009 12:57:51 PM
> Subject: Re: hbase/zookeeper
>
> The ZK folks also recommend quorums of 5 nodes. Said something about
> diminishing returns at 7 and 9...
>
> -ryan
>
> On Fri, Jul 17, 2009 at 12:52 PM, Andrew Purtell <[email protected]> wrote:
>> Thanks. That's good advice.
>>
>> We tune our heap allocations based on metrics collected over typical
>> and peak usage cases.
>>
>>   - Andy
>>
>> ________________________________
>> From: Jonathan Gray <[email protected]>
>> To: [email protected]
>> Sent: Friday, July 17, 2009 12:42:30 PM
>> Subject: Re: hbase/zookeeper
>>
>> The ZK guys seem to say you should give it at least 1 GB.
>>
>> This should not matter for 0.20. In 0.21, our use of ZK will expand and
>> it will need more memory. If you plan on using ZK for anything besides
>> HBase, make sure you give it more memory. For now, you're probably okay
>> with 256 MB.
>>
>> Andrew Purtell wrote:
>>> That looks good to me, in line with the best practices that are gelling
>>> as we collectively gain operational experience.
>>>
>>> This is how we allocate RAM on our 8 GB worker nodes:
>>>
>>> Hadoop
>>>   DataNode - 1 GB
>>>   TaskTracker - 256 MB (JVM default)
>>>   map/reduce tasks - 200 MB each (Hadoop default)
>>>
>>> HBase
>>>   ZK - 256 MB (JVM default)
>>>   Master - 1 GB (HBase default, but actual use is < 500 MB)
>>>   RegionServer - 4 GB
>>>
>>> We have a Master and a hot-spare Master, each running on one of the
>>> workers. Our workers are dual quad core, so we have them configured for
>>> a maximum concurrent task execution of 4 mappers and 2 reducers, and we
>>> run the TaskTracker (and therefore also the tasks) with niceness +10 to
>>> hint to the OS that it should schedule the DataNodes, ZK quorum peers,
>>> and RegionServers ahead of them.
>>>
>>> Note that the Hadoop NameNode is a special case: it runs standalone,
>>> with block-device-level replication to a hot spare configured in the
>>> typical HA fashion: heartbeat monitoring, fencing via power control
>>> operations, virtual IP address and L3 failover, etc.
>>>
>>> Also, not all nodes participate in the ZK ensemble. Some 2N+1 subset is
>>> reasonable: 3, 5, 7, or 9. I expect that a 7 or 9 node ensemble can
>>> handle 1000s of clients if the quorum peers are running on dedicated
>>> hardware. We are considering this type of deployment for the future.
>>> For now, however, we colocate ZK quorum peers with (some) HBase
>>> regionservers.
>>>
>>> Our next generation will use 32 GB. This can support aggressive caching
>>> and in-memory tables.
>>>
>>>   - Andy
>>>
>>> ________________________________
>>> From: Fernando Padilla <[email protected]>
>>> To: [email protected]
>>> Sent: Friday, July 17, 2009 10:30:52 AM
>>> Subject: Re: hbase/zookeeper
>>>
>>> Thank you!
>>>
>>> I'll pay attention to the CPU load then. Any tips about the memory
>>> distribution? This is what I'm expecting, but I'm a newb. :)
>>>
>>>   DataNode - 1.5G
>>>   TaskTracker - 0.5G
>>>   ZooKeeper - 0.5G
>>>   RegionServer - 2G
>>>   M/R - 2G
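For what it's worth, here is roughly where the numbers Andy describes would land in the 0.20-era config files. This is just a sketch mirroring the 8 GB layout quoted above, not settings taken from this thread, so treat every value as an assumption:

  # conf/hadoop-env.sh
  export HADOOP_HEAPSIZE=1000                 # 1 GB default daemon heap (DataNode)
  export HADOOP_TASKTRACKER_OPTS="-Xmx256m"   # keep the TaskTracker at 256 MB
  # HADOOP_NICENESS applies to every daemon started via hadoop-daemon.sh,
  # so set it only in the environment that launches the TaskTracker:
  export HADOOP_NICENESS=10

  # conf/hbase-env.sh (sets the heap for any HBase daemon started on that node)
  export HBASE_HEAPSIZE=4000                  # 4 GB for the RegionServer

  <!-- mapred-site.xml: 4 mappers + 2 reducers per node, 200 MB heap per task -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx200m</value>
  </property>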
