This is the relevant graph.  I've heard 7 recommended, but judging from
the graph, 5 or 7 seem to be the best options.
http://hadoop.apache.org/zookeeper/docs/current/zookeeperOver.html#Performance
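For anyone setting this up, here is a minimal sketch of what a five-peer
ensemble looks like in zoo.cfg (hostnames and the data dir are placeholders,
not taken from this thread):

```
# zoo.cfg -- identical on all five quorum peers
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/zookeeper
clientPort=2181
# 2N+1 peers: a 5-node ensemble stays available with up to 2 peers down
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888
server.5=zk5.example.com:2888:3888
```

Each peer additionally needs a myid file in dataDir containing just its
server number (1 through 5) so it knows which server.X entry it is.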

Michael

On Fri, Jul 17, 2009 at 3:04 PM, Andrew Purtell<[email protected]> wrote:
> Hmm... Is that private communication or up on a Wiki somewhere? Or
> maybe in a mailing list archive? We should collect these tidbits into
> our wiki.
>
>   - Andy
>
>
>
>
> ________________________________
> From: Ryan Rawson <[email protected]>
> To: [email protected]
> Sent: Friday, July 17, 2009 12:57:51 PM
> Subject: Re: hbase/zookeeper
>
> The ZK folks also recommend quorums of 5 nodes.  Said something about
> diminishing returns at 7 and 9...
>
> -ryan
>
> On Fri, Jul 17, 2009 at 12:52 PM, Andrew Purtell<[email protected]> wrote:
>> Thanks. That's good advice.
>>
>> We tune our heap allocations based on metrics collected over typical
>> and peak usage cases.
>>
>>   - Andy
>>
>>
>>
>>
>>
>> ________________________________
>> From: Jonathan Gray <[email protected]>
>> To: [email protected]
>> Sent: Friday, July 17, 2009 12:42:30 PM
>> Subject: Re: hbase/zookeeper
>>
>> ZK guys seem to say you should give it 1GB at least.
>>
>> This should not matter for 0.20.  In 0.21, our use of ZK will expand and it 
>> will need more memory.  If you plan on using ZK for anything besides HBase, 
>> make sure you give it more memory.  For now, you're probably okay with 256MB.
>>
>> Andrew Purtell wrote:
>>> That looks good to me, in line with the best practices that are gelling as
>>> we collectively gain operational experience.
>>>
>>> This is how we allocate RAM on our 8GB worker nodes:
>>>
>>>   Hadoop
>>>     DataNode     - 1 GB
>>>     TaskTracker  - 256 MB (JVM default)
>>>     map/reduce tasks - 200 MB (Hadoop default)
>>>
>>>   HBase
>>>     ZK           - 256 MB (JVM default)
>>>     Master       - 1 GB (HBase default, but actual use is < 500MB)
>>>     RegionServer - 4 GB
>>>
>>> We have a Master and hot spare Master each running on one of the workers.
>>> Our workers are dual quad core, so we have them configured for a
>>> maximum of 4 concurrent mappers and 2 reducers, and we run the
>>> TaskTracker (and therefore also the tasks) with niceness +10 to hint
>>> to the OS that the DataNodes, ZK quorum peers, and RegionServers
>>> should be scheduled ahead of them.
>>>
>>> Note that the Hadoop NameNode is a special case: it runs standalone,
>>> with block device level replication to a hot spare configured in the
>>> typical HA fashion: heartbeat monitoring, fencing via power control
>>> operations, virtual IP address and L3 failover, etc.
>>>
>>> Also, not all nodes participate in the ZK ensemble. Some 2N+1 subset is
>>> reasonable: 3, 5, 7, or 9. I expect that a 7 or 9 node ensemble can
>>> handle 1000s of clients, if the quorum peers are running on dedicated
>>> hardware. We are considering this type of deployment for the future.
>>> However, for now we colocate ZK quorum peers with (some) HBase
>> regionservers.
>>>
>>> Our next generation will use 32GB. This can support aggressive caching
>>> and in-memory tables.
>>>
>>>    - Andy
>>>
>>>
>>>
>>>
>>> ________________________________
>>> From: Fernando Padilla <[email protected]>
>>> To: [email protected]
>>> Sent: Friday, July 17, 2009 10:30:52 AM
>>> Subject: Re: hbase/zookeeper
>>>
>>> thank you!
>>>
>>> I'll pay attention to the CPU load then.  Any tips about the memory 
>>> distribution?  This is what I'm expecting, but I'm a newb. :)
>>>
>>> DataNode - 1.5G
>>> TaskTracker - .5G
>>> Zookeeper - .5G
>>> RegionServer - 2G
>>> M/R - 2G
>>>
>>>
>>>
>>
>>
>>
>
>
>
>
