Hmm... Is that private communication or up on a Wiki somewhere? Or maybe in a mailing list archive? We should collect these tidbits into our wiki.
- Andy

________________________________
From: Ryan Rawson <[email protected]>
To: [email protected]
Sent: Friday, July 17, 2009 12:57:51 PM
Subject: Re: hbase/zookeeper

The ZK folks also recommend quorums of 5 nodes. Said something about
diminishing returns at 7 and 9...

-ryan

On Fri, Jul 17, 2009 at 12:52 PM, Andrew Purtell <[email protected]> wrote:
> Thanks. That's good advice.
>
> We tune our heap allocations based on metrics collected over typical
> and peak usage cases.
>
> - Andy
>
> ________________________________
> From: Jonathan Gray <[email protected]>
> To: [email protected]
> Sent: Friday, July 17, 2009 12:42:30 PM
> Subject: Re: hbase/zookeeper
>
> ZK guys seem to say you should give it 1GB at least.
>
> This should not matter for 0.20. In 0.21, our use of ZK will expand and it
> will need more memory. If you plan on using ZK for anything besides HBase,
> make sure you give it more memory. For now, you're probably okay with 256MB.
>
> Andrew Purtell wrote:
>> That looks good to me, in line with the best practices that are gelling
>> as we collectively gain operational experience.
>>
>> This is how we allocate RAM on our 8GB worker nodes:
>>
>> Hadoop
>>   DataNode - 1 GB
>>   TaskTracker - 256 MB (JVM default)
>>   map/reduce tasks - 200 MB each (Hadoop default)
>>
>> HBase
>>   ZK - 256 MB (JVM default)
>>   Master - 1 GB (HBase default, but actual use is < 500 MB)
>>   RegionServer - 4 GB
>>
>> We have a Master and a hot spare Master, each running on one of the
>> workers. Our workers are dual quad core, so we configure them for a
>> maximum concurrent task execution of 4 mappers and 2 reducers, and we
>> run the TaskTracker (and therefore also the tasks) with niceness +10
>> to hint to the OS that the DataNodes, ZK quorum peers, and
>> RegionServers should be scheduled ahead of them.
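The allocation scheme above maps onto the standard Hadoop/HBase env files. A minimal sketch, assuming stock 0.20-era scripts; the exact variable names and values here are illustrative, not Andy's actual configuration:

```shell
# hadoop-env.sh: give the DataNode a 1 GB heap (assumed value)
export HADOOP_DATANODE_OPTS="-Xmx1000m $HADOOP_DATANODE_OPTS"

# per-task heap is set in mapred-site.xml via mapred.child.java.opts,
# e.g. -Xmx200m (the Hadoop default mentioned above)

# hbase-env.sh: 4 GB RegionServer heap (assumed; HBASE_HEAPSIZE is in MB)
export HBASE_HEAPSIZE=4000

# start the TaskTracker at niceness +10 so DataNode, ZK peers, and
# RegionServers win CPU contention over map/reduce tasks
nice -n 10 "$HADOOP_HOME"/bin/hadoop-daemon.sh start tasktracker
```

Note that niceness is inherited by child processes, which is why renicing the TaskTracker also covers the spawned task JVMs.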
>> Note that the Hadoop NameNode is a special case: we run it in a
>> standalone configuration with block-device-level replication to a hot
>> spare, configured in the typical HA fashion: heartbeat monitoring,
>> fencing via power control operations, virtual IP address and L3
>> failover, etc.
>>
>> Also, not all nodes participate in the ZK ensemble. Some 2N+1 subset
>> is reasonable: 3, 5, 7, or 9. I expect that a 7 or 9 node ensemble can
>> handle 1000s of clients, if the quorum peers are running on dedicated
>> hardware. We are considering this type of deployment for the future.
>> However, for now we colocate ZK quorum peers with (some of) the HBase
>> regionservers.
>>
>> Our next generation will use 32 GB. This can support aggressive
>> caching and in-memory tables.
>>
>> - Andy
>>
>> ________________________________
>> From: Fernando Padilla <[email protected]>
>> To: [email protected]
>> Sent: Friday, July 17, 2009 10:30:52 AM
>> Subject: Re: hbase/zookeeper
>>
>> thank you!
>>
>> I'll pay attention to the CPU load then. Any tips about the memory
>> distribution? This is what I'm expecting, but I'm a newb. :)
>>
>>   DataNode - 1.5G
>>   TaskTracker - .5G
>>   Zookeeper - .5G
>>   RegionServer - 2G
>>   M/R - 2G
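The 2N+1 ensemble sizing works because ZooKeeper needs a strict majority of peers alive to serve requests. A quick sketch of the arithmetic (plain shell, nothing HBase-specific assumed):

```shell
# A ZooKeeper ensemble of n peers stays available while a strict
# majority (floor(n/2) + 1) of peers survives.
for n in 3 5 7 9; do
  majority=$(( n / 2 + 1 ))
  tolerated=$(( n - majority ))
  echo "ensemble=$n majority=$majority tolerated_failures=$tolerated"
done
# ensemble=3 tolerates 1 failure, 5 tolerates 2, 7 tolerates 3, 9 tolerates 4
```

This also explains the diminishing returns Ryan mentions: going from 5 to 7 peers buys only one more tolerated failure, while adding two more peers that every write must be acknowledged by.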
