Re: How does HBase perform load balancing?

MauMau Sat, 08 May 2010 17:49:22 -0700

Hi, Ryan

From: "Ryan Rawson" <[email protected]>

Here at Stumbleupon we handle 12,000 requests/second, some
regionservers are a bit warmer than others, but it hasn't proven to be
a serious issue.

Thank you for sharing your valuable experience. "12,000 requests/second" is great!

If you want more head room you add servers. HBase will reassign
regions to those regionservers. You now have access to more CPU and
RAM and have a larger and more effective block cache. The data doesn't
get spread around, but you can initiate major compactions on some/all
of the tables which will move data around immediately.  There are no
concerns for growing a cluster in this way - I have done it to double
the size of a cluster and I saw an immediate performance improvement.  I major
compacted a table I was doing a map reduce on and I saw more
performance improvements.  In a live serving system you do NOT want to
be accessing disk most of the time - caching is the name of the game for
reducing latency.  Everyone does this (you think your Google results
are read from disk?) and it's a fairly uniform "law" of doing low
latency services - RAM is king. And when you expand an HBase cluster
you get more effective RAM immediately - no rebalancing required
(unlike DHT-based architectures).
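
As a rough illustration of the behaviour described above, here is a toy Python model. It is not HBase code: the server names, region names, and the count-based balancing rule are all assumptions made up for this sketch. It only shows the general idea that reassigning regions changes which server serves them without moving the stored files, and that a later major compaction is what brings the files to the serving host (assuming each regionserver runs next to an HDFS datanode).

```python
# Toy model only: names and the balancing rule are illustrative assumptions,
# not HBase's actual balancer or compaction code.

def rebalance(assignment, servers):
    """Even out region counts by moving regions from the fullest server to the emptiest."""
    assignment = {s: list(assignment.get(s, [])) for s in servers}
    while True:
        busiest = max(servers, key=lambda s: len(assignment[s]))
        idlest = min(servers, key=lambda s: len(assignment[s]))
        if len(assignment[busiest]) - len(assignment[idlest]) <= 1:
            return assignment
        assignment[idlest].append(assignment[busiest].pop())


def local_fraction(assignment, data_location):
    """Fraction of regions whose files sit on the host that currently serves them."""
    total = sum(len(regions) for regions in assignment.values())
    local = sum(
        1
        for server, regions in assignment.items()
        for region in regions
        if data_location[region] == server
    )
    return local / total


def major_compact(assignment, data_location, regions):
    """Model a major compaction: rewritten files land on the host now serving the region."""
    updated = dict(data_location)
    for server, served in assignment.items():
        for region in served:
            if region in regions:
                updated[region] = server
    return updated


# Two loaded servers, four regions each; files start out local to their server.
before = {"rs1": ["r1", "r2", "r3", "r4"], "rs2": ["r5", "r6", "r7", "r8"]}
data_location = {r: s for s, regions in before.items() for r in regions}

# Add an empty third server: regions are reassigned, files are not touched.
after = rebalance(before, ["rs1", "rs2", "rs3"])
locality_after_reassign = local_fraction(after, data_location)

# A major compaction of every region restores locality in this model.
compacted = major_compact(after, data_location, set(data_location))
locality_after_compaction = local_fraction(after, compacted)
```

In this model the new server starts serving regions as soon as they are reassigned, so its CPU and cache are usable right away; only the locality of the underlying files lags until a compaction rewrites them.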

What I understood from the above is as follows. I'd appreciate it if you could point out where I am wrong.

1. I need to perform a major compaction to unassign regions from the existing loaded region servers and move them to a new region server. I cannot reassign the regions just by doing a minor compaction and letting the non-loaded new server perform the major compaction later. Having the loaded existing server do a heavy major compaction is a concern.
2. "No rebalancing required" means that the blocks of HDFS files for regions need not be moved from one datanode to another.


Thank you.
Maumau
