This issue started to rear its head when companies began adopting Hadoop. 

In terms of managing it… pre-CM/Ambari, you had to manage your own classes of 
nodes and sets of configuration files. 

Ambari is supposed to be able to handle multiple configurations by now. (If 
not… then they are all a bunch of slackers because they’ve had a year to fix 
it!!! :-P ) 

Does HBase look at the RS as if it were a container and then manage the 
workload / workflow based on what that specific container can do? 
Probably not, and there are a couple of ways of looking at this… 

1) HBase is outside of YARN.  (Forget Slider, or whatever they are calling 
Hoya these days.) 
You set aside a fixed amount of resources for HBase and then leave the rest 
to YARN. 

This means that regardless of changes in the underlying architecture, HBase 
should get the same performance, or roughly the same performance. 
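As a sketch of what that carve-out can look like: size the NodeManager's budget below the machine's physical RAM and cores, leaving the remainder for the RegionServer heap, the DataNode, and the OS. The numbers below are purely hypothetical, for an imagined 64 GB / 16-core node:

```xml
<!-- yarn-site.xml (hypothetical 64 GB, 16-core node):
     hand ~44 GB and 12 vcores to YARN containers, leaving
     ~20 GB for the RegionServer heap, DataNode, and OS. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>45056</value>
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>12</value>
</property>
```

On an older, smaller node you shrink those two values; the HBase-side settings can stay the same cluster-wide.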

2) Retiring hardware.  Moore's law == 18 months per generation.  So two 
generations is 3 years, which tends to be the limit on warranties. 
Assuming you have managers who want to squeeze in a third generation, 
that's 4.5 years, at which point your kit should be put out to pasture and 
replaced. 

This doesn't really change much, because once out-of-warranty hardware dies, 
you're pretty much stuck and need to replace it anyway. 

The point is that you should be able to keep 1-2 generations of hardware 
configs working in the same cluster. 

3) Upgrades. 
You may have limits on the CPU, but you should be able to upgrade your memory, 
NICs, drives, etc. … so that you can extend the life of the older hardware 
to reach that 4.5-year cycle. 
This would/should be cheaper than a complete replacement. 

So if you have multiple hardware configurations, tune for HBase and let YARN 
worry about the size of the containers for the other jobs (M/R) to run.  


Think of it this way… I have different sized pizza boxes. If my pizza is cut 
pretty much the same size and that size fits in all of the boxes, I'm ok. 
If I want a larger pizza but it won't fit into all of the boxes… then I can 
always remove those boxes and not use them. 

Your pizza is homogeneous… your box size is not. 

Does that make sense? 
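To put a rough number on the pizza analogy, there's a common rule of thumb that ties the number of actively written regions a RegionServer can carry to its heap: the global memstore budget divided by the per-region flush size. A minimal sketch, where the default values are assumptions (your flush size, memstore fraction, and column-family count may differ):

```python
def max_regions_per_server(heap_gb, memstore_fraction=0.4,
                           flush_size_mb=128, column_families=1):
    """Rule-of-thumb ceiling on actively written regions per
    RegionServer: global memstore budget / flush size per region.
    Defaults are assumed, not read from any actual config."""
    memstore_budget_mb = heap_gb * 1024 * memstore_fraction
    return int(memstore_budget_mb / (flush_size_mb * column_families))

# A 16 GB-heap node can host roughly 51 active regions,
# while an older 8 GB-heap node tops out around 25.
print(max_regions_per_server(16))  # -> 51
print(max_regions_per_server(8))   # -> 25
```

Which is the point: the pizza slices (regions) stay one size, and each box (node) simply holds however many fit.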


> On Jun 17, 2015, at 5:27 PM, rahul malviya <malviyarahul2...@gmail.com> wrote:
> 
> The heterogenity factor of my cluster is increasing every time we upgrade
> and its really hard to keep the same hardware config at every node.
> Handling this at configuration level will solve my problem.
> 
> Is this problem not faced by anyone else ?
> 
> Rahul
> 
> On Wed, Jun 17, 2015 at 5:22 PM, anil gupta <anilgupt...@gmail.com> wrote:
> 
>> Hi Rahul,
>> 
>> I dont think, there is anything like that.
>> But, you can effectively do that by setting Region size. However, if
>> hardware configuration varies across the cluster, then this property would
>> not be helpful because AFAIK, region size can be set on table basis
>> only(not on node basis). It would be best to avoid having diff in hardware
>> in cluster machines.
>> 
>> Thanks,
>> Anil Gupta
>> 
>> On Wed, Jun 17, 2015 at 5:12 PM, rahul malviya <malviyarahul2...@gmail.com
>>> 
>> wrote:
>> 
>>> Hi,
>>> 
>>> Is it possible to configure HBase to have only fix number of regions per
>>> node per table in hbase. For example node1 serves 2 regions, node2
>> serves 3
>>> regions etc for any table created ?
>>> 
>>> Thanks,
>>> Rahul
>>> 
>> 
>> 
>> 
>> --
>> Thanks & Regards,
>> Anil Gupta
>> 

The opinions expressed here are mine, while they may reflect a cognitive 
thought, that is purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com




