@lars, 

How does the HDFS load balancer impact the load balancing of HBase? 

Of course there are two loads… one is the number of regions managed by a region 
server (that's HBase's load), right? 
And then there's the data distribution of the HBase files themselves, which is 
really managed by the HDFS balancer, right? 
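
As far as I know, the HDFS balancer already works on utilization percentages 
(used/capacity per DataNode), so at the block level it does account for 
heterogeneous disk capacity. You run it with a deviation threshold, e.g.:

```shell
# Rebalance HDFS blocks until each DataNode's utilization is within
# 5 percentage points of the cluster-wide average. Because the metric
# is used/capacity, bigger nodes are allowed to hold proportionally
# more data.
hdfs balancer -threshold 5
```

That doesn't help with HBase's region placement, of course, and it can hurt 
short-circuit reads by moving blocks away from the region server that serves 
them.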

The OP's question is about a heterogeneous cluster, where he would like to see 
a more even distribution of data/free space based on the capacity of the newer 
machines in the cluster. 

This is a storage question, not a memory/cpu core question. 

Or am I missing something? 
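
FWIW, the spaceCost idea Lars mentions below could score a candidate placement 
by how evenly storage utilization (used/capacity) is spread, rather than raw 
bytes. A standalone sketch of just the arithmetic (plain Java, not the actual 
StochasticLoadBalancer CostFunction API; the class and method names here are 
made up):

```java
// Hypothetical sketch of a "spaceCost": score a cluster's data placement
// by how unevenly storage utilization (used/capacity) is spread across
// nodes. Not the real HBase cost-function API, just the math it could use.
public class SpaceCostSketch {

    /** Cost in [0, 1): 0 means every node has the same used/capacity ratio. */
    static double spaceCost(double[] usedTb, double[] capacityTb) {
        int n = usedTb.length;
        double[] util = new double[n];
        double mean = 0;
        for (int i = 0; i < n; i++) {
            util[i] = usedTb[i] / capacityTb[i];
            mean += util[i];
        }
        mean /= n;
        double dev = 0;                 // mean absolute deviation from average
        for (double u : util) {
            dev += Math.abs(u - mean);
        }
        return dev / n;                 // each |u - mean| <= 1, so cost < 1
    }

    public static void main(String[] args) {
        // Ted's scenario: four legacy nodes at 5 TB each, one new node at 8 TB.
        double[] capacity = {5, 5, 5, 5, 8};

        // Round-robin-style placement: ~3 TB on every node, ignoring capacity.
        System.out.printf("uniform bytes:       %.4f%n",
                spaceCost(new double[]{3, 3, 3, 3, 3}, capacity));   // 0.0720

        // Capacity-aware placement: every node at 60% utilization.
        System.out.printf("uniform utilization: %.4f%n",
                spaceCost(new double[]{3, 3, 3, 3, 4.8}, capacity)); // 0.0000
    }
}
```

With something like this folded into the balancer's objective, putting the 
same number of bytes on every node of a mixed cluster scores worse than a 
capacity-proportional placement, which is what the OP is after.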


-Mike

> On Mar 22, 2015, at 10:56 PM, lars hofhansl <[email protected]> wrote:
> 
> Seems that it should not be too hard to add that to the stochastic load 
> balancer.
> We could add a spaceCost or something.
> 
> 
> 
> ----- Original Message -----
> From: Jean-Marc Spaggiari <[email protected]>
> To: user <[email protected]>
> Cc: Development <[email protected]>
> Sent: Thursday, March 19, 2015 12:55 PM
> Subject: Re: introducing nodes w/ more storage
> 
> You can extend the default balancer and assign the regions based on
> that. But at the end, the replicated blocks might still go all over the
> cluster and your "small" nodes are going to be full and will not be able to
> accept any more writes, even for the regions they are supposed to get.
> 
> I'm not sure there is a good solution for what you are looking for :(
> 
> I built my own balancer, but because of differences in the CPUs, not because
> of differences in storage space...
> 
> 
> 2015-03-19 15:50 GMT-04:00 Nick Dimiduk <[email protected]>:
> 
>> Seems more fantasy than fact, I'm afraid. The default load balancer [0]
>> takes store file size into account, but has no concept of capacity. It
>> doesn't know that nodes in a heterogeneous environment have different
>> capacity.
>> 
>> This would be a good feature to add though.
>> 
>> [0]:
>> 
>> https://github.com/apache/hbase/blob/branch-1.0/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
>> 
>> On Tue, Mar 17, 2015 at 7:26 AM, Ted Tuttle <[email protected]> wrote:
>> 
>>> Hello-
>>> 
>>> Sometime back I asked a question about introducing new nodes w/ more
>>> storage than existing nodes.  I was told at the time that HBase will not
>> be
>>> able to utilize the additional storage; I assumed at the time that
>> regions
>>> are allocated to nodes in something like a round-robin fashion and the
>> node
>>> with the least storage sets the limit for how much each node can utilize.
>>> 
>>> My question this time around has to do with nodes w/ unequal numbers of
>>> volumes: Does HBase allocate regions based on nodes or volumes on the
>>> nodes?  I am hoping I can add a node with 8 volumes totaling 8X TB and
>> all
>>> the volumes will be filled.  This even though legacy nodes have 5 volumes
>>> and total storage of 5X TB.
>>> 
>>> Fact or fantasy?
>>> 
>>> Thanks,
>>> Ted
>>> 
>>> 
>> 
> 

The opinions expressed here are mine, while they may reflect a cognitive 
thought, that is purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com




