[ 
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573153#comment-16573153
 ] 

Hari Sekhon edited comment on HBASE-21014 at 8/8/18 1:01 PM:
-------------------------------------------------------------

I thought that was the whole point of the FavoredNodeLoadBalancer: the HBase 
Balancer writes those HDFS hints based on what it knows about region 
locations, so that the HDFS Balancer can read the preferred location hints 
and leave those blocks where they are, preserving data locality?

Normally I would just balance and then major compact, but there are 2 issues 
with running major compaction:
 # performance impact - this cluster is production and heavily loaded
 # disk space - this cluster is already running around 70-80% full. Combined 
with HBase rolling snapshots covering 4 days, that means running more than 
the one scheduled major compaction a week would exhaust space and cause an 
outage, because the prior blocks are not removed. This very nearly happened 
the first time I ran major compaction on this cluster, but I realised what 
was going on and took quick action to avoid an outage. Annoyingly, there is 
not yet a major compaction cancel command in this version of HBase, so it 
couldn't just be stopped once started and it ran for several hours.


> Improve Stochastic Balancer to write HDFS favoured node hints for region 
> primary blocks to avoid destroying data locality if needing to use HDFS 
> Balancer
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21014
>                 URL: https://issues.apache.org/jira/browse/HBASE-21014
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 1.1.2
>            Reporter: Hari Sekhon
>            Priority: Major
>
> Improve Stochastic Balancer to include the HDFS region location hints to 
> avoid HDFS Balancer destroying data locality.
> Right now, according to a mix of docs, Jiras and mailing list info, it 
> appears that one must change
> {code:java}
> hbase.master.loadbalancer.class{code}
> to the org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer as it looks 
> like this functionality is only within FavoredNodeLoadBalancer and not the 
> standard Stochastic Balancer.
> [http://hbase.apache.org/book.html#_hbase_and_hdfs]
> This is not ideal because we'd still like to use all the heuristics and work 
> that has gone into the Stochastic Balancer, which I believe right now is the 
> best and most mature HBase balancer.
> See also the linked Jiras and this discussion:
> [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]
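
For reference, the balancer switch described in the quoted issue would be an 
hbase-site.xml property. The sketch below just renders that property element 
as text. The key and class name are copied from the issue description; the 
class BalancerConfigSketch and the helper toProperty are hypothetical names 
used only for illustration, and the balancer's package may differ between 
HBase versions, so check it against your release before using it.

```java
public class BalancerConfigSketch {
    // Property key quoted in the issue description.
    static final String BALANCER_KEY = "hbase.master.loadbalancer.class";

    // Class name as written in the issue; assumption: verify the package
    // for the HBase version actually deployed.
    static final String FAVORED_BALANCER =
            "org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer";

    // Hypothetical helper: renders one Hadoop-style <property> element.
    static String toProperty(String name, String value) {
        return "<property>\n"
             + "  <name>" + name + "</name>\n"
             + "  <value>" + value + "</value>\n"
             + "</property>";
    }

    public static void main(String[] args) {
        // Prints the fragment that would be placed inside <configuration>
        // in hbase-site.xml on the HBase Master.
        System.out.println(toProperty(BALANCER_KEY, FAVORED_BALANCER));
    }
}
```

This only shows the shape of the setting. It does not address the request in 
the issue, which is to get the same hints out of the Stochastic Balancer 
without switching balancer class.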



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
