[
https://issues.apache.org/jira/browse/HBASE-21014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16573153#comment-16573153
]
Hari Sekhon edited comment on HBASE-21014 at 8/8/18 1:01 PM:
-------------------------------------------------------------
I thought that was the whole point of the FavoredNodeLoadBalancer: the HBase
Balancer writes HDFS hints based on what it knows about region locations, so
that the HDFS Balancer can read those preferred-location hints and leave the
blocks in place, preserving data locality?
Normally I would just balance and then major compact, but there are two issues
with running major compaction:
# performance impact - this cluster is production and heavily loaded
# disk space - this cluster is already around 70-80% full, which combined with
HBase rolling snapshots covering 4 days means that running more than the one
scheduled major compaction a week would cause space exhaustion and an outage,
as the prior blocks are not removed. This very nearly happened the first time
I ran major compaction on this cluster, but I realised what was going on and
took quick action to avoid an outage. Annoyingly, there is not yet a major
compaction cancel command in this version of HBase, so it couldn't be stopped
once started and it ran for several hours.
> Improve Stochastic Balancer to write HDFS favoured node hints for region
> primary blocks to avoid destroying data locality if needing to use HDFS
> Balancer
> ---------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: HBASE-21014
> URL: https://issues.apache.org/jira/browse/HBASE-21014
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Affects Versions: 1.1.2
> Reporter: Hari Sekhon
> Priority: Major
>
> Improve Stochastic Balancer to include the HDFS region location hints to
> avoid HDFS Balancer destroying data locality.
> Right now, according to a mix of docs, Jiras and mailing list info, it
> appears that one must change
> {code:java}
> hbase.master.loadbalancer.class{code}
> to org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer, as it looks
> like this functionality exists only in FavoredNodeLoadBalancer and not in
> the standard Stochastic Balancer.
> [http://hbase.apache.org/book.html#_hbase_and_hdfs]
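> A minimal sketch of that change as an hbase-site.xml fragment, assuming the
> class name quoted above matches the HBase version in use (the package may
> differ between releases, so check it against the installed version first):
> {code:xml}
> <!-- Assumption: class name taken from the description above; verify it for your HBase version -->
> <property>
>   <name>hbase.master.loadbalancer.class</name>
>   <value>org.apache.hadoop.hbase.favored.FavoredNodeLoadBalancer</value>
> </property>
> {code}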
> This is not ideal because we'd still like to use all the heuristics and work
> that has gone into the Stochastic Balancer, which I believe is currently the
> best and most mature HBase balancer.
> See also the linked Jiras and this discussion:
> [http://apache-hbase.679495.n3.nabble.com/HDFS-Balancer-td4086607.html]