[ https://issues.apache.org/jira/browse/HBASE-17707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888745#comment-15888745 ]
Hadoop QA commented on HBASE-17707:
-----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} docker {color} | {color:red} 2m 2s {color} | {color:red} Docker failed to build yetus/hbase:8d52d23. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12855200/HBASE-17707-01.patch |
| JIRA Issue | HBASE-17707 |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/5879/console |
| Powered by | Apache Yetus 0.3.0 http://yetus.apache.org |
This message was automatically generated.
> New More Accurate TableSkew Balancer/Generator
> ----------------------------------------------
>
> Key: HBASE-17707
> URL: https://issues.apache.org/jira/browse/HBASE-17707
> Project: HBase
> Issue Type: New Feature
> Components: Balancer
> Affects Versions: 1.2.0
> Environment: CentOS Derivative with a derivative of the 3.18.43
> kernel. HBase on CDH5.9.0 with some patches. HDFS CDH 5.9.0 with no patches.
> Reporter: Kahlil Oppenheimer
> Priority: Minor
> Labels: patch
> Attachments: HBASE-17707-00.patch, HBASE-17707-01.patch
>
>
> This patch includes a new version of the TableSkewCostFunction and a new
> TableSkewCandidateGenerator.
> The new TableSkewCostFunction computes table skew by counting the minimal
> number of region moves required for a given table to perfectly balance the
> table across the cluster (i.e. as if the regions from that table had been
> round-robin-ed across the cluster). This number of moves is computed for each
> table, then normalized to a score between 0 and 1 by dividing by the number of
> moves required in the absolute worst case (i.e. the entire table is stored on
> one server), and stored in an array. The cost function then takes a weighted
> average of the mean and maximum scores across all tables. The weights in
> this average are configurable to allow for certain users to more strongly
> penalize situations where one table is skewed versus where every table is a
> little bit skewed. To spread this value more evenly across the range
> 0-1, we take the square root of the weighted average to get the final value.
> The new TableSkewCandidateGenerator generates region moves/swaps to optimize
> the above TableSkewCostFunction. It first simply tries to move regions until
> each server has the right number of regions, then it swaps regions around
> such that each region swap improves table skew across the cluster.
> We tested the cost function and generator in our production clusters with
> 100s of TBs of data and 100s of tables across dozens of servers and found
> both to be very performant and accurate.
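
For readers following the description above, here is a minimal Java sketch of the cost computation as the text describes it. It is not the code from the attached patch: the class name TableSkewSketch, the method names, and the two weight parameters are all hypothetical. It also assumes that a "perfectly balanced" table has every server holding either floor(R/S) or floor(R/S)+1 regions (R regions, S servers), and that the larger target goes to the servers that already hold the most regions.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical and not taken from the patch. */
class TableSkewSketch {

    /** Minimal number of region moves needed to spread one table evenly over all servers. */
    static int minMovesToBalance(int[] regionsPerServer) {
        int servers = regionsPerServer.length;
        int total = Arrays.stream(regionsPerServer).sum();
        int floor = total / servers;
        int extra = total % servers; // this many servers may hold floor + 1 regions

        int[] sorted = regionsPerServer.clone();
        Arrays.sort(sorted); // ascending, so the most loaded servers come last

        int moves = 0;
        for (int i = 0; i < servers; i++) {
            // Assumption: the 'extra' most loaded servers get the larger target.
            int target = (i >= servers - extra) ? floor + 1 : floor;
            moves += Math.max(0, sorted[i] - target); // regions above target must move
        }
        return moves;
    }

    /** Skew of one table in [0, 1]: moves needed divided by moves in the worst case. */
    static double tableSkew(int[] regionsPerServer) {
        int servers = regionsPerServer.length;
        int total = Arrays.stream(regionsPerServer).sum();
        // Worst case: the whole table sits on one server, which keeps ceil(R/S) regions.
        int worst = total - (total + servers - 1) / servers;
        if (worst == 0) {
            return 0.0; // a table this small cannot be skewed
        }
        return (double) minMovesToBalance(regionsPerServer) / worst;
    }

    /** Square root of the weighted average of the mean and maximum per-table skew. */
    static double cost(int[][] regionsPerServerByTable, double avgWeight, double maxWeight) {
        if (regionsPerServerByTable.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        double max = 0.0;
        for (int[] table : regionsPerServerByTable) {
            double skew = tableSkew(table);
            sum += skew;
            max = Math.max(max, skew);
        }
        double mean = sum / regionsPerServerByTable.length;
        double weighted = (avgWeight * mean + maxWeight * max) / (avgWeight + maxWeight);
        return Math.sqrt(weighted);
    }
}
```

As a usage example under the same assumptions, take four servers and two tables, one stored entirely on a single server ({4, 0, 0, 0}) and one spread evenly ({1, 1, 1, 1}). The per-table scores are 1 and 0. With equal weights, the weighted average of the mean (0.5) and the maximum (1) is 0.75, and the final cost is its square root. Raising maxWeight relative to avgWeight penalizes the single badly skewed table more strongly, which matches the configurability the description mentions.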
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)