[ https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988269#action_12988269 ]
Jonathan Gray commented on HBASE-3373:
--------------------------------------
Both of your solutions are rather specialized, and I'm not sure they are
generally applicable. I would much prefer spending effort on improving our
current load balancer, and it seems to me that it would be possible to
implement similar behaviors in a more generalized way.
Also, the addition of an HBaseAdmin region move API makes it so you don't need
to muck with HBase server code to do specialized balancing logic. With the
current APIs, it's possible to basically push the balancer out into your own
client.
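
As a rough illustration of what such a client-side balancer might compute for
a single table, here is a minimal sketch in plain Python with no HBase
dependency. The function name `plan_table_moves` and the region and server
names are hypothetical; the sketch only produces a move plan, and each move
would still have to be issued through the HBaseAdmin region move call.

```python
from collections import defaultdict
from math import ceil


def plan_table_moves(assignments, servers):
    """Return (region, source, destination) moves that even out one table.

    assignments: dict of region name -> server currently hosting it
    servers: list of live region servers (assumed to cover every
             server named in assignments)
    """
    by_server = defaultdict(list)
    for region, server in sorted(assignments.items()):
        by_server[server].append(region)
    for server in servers:
        by_server.setdefault(server, [])

    # No server should hold more than ceil(regions / servers) of this table.
    ceiling = ceil(len(assignments) / len(servers))

    moves = []
    for source in sorted(servers):
        while len(by_server[source]) > ceiling:
            # Hand the surplus region to the server holding the fewest.
            dest = min(sorted(servers), key=lambda s: len(by_server[s]))
            region = by_server[source].pop()
            by_server[dest].append(region)
            moves.append((region, source, dest))
    return moves
```

With ten regions of one table, eight of them on a single server out of three,
the plan moves four regions off the crowded server and leaves a 4/3/3 split.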
@Matt, I don't think I really understand how you would upgrade our load
balancer with consistent hashing. Could you explain?
The fact that split regions open back up on the same server is actually an
optimization in many cases. It reduces the amount of time the regions are
offline, and when they come back online and do a compaction to drop references,
all the files are more likely to be on the local DataNode rather than a remote
one. In some cases, like time-series data, you may want the splits to move to
different servers. I could imagine some configurable logic in there to ensure
the bottom half goes to a different server (or maybe the top half would
actually be more efficient to move away, since most of the time you'll write
more to the bottom half and thus want the data locality / quick turnaround
there).
There's likely going to be a bit of split rework in 0.92 to make it more like
the ZK-based regions-in-transition.
As for binding regions to servers across cluster restarts, this is already
implemented and on by default in 0.90.
Consistent hashing also requires a fixed keyspace (right?) and that's a
mismatch for HBase's flexibility in this regard.
Do you have any code for this client-side consistent hashing balancer? I'm
confused about how that could be implemented without knowing a lot about your
data, the regions, the servers available, etc.
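
For reference, a generic consistent-hash ring can be sketched as below. This
is not the balancer proposed in this issue, only an illustration of the
general technique under stated assumptions: region names (not row keys) are
hashed onto a fixed 64-bit ring, each server gets a number of virtual points,
and the names `HashRing`, `server_for`, and `replicas` are hypothetical.

```python
import hashlib
from bisect import bisect_right


def _point(key):
    # Map any string to a point on a fixed 2**64 ring.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    def __init__(self, servers, replicas=100):
        # Each server gets `replicas` virtual points to smooth the spread.
        self._points = sorted(
            (_point(f"{server}#{i}"), server)
            for server in servers
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._points]

    def server_for(self, region_name):
        # Walk clockwise to the first virtual point past the region's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect_right(self._keys, _point(region_name)) % len(self._points)
        return self._points[idx][1]
```

In this form the only inputs are the region name and the list of live servers.
The fixed keyspace is the hash output rather than the table's row keys, and
removing a server reassigns only the regions that were mapped to it.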
> Allow regions of specific table to be load-balanced
> ---------------------------------------------------
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 0.20.6
> Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> In our experience, a cluster can be well balanced overall and yet one table's
> regions may be badly concentrated on a few region servers.
> For example, one table has 839 regions (380 regions at the time of table
> creation), of which 202 are on one server.
> It would be desirable for the load balancer to distribute the regions of
> specified tables evenly across the cluster. Each such table has a number of
> regions many times the cluster size.