Fixing badly distributed table manually.

David Koch Tue, 04 Sep 2012 07:56:28 -0700

Hello,

I have a few questions about balancing a table's data in HBase.


a) What is the easiest way to get an overview of how a table's regions are
distributed across the region servers of a cluster? I guess I could scan
.META., but I haven't figured out how to use filters from the HBase shell.
b) What constitutes a "badly distributed" table, and how can I re-balance
it manually?
c) Is b) needed at all? I know that HBase does its balancing automatically
behind the scenes.

As for a), I tried running this script:

https://github.com/Mendeley/hbase-scripts/blob/master/list_regions.rb

like so:

hbase org.jruby.Main ./list_regions.rb <_my_table>

but I get:

ArgumentError: wrong number of arguments (1 for 2)
  (root) at ./list_regions.rb:60

If someone more proficient notices an obvious fix, I'd be glad to hear
about it.

Why do I ask? I have the impression that one of the tables on our HBase
cluster is not well distributed. When I run a MapReduce job on this
table, the load average on a single node is very high, whereas all other
nodes are almost idle. It is the only table where this behavior is
observed. Other MapReduce jobs result in slightly elevated load averages
on several machines.
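
To make the "one busy node" impression concrete, here is a minimal sketch
of how the imbalance could be quantified once the region-to-server
assignments are available as plain (region, server) pairs. The sample
data, the node names and the skew() measure are all made up for
illustration; nothing here talks to HBase or reads .META. itself.

```python
from collections import Counter


def regions_per_server(assignments):
    """Count how many regions each region server hosts."""
    return Counter(server for _region, server in assignments)


def skew(counts):
    """Busiest server's region count divided by the mean count.

    1.0 means perfectly even; larger values mean more imbalance.
    """
    mean = sum(counts.values()) / len(counts)
    return max(counts.values()) / mean


# Hypothetical assignments: (region name, hosting server).
assignments = [
    ("my_table,,1", "node1"),
    ("my_table,row100,2", "node1"),
    ("my_table,row200,3", "node1"),
    ("my_table,row300,4", "node2"),
]

counts = regions_per_server(assignments)
for server, n in counts.most_common():
    print(server, n)
print("skew: %.2f" % skew(counts))
```

Region counts are only a rough proxy: a table with a single very large
region would look the same by this measure, yet still send all of its
traffic to one node.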

Thank you,

/David
