Hi Geoff,

Since HBase balances regions across the whole cluster rather than per table, it can happen that all the regions of one table end up on the same region server. The likely cause is the way HBase splits tables: when a region exceeds the configured maximum size it is split in two, and both resulting regions stay on the same region server.

In the background a load balancer moves regions between region servers whenever the cluster becomes unbalanced, but the region it moves does not necessarily belong to the table you are currently creating. In my experience the regions moved for balancing usually belong to other tables.

There is a simple workaround: if you want balancing at the table level, disable all the other tables while you generate the new table.
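To illustrate the effect, here is a small toy model in Python. It is only a sketch and not HBase's actual balancer: the `balance` and `table_spread` functions, the server names and the "oldest region moves first" rule are all assumptions made for the example. It also assumes that regions of disabled tables are offline and therefore invisible to the balancer.

```python
def balance(cluster):
    """Toy cluster-wide balancer: even out region counts per server.

    cluster maps a server name to a list of (table, region_id) tuples.
    Regions move from the fullest server to the emptiest one until the
    counts differ by at most one. The table a region belongs to is
    never consulted, which is the point of the illustration.
    """
    cluster = {server: list(regions) for server, regions in cluster.items()}
    while True:
        fullest = max(cluster, key=lambda s: len(cluster[s]))
        emptiest = min(cluster, key=lambda s: len(cluster[s]))
        if len(cluster[fullest]) - len(cluster[emptiest]) <= 1:
            return cluster
        # Assumption of this model: the longest-hosted region moves first.
        cluster[emptiest].append(cluster[fullest].pop(0))


def table_spread(cluster, table):
    """Count how many regions of `table` each server hosts."""
    return {
        server: sum(1 for t, _ in regions if t == table)
        for server, regions in cluster.items()
    }


# rs1 already hosts 8 regions of another table; the new table's 4 regions
# were all created there by splits.
cluster = {
    "rs1": [("other", i) for i in range(8)] + [("new", i) for i in range(4)],
    "rs2": [("other", i) for i in range(8, 10)],
    "rs3": [("other", i) for i in range(10, 12)],
}

balanced = balance(cluster)
spread_all_enabled = table_spread(balanced, "new")

# Workaround: with the other tables disabled, only the new table's
# regions are visible to the balancer.
only_new = {
    server: [r for r in regions if r[0] == "new"]
    for server, regions in cluster.items()
}
spread_others_disabled = table_spread(balance(only_new), "new")

print(spread_all_enabled)
print(spread_others_disabled)
```

In this model the first run evens out the total region count per server while every region of the new table stays on rs1. The second run, where only the new table is visible, spreads that table's regions over all three servers.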
Regards,
Christian

------------------8<--------------------------
Siemens AG
Corporate Technology
Corporate Research and Technologies
CT T DE IT3
Otto-Hahn-Ring 6
81739 München, Deutschland
Tel.: +49 (89) 636-42722
Fax: +49 (89) 636-41423
mailto:[email protected]

-----Original Message-----
From: Geoff Hendrey [mailto:[email protected]]
Sent: Tuesday, June 07, 2011 8:34 PM
To: [email protected]
Subject: distribution of regions to servers

I have a table with a hundred or so regions. When I look in the hbase web ui, I see that all the regions are on one server. Of course we have many other tables and lots of data. Some tables seem to distribute their regions amongst many servers. I know there probably isn't a "pat" answer to this but: wouldn't I want a large table with many regions to be distributed across many machines? Just curious to understand the nuance. If I *do* want a uniform distribution of regions to servers, how would I achieve it?

-geoff
