> If your customers' data fits in one region, then no. For example, if
> you have 1k tables then you would have 1k regions, those will be
> distributed to the region servers but if any of them becomes a
> hotspot, too bad. BTW a region is the basic unit of load distribution
> in HBase.

I see. Usually a whole customer fits within a region. Actually, only two
or three customers don't fit in a single region. But then another
question comes up. Even if I put all the data in a single table, given
that the keys are written in order, and given that several customers can
fit in the same region, I'd have the exact same problem, right? I mean, if
data from customers A to D sits in the same region within the same table,
the result is worse than having 4 different tables, as those can actually
sit on different region servers, right? Is there a way to move a region
manually to another machine?

> Client side? I don't believe so, there's almost nothing kept in memory.
>
> HBase side, 1k tables of 1 region is almost exactly like having 1
> table of 1k regions. A single region server, on good hardware (i7s,
> more than 8GB of RAM, couple of disks), handles a couple of hundred
> regions (although it depends a lot on your usage patterns).

Even if all the htables are opened at the same time?

On Fri, 30-07-2010 at 09:19 -0700, Jean-Daniel Cryans wrote:
> Inline.
>
> J-D
>
> 2010/7/30 Héctor Izquierdo Seliva <[email protected]>:
> > Hi everyone.
> >
> > We are modeling our data based on one table per customer, so it's easy
> > to drop one or add one without having to put it offline while altering
> > the table.
> >
> > I have two questions regarding this model:
> >
> > a) Our customers' data size is very uneven. Some have millions of rows,
> > some just a few thousand. Will this model of one table per customer
> > give good load balancing?
>
> If your customers' data fits in one region, then no. For example, if
> you have 1k tables then you would have 1k regions, those will be
> distributed to the region servers but if any of them becomes a
> hotspot, too bad. BTW a region is the basic unit of load distribution
> in HBase.
>
> >
> > b) Won't it be too heavy on resources to have thousands of htables
> > opened?
>
> Client side?
> I don't believe so, there's almost nothing kept in memory.
>
> HBase side, 1k tables of 1 region is almost exactly like having 1
> table of 1k regions. A single region server, on good hardware (i7s,
> more than 8GB of RAM, couple of disks), handles a couple of hundred
> regions (although it depends a lot on your usage patterns).
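
To make the point about regions being the unit of load distribution concrete, here is a rough toy model in Python. It is only a sketch under assumptions that are not from the thread: regions are handed to servers round-robin (so each server ends up with roughly the same number of regions), and the load figures, the server count, and the function name `assign_regions` are made up for illustration. It is not HBase's real balancer.

```python
# Toy model only: round-robin placement and all numbers are assumptions,
# not HBase's actual region assignment logic.
from collections import defaultdict


def assign_regions(region_loads, num_servers):
    """Spread regions over servers round-robin and return the total load
    per server. Placement looks only at the region count, never at how
    busy an individual region is."""
    per_server = defaultdict(float)
    for i, load in enumerate(region_loads):
        per_server[i % num_servers] += load
    return [per_server[s] for s in range(num_servers)]


# 1k single-region tables and 1 table of 1k regions look identical here:
# either way there are 1000 regions to spread over the servers.
even = assign_regions([1.0] * 1000, num_servers=10)

# Same region count, but one customer's single region takes most requests.
skewed = assign_regions([1.0] * 999 + [5000.0], num_servers=10)

print("even   max/min load:", max(even) / min(even))
print("skewed max/min load:", max(skewed) / min(skewed))
```

In this model every server holds 100 regions in both runs, yet in the second run one server carries far more load than the others, because a region that fits one hot customer cannot be spread any further.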
