But the tall table is FASTER than the wide table.

On Wed, Dec 22, 2010 at 11:14 PM, Andrey Stepachev <[email protected]> wrote:
> I think row locks slow things down here. Each row you insert tries to
> acquire a lock and then release it. The wide table has significantly
> fewer rows, and far fewer locks acquired during insert.
>
> 2010/12/23 Bryan Keller <[email protected]>
>
> > I have been testing a couple of different approaches to storing
> > customer orders. One is a tall table, where each order is a row. The
> > other is a wide table, where each customer is a row and orders are
> > columns in the row. I am finding that inserts into the tall table,
> > i.e. adding rows for every order, are roughly 50% faster than inserts
> > into the wide table, i.e. adding a row for a customer and then adding
> > columns for orders.
> >
> > In my test, there are 10,000 customers, each customer has 600 orders,
> > and each order has 10 columns. The tall table approach results in
> > 6 million rows of 10 columns. The wide table approach results in
> > 10,000 rows of 6,000 columns. I'm using hbase 0.89-20100924 and
> > hadoop 0.20.2. I am adding the orders using a Put for each order,
> > submitted in batches of 1000 as a list of Puts.
> >
> > Are there techniques to speed up inserts with the wide table approach
> > that I am perhaps overlooking?
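
To make the two layouts in the quoted test concrete, here is a rough Python sketch that only models the write pattern; it does not talk to HBase. The row-key and column-name formats (`cust00001-order002`, `order002:col3`) and the helper names are assumptions made up for illustration, not taken from the original test. It counts how many Puts, distinct rows, cells, and batches each layout produces when one Put is issued per order.

```python
from itertools import islice


def tall_puts(customers, orders_per_customer, cols_per_order):
    """Yield one (row_key, cells) pair per order; each order gets its own row."""
    for c in range(customers):
        for o in range(orders_per_customer):
            row = f"cust{c:05d}-order{o:03d}"  # hypothetical key format
            yield row, {f"col{k}": b"v" for k in range(cols_per_order)}


def wide_puts(customers, orders_per_customer, cols_per_order):
    """Yield one (row_key, cells) pair per order; all orders share the customer row."""
    for c in range(customers):
        row = f"cust{c:05d}"  # hypothetical key format
        for o in range(orders_per_customer):
            yield row, {f"order{o:03d}:col{k}": b"v" for k in range(cols_per_order)}


def batches(puts, size=1000):
    """Group an iterable of puts into lists of at most `size` items."""
    it = iter(puts)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def summarize(puts, batch_size=1000):
    """Count puts, distinct row keys, cells, and batches for a stream of puts."""
    rows, n_puts, n_cells, n_batches = set(), 0, 0, 0
    for batch in batches(puts, batch_size):
        n_batches += 1
        for row, cells in batch:
            rows.add(row)
            n_puts += 1
            n_cells += len(cells)
    return {"puts": n_puts, "rows": len(rows), "cells": n_cells, "batches": n_batches}


if __name__ == "__main__":
    # Scaled-down run (10 customers, 6 orders each) so it finishes instantly.
    print("tall:", summarize(tall_puts(10, 6, 10)))
    print("wide:", summarize(wide_puts(10, 6, 10)))
```

At the sizes in the quoted test (10,000 customers, 600 orders, 10 columns), both layouts come to 6,000,000 Puts, 60,000,000 cells, and 6,000 batches of 1000. Only the number of distinct rows differs: 6,000,000 for the tall table versus 10,000 for the wide one, with each wide row touched by 600 separate Puts.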
