2010/12/23 Ted Dunning <[email protected]>

> But the tall table is FASTER than the wide table.
>
Oops. :) Maybe you are putting more data? Are you using compression?
(With prefixed qualifiers you end up with more data, since a UUID can have
a length comparable to an order row.)

> On Wed, Dec 22, 2010 at 11:14 PM, Andrey Stepachev <[email protected]> wrote:
>
> > I think row locks slow things down here. Each row you insert tries to
> > acquire a lock and then release it. The wide table has significantly
> > fewer rows, and far fewer locks are acquired during insert.
> >
> > 2010/12/23 Bryan Keller <[email protected]>
> >
> > > I have been testing a couple of different approaches to storing
> > > customer orders. One is a tall table, where each order is a row. The
> > > other is a wide table, where each customer is a row and orders are
> > > columns in the row. I am finding that inserts into the tall table,
> > > i.e. adding rows for every order, are roughly 50% faster than inserts
> > > into the wide table, i.e. adding a row for a customer and then adding
> > > columns for orders.
> > >
> > > In my test, there are 10,000 customers, each customer has 600 orders,
> > > and each order has 10 columns. The tall table approach results in
> > > 6 million rows of 10 columns. The wide table approach results in
> > > 10,000 rows of 6,000 columns. I'm using HBase 0.89-20100924 and
> > > Hadoop 0.20.2. I am adding the orders using a Put for each order,
> > > submitted in batches of 1000 as a list of Puts.
> > >
> > > Are there techniques to speed up inserts with the wide table approach
> > > that I am perhaps overlooking?
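
For reference, the two layouts in Bryan's test can be sketched without any HBase client code. This is only a model of the key layout and the arithmetic from his message. The key formats (a `customer/order` row key for the tall table, an `order:field` qualifier prefix for the wide table) and all function names are assumptions made for illustration, not what Bryan actually used.

```python
# Plain-Python model of the two schemas; no HBase calls are made.
# Key formats and function names are hypothetical.

CUSTOMERS = 10_000
ORDERS_PER_CUSTOMER = 600
COLUMNS_PER_ORDER = 10
BATCH_SIZE = 1_000  # Puts per batch, as described in the original test


def tall_put(customer_id, order_id, fields):
    """Tall table: one row per order; the row key carries both ids."""
    row_key = f"{customer_id:05d}/{order_id:03d}"
    return row_key, dict(fields)


def wide_put(customer_id, order_id, fields):
    """Wide table: one row per customer; the order id prefixes each qualifier."""
    row_key = f"{customer_id:05d}"
    columns = {f"{order_id:03d}:{name}": value for name, value in fields.items()}
    return row_key, columns


def layout_summary():
    """Row, column, Put, and batch counts for the test described above."""
    puts = CUSTOMERS * ORDERS_PER_CUSTOMER  # one Put per order in both layouts
    return {
        "puts": puts,
        "batches": puts // BATCH_SIZE,
        "tall_rows": puts,
        "tall_columns_per_row": COLUMNS_PER_ORDER,
        "wide_rows": CUSTOMERS,
        "wide_columns_per_row": ORDERS_PER_CUSTOMER * COLUMNS_PER_ORDER,
    }
```

Note that with one Put per order, both layouts issue the same number of Puts and batches; only the row key and qualifier shapes differ.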
