Actually, I don't think this is the problem, since HBase versions cells, not rows, if I understand correctly.
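To illustrate what I mean, here is a toy Python model of cell-level versioning. It is only a sketch and not HBase code: the `CellStore` class, the row key format, and the column naming are all made up for the example, and the sizes are scaled down from the test. The assumption is that a version is kept per (row, column) cell, so adding a new order column to an existing customer row creates new cells rather than a new version of the row.

```python
from collections import defaultdict


class CellStore:
    """Toy model (hypothetical): versions are kept per (row, column) cell."""

    def __init__(self):
        # (row, column) -> list of (timestamp, value), one entry per version
        self.cells = defaultdict(list)
        self.clock = 0

    def put(self, row, column, value):
        # Each put appends a version to exactly one cell.
        self.clock += 1
        self.cells[(row, column)].append((self.clock, value))

    def versions(self, row, column):
        # Use .get so that a lookup never creates an empty cell.
        return len(self.cells.get((row, column), []))

    def max_versions(self):
        return max(len(v) for v in self.cells.values())

    def row_count(self):
        return len({row for row, _ in self.cells})


def load_wide(store, customers, orders, fields):
    # One row per customer; every order field becomes its own column.
    for c in range(customers):
        for o in range(orders):
            for f in range(fields):
                store.put(f"cust{c}", f"order{o}:field{f}", "x")


def load_tall(store, customers, orders, fields):
    # One row per order; the customer id is part of the row key.
    for c in range(customers):
        for o in range(orders):
            for f in range(fields):
                store.put(f"cust{c}-order{o}", f"field{f}", "x")


wide, tall = CellStore(), CellStore()
load_wide(wide, customers=3, orders=4, fields=2)
load_tall(tall, customers=3, orders=4, fields=2)

# Same number of cells either way, and no cell has more than one version.
wide_cells, tall_cells = len(wide.cells), len(tall.cells)
wide_max_before = wide.max_versions()
tall_max_before = tall.max_versions()

# Only writing the same (row, column) again adds a version, and only there.
wide.put("cust0", "order0:field0", "y")
rewritten = wide.versions("cust0", "order0:field0")
untouched = wide.versions("cust0", "order1:field0")
```

In this model both layouts end up with the same number of single-version cells, and the only difference is how the cells are grouped into rows. If that matches what HBase does, versioning would not explain the slower wide table inserts.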
On Dec 22, 2010, at 5:03 PM, Bryan Keller wrote:

> Perhaps slow wide table insert performance is related to row versioning? If I
> have a customer row and keep adding order columns one by one, I'm thinking
> that there might be a version kept of the row for every order I add? If I am
> simply inserting a new row for every order, there is no versioning going on.
> Could this be causing performance problems?
>
> On Dec 22, 2010, at 4:16 PM, Bryan Keller wrote:
>
>> It appears to be the same or better, not to derail my original question. The
>> much slower write performance will cause problems for me unless I can
>> resolve that.
>>
>> On Dec 22, 2010, at 3:52 PM, Peter Haidinyak wrote:
>>
>>> Interesting, do you know what the time difference would be on the other
>>> side, doing a lookup/scan?
>>>
>>> Thanks
>>>
>>> -Pete
>>>
>>> -----Original Message-----
>>> From: Bryan Keller [mailto:[email protected]]
>>> Sent: Wednesday, December 22, 2010 3:41 PM
>>> To: [email protected]
>>> Subject: Insert into tall table 50% faster than wide table
>>>
>>> I have been testing a couple of different approaches to storing customer
>>> orders. One is a tall table, where each order is a row. The other is a wide
>>> table where each customer is a row, and orders are columns in the row. I am
>>> finding that inserts into the tall table, i.e. adding rows for every order,
>>> is roughly 50% faster than inserts into the wide table, i.e. adding a row
>>> for a customer and then adding columns for orders.
>>>
>>> In my test, there are 10,000 customers, each customer has 600 orders and
>>> each order has 10 columns. The tall table approach results in 6 mil rows of
>>> 10 columns. The wide table approach results in 10,000 rows of 6,000
>>> columns. I'm using hbase 0.89-20100924 and hadoop 0.20.2. I am adding the
>>> orders using a Put for each order, submitted in batches of 1000 as a list
>>> of Puts.
>>>
>>> Are there techniques to speed up inserts with the wide table approach that
>>> I am perhaps overlooking?
