On Wed, Oct 26, 2011 at 1:39 PM, Vladimir Rodionov <[email protected]> wrote:
>> Can you do concurrent gets?
>
> Yes,
>

This should make a difference.

>> What's your hardware like? How many disks per machine?
>
> This is on our customer premises. I suppose - not less than 6
>

More disks, more i/o.

>> Is the table major compacted?
>
> Really doubt of that but I do not have direct access to the grid environment
>

It can make a difference. Do these fellas know how to run HBase, or do you have to do it all for them?

>> Are you hitting cache at all?
>
> It's totally random, due to the proposed key design which favored fast
> inserts. Keys are randomized
> values, that is why there is no data locality in row look ups. Effect of the
> cache (LruBlockCache?) is negligible
> in this case.
>

So a different schema would get cache into the mix? You are doing totally random keys just so you can distribute inserts? This is time series?

>> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage.
>> Approx 55 rows per sec
>>
>
> How big is your table?
>
> Not too big yet (50M rows) each row is approx 1-2K but it's getting bigger every day.
>

It's going to keep growing without bound?

St.Ack
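
For reference, the figures quoted above can be sanity-checked with a bit of arithmetic. This is only a back-of-envelope sketch in Python: the function names are made up for illustration, "1-2K" is assumed to mean 1,000 to 2,000 bytes per row, the size estimate ignores HBase's per-cell overhead, replication and compression, and the per-get latency only holds if the gets were issued one at a time.

```python
# Back-of-envelope check of the numbers quoted in the thread.
# All names here are illustrative; nothing below talks to HBase.

ROWS_FETCHED = 1_000_000    # "1M facts"
ELAPSED_HOURS = 5           # "5 hours"
TABLE_ROWS = 50_000_000     # "50M rows"


def rows_per_second(rows, hours):
    """Average retrieval rate over the whole run."""
    return rows / (hours * 3600)


def serial_latency_ms(rows, hours):
    """Average time per get, assuming gets were issued one at a time."""
    return 1000.0 / rows_per_second(rows, hours)


def table_size_gb(rows, bytes_per_row):
    """Raw row data only; assumes 1 GB = 1e9 bytes."""
    return rows * bytes_per_row / 1e9


rate = rows_per_second(ROWS_FETCHED, ELAPSED_HOURS)
latency = serial_latency_ms(ROWS_FETCHED, ELAPSED_HOURS)
low = table_size_gb(TABLE_ROWS, 1_000)   # assumed 1K = 1,000 bytes
high = table_size_gb(TABLE_ROWS, 2_000)  # assumed 2K = 2,000 bytes

print(f"{rate:.1f} rows/sec")                       # 55.6 rows/sec
print(f"{latency:.0f} ms per get if serial")        # 18 ms per get if serial
print(f"{low:.0f}-{high:.0f} GB of raw row data")   # 50-100 GB of raw row data
```

Roughly 18 ms per lookup is in the range of a single uncached disk read, which fits the report that the block cache has almost no effect with randomized keys.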
