If only inserts are performed, you don't need major compaction, but you probably need BLOOMFILTER=ROW, a smaller BLOCKSIZE(?), COMPRESSION set to LZO or Snappy, and probably a smaller region size (hbase.hregion.max.filesize).
It's only my $0.02.

On 26.10.2011 at 22:39, Vladimir Rodionov wrote:
>
> On Wed, Oct 26, 2011 at 12:51 PM, Vladimir Rodionov
> <[email protected]> wrote:
>> We have a reporting tool which runs queries against Oracle DB, collects fact
>> ids and then queries HBase for these facts (one-by-one). This is a single
>> thread, simple Get op.
>>
>> Can you do concurrent gets?
> Yes.
>
>> What's your hardware like? How many disks per machine?
> This is on our customer's premises. I suppose not fewer than 6.
>
>> Is the table major compacted?
> I really doubt that, but I do not have direct access to the grid environment.
>
>> Are you hitting cache at all?
> It's totally random, due to the proposed key design, which favored fast
> inserts. Keys are randomized values, which is why there is no data locality
> in row lookups. The effect of the cache (LruBlockCache?) is negligible
> in this case.
>
>> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage.
>> Approx 55 rows per sec.
>>
> How big is your table?
>
> Not too big yet (50M rows); each row is approx 1-2K, but it's getting bigger
> every day.
>
> St.Ack
>
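
For what it's worth, here is a rough Python sketch of two points from the thread: the throughput arithmetic (1M facts in 5 hours) and what issuing the gets concurrently instead of one-by-one could look like. It is only an illustration under assumptions: fetch_fact is a hypothetical stand-in for a single HBase Get (it does not call any real HBase client), and the worker count of 16 is an arbitrary guess, not a tuned value.

```python
from concurrent.futures import ThreadPoolExecutor


def rows_per_second(rows, hours):
    """Average throughput for `rows` lookups taking `hours` hours."""
    return rows / (hours * 3600.0)


def fetch_fact(fact_id):
    # Hypothetical placeholder for one HBase Get by row key.
    # A real client call would go here; this just echoes the id.
    return {"id": fact_id}


def fetch_all(fact_ids, workers=16):
    """Issue the lookups from a thread pool instead of a single thread.

    Results come back in the same order as `fact_ids`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_fact, fact_ids))


if __name__ == "__main__":
    # 1M facts in 5 hours, as reported in the thread.
    print(round(rows_per_second(1_000_000, 5), 1))
    print(fetch_all([3, 1, 2], workers=2))
```

Since the keys are randomized and the block cache barely helps, each get is mostly waiting on a disk seek, so overlapping many gets is likely to help more than anything on the client's CPU side. How much it helps depends on the number of disks and region servers, which I can't tell from here.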
