On Wed, Oct 26, 2011 at 12:51 PM, Vladimir Rodionov
<[email protected]> wrote:
> We have a reporting tool which runs queries against an Oracle DB, collects
> fact IDs, and then queries HBase for these facts (one by one). This is a
> single thread doing simple Get operations.
>
> Can you do concurrent gets?
Yes.
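
A minimal sketch of what concurrent gets could look like on the client side, instead of the one-by-one loop. It assumes a hypothetical `fetch_fact(fact_id)` function standing in for a single HBase Get; that name, `fetch_all`, and the pool size of 16 are illustrative and not taken from the reporting tool.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_fact(fact_id):
    # Hypothetical stand-in for one HBase Get by row key.
    # Replace the body with a real client call.
    return {"id": fact_id}


def fetch_all(fact_ids, fetch=fetch_fact, workers=16):
    # Issue the lookups from a pool of threads rather than sequentially.
    # pool.map returns results in the same order as fact_ids.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, fact_ids))
```

With random keys each Get is dominated by a disk seek, so overlapping many of them is where most of the gain would be expected; the right pool size depends on the cluster and would need to be measured.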
> What's your hardware like? How many disks per machine?
This is on our customer's premises. I suppose no fewer than 6.
> Is the table major compacted?
I really doubt it, but I do not have direct access to the grid environment.
> Are you hitting cache at all?
It's totally random, due to the proposed key design, which favored fast
inserts. Keys are randomized values, which is why there is no data locality
in row lookups. The effect of the cache (LruBlockCache?) is negligible in
this case.
> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage.
> Approx 55 rows per sec
>
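
As a quick sanity check of the figure quoted above, here is the arithmetic written out. The function name is illustrative only, and the inputs are the 1M facts and 5 hours from the report.

```python
def rows_per_sec(rows, hours):
    # Average throughput over the whole run.
    return rows / (hours * 3600.0)


rate = rows_per_sec(1_000_000, 5)   # roughly 55.6 rows/sec
latency_ms = 1000.0 / rate          # roughly 18 ms per Get with a single thread
```

About 18 ms per Get is in the range of one or two random disk reads, which fits a single-threaded client with a negligible cache hit rate.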
> How big is your table?
Not too big yet (50M rows); each row is approx. 1-2 KB, but it's growing every day.
St.Ack