Right now I have 4GB of heap per regionserver and, as Stack suggested, I
have set hfile.block.cache.size to 0.5.
While I'm doing the Gets, nothing else is running that would affect
performance. Cells are very small (each contains 1 integer), and the table
has about 20M rows. It spans 4 regionservers, and I have about 64
regions of 256MB each.
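
A rough back-of-envelope sketch of the numbers above, to see how much of the
table the block cache could hold at best. It assumes that the 64 regions are
the total across all 4 regionservers and that every region is at its full
256MB, so the table size here is an upper bound, not a measurement. The
function and variable names are only illustrative.

```python
# Figures taken from the message above.
HEAP_GB = 4.0                # heap per regionserver
BLOCK_CACHE_FRACTION = 0.5   # hfile.block.cache.size
REGIONSERVERS = 4
REGIONS = 64                 # assumed to be the total for the table
REGION_SIZE_MB = 256         # assumed full, so this is an upper bound


def cache_per_server_gb(heap_gb, fraction):
    """Heap given to the block cache on one regionserver."""
    return heap_gb * fraction


def table_upper_bound_gb(regions, region_size_mb):
    """Largest possible table size if every region is full."""
    return regions * region_size_mb / 1024.0


cache_total_gb = cache_per_server_gb(HEAP_GB, BLOCK_CACHE_FRACTION) * REGIONSERVERS
table_max_gb = table_upper_bound_gb(REGIONS, REGION_SIZE_MB)

# Fraction of the (upper-bound) table that could sit in cache at once.
coverage = min(1.0, cache_total_gb / table_max_gb)

print("block cache, all servers: %.1f GB" % cache_total_gb)
print("table size, upper bound:  %.1f GB" % table_max_gb)
print("best-case cache coverage: %.0f%%" % (coverage * 100))
```

If the regions are not actually full, the real coverage is higher than this
estimate.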

I use RAID, but this will change soon; it just takes time (we're moving to
new servers).

I didn't notice any improvement after changing
hfile.block.cache.size. I don't know if this is relevant, but in my test
job I do at most one Get per row (I do a DISTINCT before querying HBase).
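
Why one Get per row matters can be sketched with a simple weighted-average
model: when no row is requested twice, a Get only hits the cache if an earlier
Get already loaded the same block. The latencies below are taken from the
figures quoted further down in this thread (about 5ms from cache, 10-30ms
from disk, with 20ms used as a midpoint). The hit ratios are hypothetical, and
the model assumes a single client issuing Gets one after another.

```python
def expected_get_ms(hit_ratio, cache_ms=5.0, disk_ms=20.0):
    """Average Get latency for a given block cache hit ratio.

    cache_ms and disk_ms are rough figures from the thread, not measurements.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


def gets_per_second(hit_ratio, cache_ms=5.0, disk_ms=20.0):
    """Throughput of one client issuing Gets strictly one at a time."""
    return 1000.0 / expected_get_ms(hit_ratio, cache_ms, disk_ms)


# Hypothetical hit ratios, from a cold cache to a fully warm one.
for ratio in (0.0, 0.25, 0.5, 0.75, 1.0):
    print("hit ratio %.2f: %5.1f ms/Get, %6.1f Gets/s"
          % (ratio, expected_get_ms(ratio), gets_per_second(ratio)))
```

Under this model a larger cache only helps once the hit ratio actually rises,
which fits seeing no change right after a restart.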

Stats from cache reads are here: http://pastebin.com/BmmL09dK
These were taken after restarting the servers, while the first job was running.

Thanks for helping me.


2010/11/3 Stack <[email protected]>

> On Wed, Nov 3, 2010 at 7:15 AM, Wojciech Langiewicz
> <[email protected]> wrote:
> >
> > I'm running latest version from Cloudera
>
>
> Try a later version of the 0.89 series.  See the downloads page on our
> site.   It has perf. improvements.
>
>
> >> Each KV is a distinct Put operation?  Normally people get high throughput
> >> by batching many Puts at once.
> >>
> >
> > Actually, here I'm asking about Get operations, because I don't know how to
> > batch them (by design). But in case of Puts you are right.
> >
>
> There is a batch Get in TRUNK that should be available as 0.90.0RC0 soon.
>
> > I'm rather asking what I can expect from my schema design and hardware by
> > comparing other people's solutions; right now I'm getting 10 times less
> > performance than I initially wanted.
> >
> >
> Well, if going to disk, reading we're talking 10-30ms a hit.  If you
> are reading from cache, you should see 5ms and less.  Try upping
> proportion of your heap given over to block cache; set
> hfile.block.cache.size to 0.4 or 0.5 of heap (Writes should be going
> in pretty fast -- ~5ms or less).
>
> What size your cells?  How many regions in your table?   How much RAM
> have you given over to HBase?   Anything else running on these
> machines?  You doing any wacky RAID'ing on those disks?
>
> Good luck,
> St.Ack
>



-- 
Wojciech Langiewicz
