Re: Slow Get Performance (or how many disk I/O does it take for one non-cached read?)

Ted Yu Fri, 31 Jan 2014 21:38:44 -0800

I realized that after hitting the Send button :-)

And 0.94.17 is around the corner, right?



On Fri, Jan 31, 2014 at 9:27 PM, lars hofhansl <[email protected]> wrote:

> 0.94.16 is out already :)
>
>
>
> ----- Original Message -----
> From: Ted Yu <[email protected]>
> To: "[email protected]" <[email protected]>
> Cc:
> Sent: Friday, January 31, 2014 8:28 PM
> Subject: Re: Slow Get Performance (or how many disk I/O does it take for
> one non-cached read?)
>
> For #4,
> bq. has this shortcut enabled by default
>
> Inline checksum is different from short circuit read. Inline checksum is
> enabled by default in 0.96 and later releases - see HBASE-8322
>
> Meanwhile, you can consider upgrading to 0.94.15 - there have been quite
> a few improvements since 0.94.6
>
> Cheers
>
>
>
> On Fri, Jan 31, 2014 at 6:38 PM, Jan Schellenberger <[email protected]> wrote:
>
> > Thank you.  I will have to test these things one at a time.
> >
> > I re-enabled compression (SNAPPY for now) and changed the block encoding
> > to FAST_DIFF.
> >
> > #1 I will try GZ compression.
> > #2 The block cache size is already at 0.4. I will try to increase it a
> > bit more but I will never get the whole set into memory.
> > I will disable the bloom filter.
> >
> > #4 I will investigate this.  I thought I read somewhere that Cloudera 4.3
> > has this shortcut enabled by default but I will try to verify.
> >
> > #3 I'm not sure I understand this suggestion - are you suggesting custom
> > region splitting?  Each region is fully compacted so there is only one
> > HFile.  The queries I do are: "get me the most recent versions, up to
> > 200".  However I need to store more versions, because I may ask "get me
> > the most recent versions, up to 200, that I would have seen yesterday".
> >
> >
> > #5 HDFS short circuit is already enabled by default.
> > #6 Yes, SSD would clearly be better.
> >
> > #7 The average result of the get is fairly small, no more than 1 kB I'd
> > say.  We do hit each key with roughly the same probability.
> >
> >
> >
> > I'm concerned about the block cache... It sounds like the wrong blocks
> > are being cached.  I thought there was a preference to cache index and
> > bloom blocks.
> >
> > I'm currently running 60 queries/second on one node and it's reading
> > blockCacheHitRatio=29 and blockCacheHitCachingRatio=65% (not sure what
> > the difference is).
> >
> > I also see rootIndexSize=122k, totalStaticIndexSize=88MB and
> > totalStaticBloomSize=80MB (I will disable bloom filters in the next run).
> > hdfslocality=100%
> >
>
>
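
The read pattern Jan describes under #3 ("the most recent versions, up to 200" versus "up to 200 that I would have seen yesterday") can be illustrated with a small model. This is only a sketch in plain Python, not HBase client code: the function name `recent_versions`, the integer timestamps, and the inclusive `as_of` cutoff are all assumptions made for illustration.

```python
from bisect import bisect_right

def recent_versions(timestamps, as_of, limit=200):
    """Return up to `limit` of the newest timestamps that are <= as_of.

    `timestamps` must be sorted in ascending order (hypothetical model of
    the versions stored for one cell; not an HBase API).
    """
    # Index just past the last version that was visible at `as_of`.
    end = bisect_right(timestamps, as_of)
    # Keep at most `limit` versions, counting back from that point.
    start = max(0, end - limit)
    # Newest first, which is the order a versioned read is usually shown in.
    return timestamps[start:end][::-1]

# Hypothetical cell with 1000 versions written at timestamps 0..999.
versions = list(range(1000))

latest = recent_versions(versions, as_of=999)     # "most recent, up to 200"
yesterday = recent_versions(versions, as_of=499)  # "as I would have seen it then"
```

In this model the two queries return disjoint sets of versions, which is why more than 200 versions have to be kept per cell: the older query needs versions that the current top 200 no longer contains. In the HBase client this roughly corresponds to a Get that combines a maximum number of versions with a time range.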
