Thanks Ryan.

Got the point on the 64KB HBase block size. Can you add more on the negative
impact of smaller HDFS block sizes?

Larger HDFS blocks are great for batch ops, but for random reads, wouldn't
making the HDFS block size closer to the HBase block size help, so that any
block cache miss fetches around 64KB rather than 64MB (the dfs default block
size)?
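For context, these are the two settings in question. A sketch of where they might live, assuming the usual hdfs-site.xml / hbase-site.xml placement (values are our cluster's 128MB and the 64KB default mentioned above):

```xml
<!-- hdfs-site.xml: HDFS block size (our cluster uses 128MB) -->
<property>
  <name>dfs.block.size</name>
  <value>134217728</value>
</property>

<!-- hbase-site.xml: HFile minimum block size (defaults to 64KB) -->
<property>
  <name>hfile.min.blocksize.size</name>
  <value>65536</value>
</property>
```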

Best,
Abhi


On Wed, Sep 8, 2010 at 4:42 PM, Ryan Rawson <[email protected]> wrote:

> The HFile block size is the minimum unit of data read from HDFS.
> Thus it should be smaller, but not too small.
>
> There is a definite negative impact from smaller HDFS block sizes on
> performance, so that should be larger.
>
> Therefore one should not try to match them.  If they 'met in the
> middle' no one would be happy :-)
>
> We did tests, and the 64k default block size works well; I wouldn't
> tweak it unless you have specific evidence that your new setting is
> better.
>
> -ryan
>
> On Wed, Sep 8, 2010 at 4:36 PM, Abhijit Pol <[email protected]> wrote:
> > On the HDFS side we have "dfs.block.size" set to 128MB (like our
> > analytic hadoop cluster), and on the HBase side
> > "hfile.min.blocksize.size" is defaulted to 64KB.
> >
> > How do these parameters play a role in HBase reads and writes? Should
> > one try to match them?
> >
> > Thanks,
> > Abhi
> >
>
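(Editor's note on the 64KB knob Ryan mentions: the HFile block size can also be set per column family at table creation time, which is the usual place to tune it if you do have evidence for a different value. A minimal sketch from the HBase shell; the table and family names here are made up for illustration:)

```ruby
create 'mytable', {NAME => 'cf', BLOCKSIZE => '65536'}
```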
