HDFS blocks are just units of distribution; they don't affect how much data is read. In other words, reading one HBase block doesn't require reading the whole 128 MB HDFS block. So if a read (a Get, for example) comes in to a RegionServer with a cold cache, the RegionServer will read a few index blocks (about 3) and one data block. All of these blocks are put into the block cache, so subsequent reads that touch the same blocks won't go back to disk. If the row isn't present, it's also possible to use the bloom filters to tell that it's absent, so the data block wouldn't even need to be read.
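To make the second question concrete, here is a rough toy model in Python, not HBase code. It assumes every HBase block is exactly 64 KB and holds 100 rows (real blocks vary in size), and the names build_index and locate are made up for illustration. The idea it sketches: the block index maps a row key to a byte offset in the file, and the HDFS block holding that offset is just the offset divided by the HDFS block size, so only the small HBase block at that offset is read.

```python
import bisect

HDFS_BLOCK_SIZE = 128 * 1024 * 1024   # 128 MB, as in the question
HBASE_BLOCK_SIZE = 64 * 1024          # 64 KB, as in the question


def build_index(first_keys):
    """Toy block index: (first row key of block, byte offset of block in the file).

    Assumption: blocks are fixed-size and laid out back to back.
    """
    return [(key, i * HBASE_BLOCK_SIZE) for i, key in enumerate(sorted(first_keys))]


def locate(index, row_key):
    """Return (file offset of the HBase block, HDFS block number) for row_key.

    Returns None if row_key sorts before the first key in the index.
    """
    keys = [key for key, _ in index]
    # Last block whose first key is <= row_key.
    pos = bisect.bisect_right(keys, row_key) - 1
    if pos < 0:
        return None
    offset = index[pos][1]
    return offset, offset // HDFS_BLOCK_SIZE


# A 1280 MB file split into 64 KB blocks, 100 rows per block (hypothetical data).
num_blocks = (1280 * 1024 * 1024) // HBASE_BLOCK_SIZE   # 20480 blocks
index = build_index(f"row-{i * 100:07d}" for i in range(num_blocks))

offset, hdfs_block = locate(index, "row-0250042")
print(offset, hdfs_block)   # 163840000 1
```

In this model the Get touches one 64 KB block at offset 163840000, which happens to sit in the second HDFS block (number 1, counting from 0); the other nine HDFS blocks are never read.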
On Wed, May 28, 2014 at 3:49 PM, 박정현 <[email protected]> wrote:

> Dear All.
>
> I am Junghyun Park in Korea
> I am studying Hbase which is very interesting
> I have two questions about dealing with a block
>
> hadoop block size : 128M
> hbase block size : 64k
>
> *question1 :*
> If USER_TABLE on hbase is 128M size, Hadoop stores one block
> When hbase get one row using rowkey (get 'USER_TABLE','row-001') Maybe, hadoop get one block(128M)
> Do we get 128M size block for reading just one row ??
>
> *question2 :*
> If USER_TABLE on hbase is 1280M size, Hadoop stores 10 blocks
> When hbase get one row using rowkey (get 'USER_TABLE','row-001')
> Maybe, hadoop get one block(128M) or one more blocks (I think one block)
> What is how to find a hadoop block included 'row-001' ??
>
> Thank you for your time
> yours faithfully
