Think of HBase storage as an index on the row key. If the query has a condition on the key (for example, key = '1234'), the lookup behaves like a quick search tree (with the metadata held in memory) that locates the HFile. After that, it takes one I/O to read the data from that HFile.
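
To make the idea concrete, here is a rough toy model in Python. It is not HBase code and does not use any HBase API; the class ToyStoreFile, its fields, and the sample keys are all made up for illustration. It only shows the general pattern: a small sorted index kept in memory tells you which block to read, so a lookup by key costs one simulated read instead of a scan of the whole file.

```python
import bisect


class ToyStoreFile:
    """Hypothetical stand-in for a sorted, block-structured store file."""

    def __init__(self, blocks):
        # Each block is a list of (key, value) pairs sorted by key,
        # and the blocks themselves are in key order.
        self.blocks = blocks
        # In-memory metadata: the first key of every block.
        self.block_index = [block[0][0] for block in blocks]
        # Counts simulated disk reads.
        self.reads = 0

    def _read_block(self, i):
        # Stands in for a single I/O against the underlying storage.
        self.reads += 1
        return self.blocks[i]

    def get(self, key):
        # Binary search over the in-memory index, no I/O yet.
        i = bisect.bisect_right(self.block_index, key) - 1
        if i < 0:
            return None  # key sorts before the first block
        # One read fetches the only block that can contain the key.
        for k, v in self._read_block(i):
            if k == key:
                return v
        return None


store = ToyStoreFile([
    [("0001", "a"), ("0500", "b")],
    [("1000", "c"), ("1234", "d")],
    [("2000", "e"), ("2999", "f")],
])
value = store.get("1234")
```

In this sketch, get("1234") touches only the second block. A key that sorts before the first block is rejected from the in-memory index alone, without any read.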
'Quick' is a relative term. A relational database with a hash index (not suitable for every schema) may work better.

Demai

On Thu, Jul 2, 2015 at 3:22 AM, James Teng <[email protected]> wrote:
> Hi all,i am wondering whether someone could explain this to me, although i
> have read some materials for hbase, but still have some puzzles on this.
> Hbase store files are stored on hdfs, which lacks random access & writes,
> although hbase has some mechanisms to filter data on store files level, but
> still has to search wanted column data based on the left store files. when
> get to this granularity, how can it quick position the columns, indexing
> through the files which means having to read the whole data block on hdfs,
> and leverage the refined keyvalue format to indexing data? isn't it on low
> performance to read the whole store file?
> thanks!james.
