Each cell in HBase has a key, which is a tuple consisting of row-key, column-family:column-member, and timestamp.
Large tables are broken into row ranges called regions. All the members of a single column family are stored together in a file. Thus the row key is used to find the region, and the column family is used to find the file. Each file has a sparse index composed of the row/column/timestamp keys, so finding a particular cell involves a binary search of the index (which is kept in memory).

---
Jim Kellerman, Senior Engineer; Powerset

> -----Original Message-----
> From: Bin YANG [mailto:[EMAIL PROTECTED]
> Sent: Thursday, March 06, 2008 12:40 AM
> To: [EMAIL PROTECTED]
> Subject: Does HBase have a index?
>
> Dear colleagues,
>
> I have a question about HBase's index implementation.
>
> How does HBase find the data for a given row key? Does it use
> an index like a database, or a hash function?
> I suppose that a hash function which hashes the row key to a physical
> address would be more efficient.
>
> As we know, a big table in HBase is stored as several small
> tables, each of which stores the attributes in one column family.
> So each row may be stored in several small tables.
> Does a hash function hash the row key to many physical addresses,
> each physical address corresponding to a small table which
> contains the row key?
>
> Does anybody have an idea of how to create an index on another attribute?
>
> Best,
> Bin YANG
> --
> Bin YANG
> Department of Computer Science and Engineering
> Fudan University
> Shanghai, P. R. China
> EMail: [EMAIL PROTECTED]
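
To make the lookup path described in the reply concrete, here is a rough Python sketch. It is not HBase code. The block size, the function names (`build_store_file`, `find_cell`, `find_region`), and the sample data are all made up for illustration, and keys are compared with plain tuple ordering, which may differ from the comparator HBase actually uses. It only shows the idea: pick the region by row range, then binary search a sparse in-memory index that holds the first key of each block, then scan that one block.

```python
from bisect import bisect_right

# Hypothetical layout: a store file is a list of blocks, and each block is a
# sorted list of (key, value) pairs, where key = (row, column, timestamp).
BLOCK_SIZE = 2  # tiny on purpose, so the index is visibly sparse


def build_store_file(cells):
    """Sort cells by key and split them into fixed-size blocks.

    Returns (blocks, index). The index holds only the first key of each
    block, which is what makes it sparse.
    """
    ordered = sorted(cells.items())
    blocks = [ordered[i:i + BLOCK_SIZE] for i in range(0, len(ordered), BLOCK_SIZE)]
    index = [block[0][0] for block in blocks]
    return blocks, index


def find_cell(blocks, index, key):
    """Binary search the in-memory index, then scan the one candidate block."""
    pos = bisect_right(index, key) - 1  # last block whose first key <= key
    if pos < 0:
        return None  # key sorts before everything in the file
    for cell_key, value in blocks[pos]:
        if cell_key == key:
            return value
    return None


def find_region(region_start_keys, row):
    """Return the position of the region whose row range contains `row`.

    Assumes the start keys are sorted and the first one is the empty string.
    """
    return bisect_right(region_start_keys, row) - 1


# One store file per (region, column family); all data here is invented.
cells = {
    ("row1", "info:name", 3): "a",
    ("row1", "info:name", 5): "b",
    ("row2", "info:name", 1): "c",
    ("row3", "info:name", 2): "d",
    ("row4", "info:name", 7): "e",
}
blocks, index = build_store_file(cells)
region = find_region(["", "row3"], "row2")  # region 0 covers rows below "row3"
value = find_cell(blocks, index, ("row2", "info:name", 1))
```

Because the index keeps one entry per block rather than one per cell, it stays small enough to hold in memory, and no hash function is involved at any step.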
