Tim,

In the future, I would encourage you to ask questions like this on the hbase-user mailing list (to sign up, see http://hadoop.apache.org/hbase/mailing_lists.html).
I have been out of town since Tuesday, April 28, and am just starting to dig through my email. Had you asked on the list, you would have received a more timely response, as the developers and most of the users follow the list and you would not have had to depend on my being around.

However, to answer your question:

1) Row keys in the root region are a combination of the META table name and the first row key in the meta region. The client manufactures a row key composed of the meta table name, the user table name, and the row being requested. It then finds the closest root entry at or before that key, which identifies the meta region that will contain the entry for the user region. The client then contacts the region server serving that meta region so it can find which region server is hosting the desired user region.

2) The row key in the META region(s) is the name of the table combined with the start key for each region. Once the client has located the correct meta region (see above), it manufactures a key based on the table name and the row being requested, and finds the closest row at or before that key. Because region names are composed of the table name and the start key for that region, the requested row will be in the region that row describes. The client can then retrieve the address of the region server serving that region (and hence the requested row) and will contact that region server directly to fetch or update the row.

Only a few users have had enough regions to need more than one meta region.

An empty start key means that the region contains keys from the start of the table up to the end key. An empty end key means that the region contains keys from the start key to the end of the table. If both the start key and the end key are empty, a single region contains all the rows for that table, which is why the tables you scanned show no splits.
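The two lookups described above can be sketched roughly as follows. This is only an illustration in Python, not HBase client code: the SortedCatalog class, the locate function, the table names (t1, t2, t3), the meta region labels, and the server names are all made up, and keys are modelled as tuples rather than the encoded region-name strings that the catalog tables actually store.

```python
import bisect

META = ".META."  # meta table name, used as the prefix of root-region keys


class SortedCatalog:
    """Sorted key -> value map with a 'closest key at or before' lookup."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self._keys = sorted(self._entries)

    def floor(self, key):
        # Index of the last stored key that is <= the probe key.
        i = bisect.bisect_right(self._keys, key) - 1
        if i < 0:
            return None
        found = self._keys[i]
        return found, self._entries[found]


# Hypothetical root region: (meta table name, first key in meta region) -> meta region.
ROOT = SortedCatalog({
    (META, "t1", ""): "metaA",
    (META, "t2", "m"): "metaB",
})

# Hypothetical meta regions: (table name, region start key) -> region server.
META_REGIONS = {
    "metaA": SortedCatalog({("t1", ""): "rs3", ("t1", "k"): "rs4", ("t2", ""): "rs5"}),
    "metaB": SortedCatalog({("t2", "m"): "rs6", ("t3", ""): "rs7"}),
}


def locate(table, row):
    """Return the (made-up) region server name hosting `row` of `table`."""
    # Step 1: probe the root region with (meta table name, user table, row).
    root_hit = ROOT.floor((META, table, row))
    if root_hit is None:
        raise LookupError("no meta region covers %r" % ((table, row),))
    meta_region = root_hit[1]

    # Step 2: probe that meta region with (user table, row).
    meta_hit = META_REGIONS[meta_region].floor((table, row))
    if meta_hit is None or meta_hit[0][0] != table:
        raise LookupError("no region found for table %r" % table)
    return meta_hit[1]


if __name__ == "__main__":
    for table, row in [("t1", "a"), ("t1", "z"), ("t2", "b"), ("t2", "q"), ("t3", "x")]:
        print(table, row, "->", locate(table, row))
```

In this sketch, table t3 has a single region whose start key is empty (and, implicitly, whose end key is empty too), so every row of t3 resolves to the same server. Table t2 is split at "m", and its two regions happen to be described in different meta regions, so rows on either side of "m" are resolved through different root entries.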
I would encourage you to read the HBase architecture document at:
http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

I would also encourage you to read the Bigtable paper at:
http://labs.google.com/papers/bigtable.html

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)

> -----Original Message-----
> From: Timothy Nolen [mailto:[email protected]] On Behalf Of Tim Nolen
> Sent: Thursday, April 30, 2009 6:46 PM
> To: [email protected]
> Subject: HBase Architecture question
>
> Mr. Kellerman,
>
> I was wondering if you could help me understand the ROOT -> META ->
> HRegionServer search path. I understand that the client first contacts
> the HBaseMaster to find the location of the ROOT region. It then scans
> the ROOT region for the location of the META region containing the range
> of data it is requesting. But since META is split among regions, how
> does it know which HRegionServer holds the data it is requesting?
>
> I've done scans of the -ROOT- and .META. tables, but it hasn't helped me
> much because none of the tables that are in the system seem to have
> splits--their STARTKEY and ENDKEY are both ''.
>
> Thanks,
> Tim Nolen
