RE: HBase Architecture question

Jim Kellerman (POWERSET) Mon, 04 May 2009 15:13:42 -0700

Tim,

In the future, I would encourage you to ask questions like this on the
hbase-user mailing list (to sign up, see
http://hadoop.apache.org/hbase/mailing_lists.html).


I have been out of town since Tuesday April 28, and am just starting to
dig through my email. Had you asked on the list you would have received
a more timely response, as the developers and most of the users follow
the list and you would not have to depend on me being around.

However, to answer your question:
 
1) Row keys in the root region are a combination of the META table name
   and the first row key in the meta region. The client manufactures a
   row key composed of the meta table name, the user table name,
   and the row being requested. It then finds the closest root row at or
   before that key, which identifies the meta region that will contain
   the entry for the user region. The client then contacts the region
   server serving that meta region to find out which region server is
   hosting the desired user region.

2) The row key in the META region(s) is the name of the table combined with
   the start key for each region. Once the client has located the correct
   meta region (see above), it manufactures a key based on the table name
   and the row being requested, and finds the closest row at or before
   that key. Because region names are composed of the table name and the
   start key for that region, the requested row will be in that region.
   The client can then retrieve the address of the region server
   serving that region (and hence the row requested) and will contact
   that region server directly to fetch or update the requested row.
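
The two lookups above can be sketched roughly as follows. This is only an
illustration of the "closest row at or before the key" idea, not the actual
client code: the region names, server addresses, and the use of Python tuples
in place of HBase's real byte-string row keys are all made up for the example.

```python
import bisect

META_TABLE = ".META."


def floor_entry(rows, probe):
    """Return the (key, value) pair with the greatest key <= probe, or None.

    `rows` must already be sorted by key, as rows in a region are.
    """
    keys = [key for key, _ in rows]
    idx = bisect.bisect_right(keys, probe) - 1
    return rows[idx] if idx >= 0 else None


# Hypothetical root region: (meta table name, first row key of the meta
# region) -> which meta region holds it. The first meta region starts at
# the empty key.
ROOT_ROWS = [
    ((META_TABLE, "", ""), "meta-region-A"),
    ((META_TABLE, "users", "p"), "meta-region-B"),
]

# Hypothetical meta regions: (user table name, region start key) -> server.
META_REGIONS = {
    "meta-region-A": [
        (("users", ""), "rs1.example.com"),
        (("users", "g"), "rs2.example.com"),
    ],
    "meta-region-B": [
        (("users", "p"), "rs3.example.com"),
        (("webtable", ""), "rs1.example.com"),
    ],
}


def locate(table, row):
    """Return (meta region, user region start key, region server) for a row."""
    # Step 1: probe the root region with (meta table, user table, row).
    root_hit = floor_entry(ROOT_ROWS, (META_TABLE, table, row))
    if root_hit is None:
        raise KeyError(f"no meta region covers {table!r}/{row!r}")
    _, meta_region = root_hit

    # Step 2: probe that meta region with (user table, row).
    meta_hit = floor_entry(META_REGIONS[meta_region], (table, row))
    if meta_hit is None or meta_hit[0][0] != table:
        raise KeyError(f"no region found for table {table!r}")
    (_, start_key), server = meta_hit
    return meta_region, start_key, server
```

For example, locate("users", "harry") lands in the first meta region and
resolves to the user region starting at "g", while locate("users", "zed")
is routed through the second meta region to the region starting at "p".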

Only a few users have had enough regions to need more than one meta
region.

An empty start key means that the region contains keys from the start
of the table up to the end key.

An empty end key means that the region contains keys from the start
key to the end of the table.

If both the start key and the end key are empty, this means that a
single region contains all the rows for that table.
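
As a small sketch of these three cases, the check below treats an empty
start or end key as "unbounded". The function name is hypothetical, plain
strings stand in for HBase's byte-array keys, and the range is assumed to
include the start key and exclude the end key.

```python
def region_contains(start_key, end_key, row):
    """True if `row` falls in [start_key, end_key), with '' meaning unbounded."""
    after_start = start_key == "" or row >= start_key
    before_end = end_key == "" or row < end_key
    return after_start and before_end
```

With both keys empty, region_contains("", "", row) is true for every row,
which matches the single-region tables seen in the scans.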


I would encourage you to read the HBase architecture document at:
http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

I would also encourage you to read the Bigtable paper at:
http://labs.google.com/papers/bigtable.html

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


> -----Original Message-----
> From: Timothy Nolen [mailto:[email protected]] On Behalf Of Tim Nolen
> Sent: Thursday, April 30, 2009 6:46 PM
> To: [email protected]
> Subject: HBase Architecture question
> 
> Mr. Kellerman,
> I was wondering if you could help me understand the ROOT -> META ->
> HRegionServer search path. I understand that the client first contacts
> the HBaseMaster to find the location of the ROOT region. It then scans
> the ROOT region for the location of the META region containing the
> range of data it is requesting. But since META is split among regions,
> how does it know which HRegionServer holds the data it is requesting?
> 
> I've done scans of the -ROOT- and .META. tables, but it hasn't helped
> me much because none of the tables that are in the system seem to have
> splits--their STARTKEY and ENDKEY are both ''.
> 
> Thanks,
> Tim Nolen
