Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by EvgenyRyabitskiy:
http://wiki.apache.org/hadoop/Hbase/DesignOverview

------------------------------------------------------------------------------
  
  An extension was added recently to allow multi-row locking, but this is not 
the default behavior and must be explicitly enabled.
  
- More details are here [:Hbase/DataModel: The HBase/Bigtable Data Model]
+ More details are here [:Hbase/DataModel: The HBase Data Model]
  
  [[Anchor(conceptual)]]
  == Conceptual View ==
  
+ Conceptually a table may be thought of as a collection of rows that are located 
by a row key (and optional timestamp) and where any column
+ may not have a value for a particular row key (sparse).
- Conceptually a table may be thought of a collection of rows that
- are located by a row key (and optional timestamp) and where any column
- may not have a value for a particular row key (sparse). The following example 
is a slightly modified form of the one on page 2 of the 
[http://labs.google.com/papers/bigtable.html Bigtable Paper] (adds a new column 
family ''"mime:"'').
  
  [[Anchor(datamodelexample)]]
  ||<:> '''Row Key''' ||<:> '''Time Stamp''' ||<:> '''Column''' ''"contents:"'' 
||||<:> '''Column''' ''"anchor:"'' ||<:> '''Column''' ''"mime:"'' ||
@@ -82, +81 @@

  
  === Row Ranges: Regions ===
  
- To an application, a table appears to be a list of tuples sorted by row key 
ascending, column name ascending and timestamp descending.  Physically, tables 
are broken up into row ranges called ''regions'' (equivalent Bigtable term is 
''tablet''). Each row range contains rows from start-key (inclusive) to end-key 
(exclusive). A set of regions, sorted appropriately, forms an entire table. 
Unlike Bigtable which identifies a row range by the table name and end-key, 
HBase identifies a row range by the table name and start-key.
+ To an application, a table appears to be a list of tuples sorted by row key 
ascending, column name ascending and timestamp descending.  Physically, tables 
are broken up into row ranges called ''regions''. Each row range contains rows 
from start-key (inclusive) to end-key (exclusive). A set of regions, sorted 
appropriately, forms an entire table. A row range is identified by the table 
name and start-key.
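
The ordering and the row-range rule described above can be sketched in a few lines of Java. This is only an illustration and not HBase code: the class name RegionSketch, its Cell type and the region names are hypothetical, and row keys are modelled as Strings for readability. It assumes the first region of a table has an empty start key.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

public class RegionSketch {
    /** One (row key, column name, timestamp) tuple; the value is omitted for brevity. */
    public static final class Cell {
        final String row;
        final String column;
        final long timestamp;

        public Cell(String row, String column, long timestamp) {
            this.row = row;
            this.column = column;
            this.timestamp = timestamp;
        }
    }

    /** Row key ascending, then column name ascending, then timestamp descending. */
    public static final Comparator<Cell> CELL_ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) {
            return c;
        }
        c = a.column.compareTo(b.column);
        if (c != 0) {
            return c;
        }
        // Arguments are swapped so that the newest timestamp sorts first.
        return Long.compare(b.timestamp, a.timestamp);
    };

    // Regions of a single table, keyed by start key (inclusive).
    // Each region implicitly ends where the next start key begins (exclusive).
    private final TreeMap<String, String> regionsByStartKey = new TreeMap<>();

    public void addRegion(String startKey, String regionName) {
        regionsByStartKey.put(startKey, regionName);
    }

    /** Returns the region whose row range contains rowKey. */
    public String regionFor(String rowKey) {
        // floorEntry finds the greatest start key that is <= rowKey.
        Map.Entry<String, String> entry = regionsByStartKey.floorEntry(rowKey);
        if (entry == null) {
            throw new IllegalStateException("no region covers row key: " + rowKey);
        }
        return entry.getValue();
    }
}
```

With regions starting at "" and "m", a lookup for row key "apple" lands in the first region and a lookup for "m" lands in the second, because the start key is inclusive and the end key is exclusive.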
  
- Each column family in a region is managed by an ''HStore''. Each HStore may 
have one or more ''!MapFiles'' (a Hadoop HDFS file type) that is very similar 
to a Google ''SSTable''. Like SSTables, !MapFiles are immutable once closed. 
!MapFiles are stored in the Hadoop HDFS. Other details are the same, except:
+ Each column family in a region is managed by a ''Store''. Each ''Store'' may 
have one or more ''!StoreFiles'' (a Hadoop HDFS file type). !StoreFiles are 
immutable once closed. !StoreFiles are stored in the Hadoop HDFS. Other details 
are the same, except:
-  * !MapFiles cannot currently be mapped into memory.
+  * !StoreFiles cannot currently be mapped into memory.
-  * !MapFiles maintain the sparse index in a separate file rather than at the 
end of the file as SSTable does.
+  * !StoreFiles maintain the sparse index in a separate file rather than at 
the end of the file as SSTable does.
-  * HBase extends !MapFile so that a bloom filter can be employed to enhance 
negative lookup performance. The hash function employed is one developed by Bob 
Jenkins.
+  * HBase extends !StoreFiles so that a bloom filter can be employed to 
enhance negative lookup performance. The hash function employed is one 
developed by Bob Jenkins.
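
To show why a bloom filter helps negative lookups, here is a minimal sketch in Java. It is not the HBase implementation: the class SimpleBloomFilter is hypothetical, and the hashing is a simple stand-in built on String.hashCode rather than the Bob Jenkins hash mentioned above. The point is only that a "false" answer means the key is definitely absent, so the file need not be read at all.

```java
import java.util.BitSet;

public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Stand-in hashing (an assumption, NOT the Jenkins hash):
    // derives the i-th bit index from two base hashes of the key.
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so successive indexes differ
        return Math.floorMod(h1 + i * h2, size);
    }

    /** Records a key by setting hashCount bits. */
    public void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /**
     * Returns false if the key was definitely never added,
     * true if it may have been added (false positives are possible).
     */
    public boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A reader would consult such a filter before opening a file: if mightContain returns false the lookup can stop immediately, and only a true result requires the actual read.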
  
  [[Anchor(arch)]]
  = Architecture and Implementation =
  
  There are three major components of the HBase architecture:
-  1. The H!BaseMaster (analogous to the Bigtable master server)
+  1. The H!BaseMaster (HBase master server)
-  2. The H!RegionServer (analogous to the Bigtable tablet server)
+  2. The H!RegionServer (HBase region server)
   3. The HBase client, defined by org.apache.hadoop.hbase.client.HTable
  
  Each will be discussed in the following sections.
