[Hadoop Wiki] Update of "Hbase/HbaseArchitecture" by izaakrubin

Apache Wiki Fri, 22 Aug 2008 14:18:38 -0700

The following page has been changed by izaakrubin:
http://wiki.apache.org/hadoop/Hbase/HbaseArchitecture

------------------------------------------------------------------------------
= Current Status =

As of this writing (2008/08/22), there are approximately 79,500 lines of code
in
- "src/contrib/hbase/src/java/org/apache/hadoop/hbase/" directory on the Hadoop
SVN trunk.
+ "hbase/trunk/src/java/org/apache/hadoop/hbase/" directory on the Hadoop SVN
trunk.

Also included in this directory are approximately 14,000 lines of test cases.

Issues and TODOs:
1. Modify RPC and amend how HBase uses RPC. In particular, some messages
will use the current RPC mechanism and others will use raw sockets. Raw
sockets will carry the data of some serialized objects; the idea is to
hard-code the creation and interpretation of the serialized data and so
avoid introspection.
- 1. Vuk Ercegovac [[MailTo(vercego AT SPAMFREE us DOT ibm DOT com)]] of IBM
Almaden Research pointed out that keeping HBase HRegion edit logs in HDFS is
currently flawed. HBase writes edits to logs and to a memcache. The 'atomic'
write to the log is meant to serve as insurance against abnormal !RegionServer
exit: on startup, the log is rerun to reconstruct an HRegion's last wholesome
state. But files in HDFS do not 'exist' until they are cleanly closed --
something that will not happen if !RegionServer exits without running its
'close'.
+ 1. Vuk Ercegovac [[MailTo(vercego AT SPAMFREE us DOT ibm DOT com)]] of IBM
Almaden Research pointed out that keeping HBase HRegion edit logs in HDFS is
currently flawed. HBase writes edits to logs and to a memcache. The 'atomic'
write to the log is meant to serve as insurance against abnormal !RegionServer
exit: on startup, the log is rerun to reconstruct an HRegion's last wholesome
state. But files in HDFS do not 'exist' until they are cleanly closed --
something that will not happen if !RegionServer exits without running its
'close'. This issue will be addressed in HBase-0.19.0.
1. The HMemcache lookup structure is relatively inefficient.
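
The edit-log issue in the list above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not HBase code: `ToyLogFile` and `ToyRegion` are hypothetical names, and the "visible only after close" behaviour is a simplified stand-in for the HDFS semantics described in the item. Each edit is appended to the log and applied to an in-memory memcache; recovery replays whatever the log file exposes to readers.

```python
class ToyLogFile:
    """Hypothetical stand-in for an HDFS file as described above:
    appended records become visible to readers only after close()."""

    def __init__(self):
        self._pending = []   # records written but not yet visible
        self._visible = []   # records a reader can see
        self.closed = False

    def append(self, record):
        if self.closed:
            raise ValueError("log file already closed")
        self._pending.append(record)

    def close(self):
        # A clean close publishes everything written so far.
        self._visible = list(self._pending)
        self.closed = True

    def read(self):
        return list(self._visible)


class ToyRegion:
    """Hypothetical region: every edit goes to the log and to a memcache."""

    def __init__(self, log):
        self.log = log
        self.memcache = {}

    def put(self, row, value):
        self.log.append((row, value))  # record the edit in the log first
        self.memcache[row] = value     # then apply it in memory

    @classmethod
    def recover(cls, old_log):
        """Rebuild a region by replaying the edits readable from old_log."""
        region = cls(ToyLogFile())
        for row, value in old_log.read():
            region.put(row, value)
        return region


# Clean shutdown: the log is closed, so replay restores the edit.
clean = ToyRegion(ToyLogFile())
clean.put("row1", "a")
clean.log.close()
recovered = ToyRegion.recover(clean.log)

# Abnormal exit: close() never runs, so replay finds nothing.
crashed = ToyRegion(ToyLogFile())
crashed.put("row1", "a")
lost = ToyRegion.recover(crashed.log)
```

In this model `recovered.memcache` holds the edit while `lost.memcache` is empty, which is the failure mode the item describes: the log only protects edits if the file is cleanly closed.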

Release News:
- * HBase 0.2.0 was released in early August and runs on Hadoop 0.17.1.
+ * HBase-0.2.0 was released in early August and runs on Hadoop-0.17.1.
- * Work is underway on releasing HBase 0.2.1 for Hadoop 0.17.2.1, and HBase
0.3 for Hadoop 0.18. Looking farther ahead, HBase 0.4 will run on Hadoop 0.19.
+ * Work is underway on releasing HBase-0.2.1 for Hadoop-0.17.2.1, and
HBase-0.18.0 for Hadoop-0.18. Looking farther ahead, HBase-0.19.0 will run
with Hadoop-0.19. Note that the HBase release numbering scheme has changed to
align with the version of Hadoop with which it runs: 0.3.x will be called
0.18.x, and 0.4.x will be called 0.19.x.

See [https://issues.apache.org/jira/browse/HBASE?report=com.atlassian.jira.plugin.system.project:roadmap-panel&subset=3 hbase issues]
for a list of what is currently being worked on.
