I propose that the 20-append patches (details below) be included in 20.205,
which will become the first official Apache release of Hadoop that supports
Append and HBase.

Background:
There hasn't been an official Apache release that supports HBase.
The HBase community has instead been using the 20-append branch; the patches
were contributed by the HBase community, including Facebook. The Cloudera
distribution has also included these patches.
Andrew Purtell has ported these patches to the 20-security branch.

Risk Level:
These patches have been used and tested on large HBase clusters by Facebook, by
those who use the 20-append branch directly (various users, including a 500-node
HBase cluster at Yahoo), and by those who use the Cloudera distribution. We
have reviewed the patches and have conducted further tests; testing and
validation continue.


Patches:
HDFS-200. Support append and sync for hadoop 0.20 branch.
HDFS-142. Blocks that are being written by a client are stored in the 
blocksBeingWritten directory.
HDFS-1057. Concurrent readers hit ChecksumExceptions if following a writer to
very end of file.
HDFS-724. Use a bidirectional heartbeat to detect stuck pipeline.
HDFS-895. Allow hflush/sync to occur in parallel with new writes to the file.
HDFS-1520. Lightweight NameNode operation recoverLease to trigger lease 
recovery.
HDFS-1555. Disallow pipeline recovery if a file is already being lease
recovered.
HDFS-1554. New semantics for recoverLease.
HDFS-988. Fix bug where savenameSpace can corrupt edits log.
HDFS-826. Allow a mechanism for an application to detect that datanode(s) have 
died in the write pipeline.
HDFS-630. Client can exclude specific nodes in the write pipeline.
HDFS-1141. completeFile does not check lease ownership.
HDFS-1204. Lease expiration should recover single files, not entire lease holder.
HDFS-1254. Support append/sync via the default configuration.
HDFS-1346. DFSClient receives out of order packet ack.
HDFS-1054. Remove sleep before retry for allocating a block.
