Hey Matt,

You can see the full change log here:
http://archive.cloudera.com/cdh/3/hadoop-0.20.2+923.97.CHANGES.txt
Most changes made for HBase say so in the "Reason" field. There's a directory in the source tarball that contains all the individual patches broken out.

Cheers,
Eli

On Fri, Sep 2, 2011 at 2:34 PM, Matt Foley <[email protected]> wrote:
> Hi Todd,
> Thank you, this is tremendously valuable input! I'll have to look in detail
> at each of these ten jiras,
> and will get back to the list with more info shortly.
> --Matt
>
> On Fri, Sep 2, 2011 at 1:03 PM, Todd Lipcon <[email protected]> wrote:
>
>> The following other JIRAs have been committed in CDH for 18 months or
>> so, for the purpose of HBase. You may want to consider backporting
>> them as well - many were never committed to 0.20-append due to lack of
>> reviews by HDFS committers at the time.
>>
>> HDFS-1056. Fix possible multinode deadlocks during block recovery
>> when using ephemeral dataxceiv
>>
>> Description: Fixes the logic by which datanodes identify local RPC targets
>>              during block recovery for the case when the datanode
>>              is configured with an ephemeral data transceiver port.
>> Reason: Potential internode deadlock for clusters using ephemeral ports
>>
>>
>> HADOOP-6722. Workaround a TCP spec quirk by not allowing
>> NetUtils.connect to connect to itself
>>
>> Description: TCP's ephemeral port assignment results in the possibility
>>              that a client can connect back to its own outgoing socket,
>>              resulting in failed RPCs or datanode transfers.
>> Reason: Fixes intermittent errors in cluster testing with ephemeral
>>         IPC/transceiver ports on datanodes.
>>
>> HDFS-1122. Don't allow client verification to prematurely add
>> inprogress blocks to DataBlockScanner
>>
>> Description: When a client reads a block that is also open for writing,
>>              it should not add it to the datanode block scanner.
>>              If it does, the block scanner can incorrectly mark the
>>              block as corrupt, causing data loss.
>> Reason: Potential dataloss with concurrent writer-reader case.
>>
>> HDFS-1248. Miscellaneous cleanup and improvements on 0.20 append branch
>>
>> Description: Miscellaneous code cleanup and logging changes, including:
>>  - Slight cleanup to recoverFile() function in TestFileAppend4
>>  - Improve error messages on OP_READ_BLOCK
>>  - Some comment cleanup in FSNamesystem
>>  - Remove toInodeUnderConstruction (was not used)
>>  - Add some checks for null blocks in FSNamesystem to avoid a possible NPE
>>  - Only log "inconsistent size" warnings at WARN level for
>>    non-under-construction blocks.
>>  - Redundant addStoredBlock calls are also not worthy of WARN level
>>  - Add some extra information to a warning in ReplicationTargetChooser
>> Reason: Improves diagnosis of error cases and clarity of code
>>
>>
>> HDFS-1242. Add unit test for the appendFile race condition /
>> synchronization bug fixed in HDFS-142
>>
>> Reason: Test coverage for previously applied patch.
>>
>> HDFS-1218. Replicas that are recovered during DN startup should
>> not be allowed to truncate better replicas.
>>
>> Description: If a datanode loses power and then recovers, its replicas
>>              may be truncated due to the recovery of the local FS
>>              journal. This patch ensures that a replica truncated by
>>              a power loss does not truncate the block on HDFS.
>> Reason: Potential dataloss bug uncovered by power failure simulation
>>
>> HDFS-915. Write pipeline hangs for too long when ResponseProcessor
>> hits timeout
>>
>> Description: Previously, the write pipeline would hang for the entire write
>>              timeout when it encountered a read timeout (eg due to a
>>              network connectivity issue). This patch interrupts the writing
>>              thread when a read error occurs.
>> Reason: Faster recovery from pipeline failure for HBase and other
>>         interactive applications.
>>
>>
>> HDFS-1186. Writers should be interrupted when recovery is started,
>> not when it's completed.
>>
>> Description: When the write pipeline recovery process is initiated, this
>>              interrupts any concurrent writers to the block under recovery.
>>              This prevents a case where some edits may be lost if the
>>              writer has lost its lease but continues to write (eg due to
>>              a garbage collection pause)
>> Reason: Fixes a potential dataloss bug
>>
>>
>> commit a960eea40dbd6a4e87072bdf73ac3b62e772f70a
>> Author: Todd Lipcon <[email protected]>
>> Date: Sun Jun 13 23:02:38 2010 -0700
>>
>> HDFS-1197. Received blocks should not be added to block map
>> prematurely for under construction files
>>
>> Description: Fixes a possible dataloss scenario when using append() on
>>              real-life clusters. Also augments unit tests to uncover
>>              similar bugs in the future by simulating latency when
>>              reporting blocks received by datanodes.
>> Reason: Append support dataloss bug
>> Author: Todd Lipcon
>>
>>
>> HDFS-1260. tryUpdateBlock should do validation before renaming meta file
>>
>> Description: Solves bug where block became inaccessible in certain failure
>>              conditions (particularly network partitions). Observed under
>>              HBase workload at user site.
>> Reason: Potential loss of synced data when write pipeline fails
>>
>>
>> On Fri, Sep 2, 2011 at 11:20 AM, Suresh Srinivas <[email protected]>
>> wrote:
>> > I also propose following jiras, which are non append related bug fixes from
>> > 0.20-append branch:
>> >
>> > - HDFS-1164. TestHdfsProxy is failing.
>> > - HDFS-1211. Block receiver should not log "rewind" packets at INFO level.
>> > - HDFS-1118. Fix socketleak on DFSClient.
>> > - HDFS-1210. DFSClient should log exception when block recovery fails.
>> > - HDFS-606. Fix ConcurrentModificationException in
>> >   invalidateCorruptReplicas.
>> > - HDFS-561. Fix write pipeline READ_TIMEOUT.
>> > - HDFS-1202. DataBlockScanner throws NPE when updated before initialized.
>> >
>> > Risk Level:
>> > These are useful bugfixes from append branch and are not big changes to the
>> > code base.
>> >
>> > These jiras have already been merged into 0.20-security branch.
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
