+1

I've been involved in both development and review on the branch, and I believe it's now ready to get merged into trunk. Many thanks to all the contributors and reviewers!
Thanks,
-Jing

On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com> wrote:

> Non-binding +1
>
> According to our extensive performance tests, striping + ISA-L coder based
> erasure coding not only can save storage, but also can increase the
> throughput of a client or a cluster. It will be a great addition to HDFS
> and its users. Based on the latest branch codes, we also observed it's very
> reliable in the concurrent tests. We'll provide the perf test report after
> it's sorted out and hope it helps.
> Thanks!
>
> Regards,
> Kai
>
> -----Original Message-----
> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
> Sent: Wednesday, September 23, 2015 8:50 AM
> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>
> +1
>
> Great addition to HDFS. Thanks all contributors for the nice work.
>
> Regards,
> Uma
>
> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>
> >Hi,
> >
> >I'd like to propose a vote to merge the HDFS-7285 feature branch back
> >to trunk. Since November 2014 we have been designing and developing
> >this feature under the umbrella JIRAs HDFS-7285 and HADOOP-11264, and
> >have committed approximately 210 patches.
> >
> >The HDFS-7285 feature branch was created to support the first phase of
> >HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
> >significantly reduce storage space usage in HDFS clusters. Instead of
> >always creating 3 replicas of each block with 200% storage space
> >overhead, HDFS-EC provides data durability through parity data blocks.
> >With most EC configurations, the storage overhead is no more than 50%.
> >Based on profiling results of production clusters, we decided to
> >support EC with the striped block layout in the first phase, so that
> >small files can be better handled.
> >This means dividing each logical
> >HDFS file block into smaller units (striping cells) and spreading them
> >on a set of DataNodes in round-robin fashion. Parity cells are
> >generated for each stripe of original data cells. We have made changes
> >to NameNode, client, and DataNode to generalize the block concept and
> >handle the mapping between a logical file block and its internal
> >storage blocks. For further details please see the design doc on
> >HDFS-7285.
> >HADOOP-11264 focuses on providing flexible and high-performance codec
> >calculation support.
> >
> >The nightly Jenkins job of the branch has reported several successful
> >runs, and doesn't show new flaky tests compared with trunk. We have
> >posted several versions of the test plan including both unit testing
> >and cluster testing, and have executed most tests in the plan. The most
> >basic functionalities have been extensively tested and verified in
> >several real clusters with different hardware configurations; results
> >have been very stable. We have created follow-on tasks for more
> >advanced error handling and optimization under the umbrella HDFS-8031.
> >We also plan to implement or harden the integration of EC with existing
> >features such as WebHDFS, snapshot, append, truncate, hflush, hsync,
> >and so forth.
> >
> >Development of this feature has been a collaboration across many
> >companies and institutions. I'd like to thank J. Andreina, Takanobu
> >Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao G,
> >Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> >Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang, Jing
> >Zhao, Hui Zheng and Kai Zheng for their code contributions and reviews.
> >Andrew and Kai Zheng also made fundamental contributions to the initial
> >design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many other
> >contributors have made great efforts in system testing.
> >Many thanks go
> >to Weihua Jiang for proposing the JIRA, and ATM, Todd Lipcon, Silvius
> >Rus, Suresh, as well as many others for providing helpful feedbacks.
> >
> >Following the community convention, this vote will last for 7 days
> >(ending September 29th). Votes from Hadoop committers are binding but
> >non-binding votes are very welcome as well. And here's my non-binding +1.
> >
> >Thanks,
> >---
> >Zhe Zhang
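
For readers of this thread, the storage arithmetic and the striped layout described in the proposal can be sketched roughly as below. This is an illustrative sketch only, not code from the HDFS-7285 branch: the function names are hypothetical, and the 6 data + 3 parity layout and 64 KB cell size are assumptions chosen for the example (the proposal itself only says that overhead is no more than 50% with most EC configurations).

```python
from typing import Tuple


def storage_overhead(data_units: int, redundant_units: int) -> float:
    """Extra storage as a fraction of the original data size."""
    return redundant_units / data_units


def locate(offset: int, cell_size: int, data_blocks: int) -> Tuple[int, int]:
    """Map a byte offset in a logical block to (internal block index,
    offset inside that internal block), assuming cells are laid out
    round-robin across the data blocks, one stripe at a time."""
    cell_index = offset // cell_size          # which striping cell holds the byte
    stripe = cell_index // data_blocks        # which stripe that cell belongs to
    block_index = cell_index % data_blocks    # round-robin choice of internal block
    offset_in_block = stripe * cell_size + offset % cell_size
    return block_index, offset_in_block


# 3-way replication: 1 original copy plus 2 extra copies.
replication = storage_overhead(data_units=1, redundant_units=2)

# Assumed EC layout: 6 data cells plus 3 parity cells per stripe.
erasure_coded = storage_overhead(data_units=6, redundant_units=3)

print(f"replication overhead: {replication:.0%}")
print(f"EC (6+3) overhead:    {erasure_coded:.0%}")

CELL = 64 * 1024  # assumed cell size in bytes
print(locate(0, CELL, 6))             # first byte of the logical block
print(locate(CELL, CELL, 6))          # first byte of the second cell
print(locate(6 * CELL + 10, CELL, 6))  # second stripe wraps back to block 0
```

The mapping in `locate` is the usual round-robin arithmetic for a striped layout; the actual NameNode and client logic in the branch also has to handle parity blocks and partial stripes, which this sketch leaves out.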