Thanks Vinod for the proposal and Andrew/Jing for the comments. 2.9 sounds like a good plan. As Andrew pointed out, 2.8 is already quite big. And even when disabled, EC logic has been baked into the NN pretty deeply.
Do we have a tentative date or estimate for 2.9?

---
Zhe Zhang

On Mon, Nov 2, 2015 at 1:22 PM, Jing Zhao <ji...@apache.org> wrote:

> Thanks for the discussion, Vinod and Andrew. Backporting EC to 2.9 sounds
> good to me.
>
> On Mon, Nov 2, 2015 at 12:06 PM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
> > Thanks for forking the thread Vinod,
> >
> > SGTM, though I really do recommend waiting for 2.9 given the current size
> > of 2.8. I'm not a fan of an "off by default" half-measure, since it
> > doesn't change our compatibility requirements, and there's some major NN
> > surgery that can't really be disabled.
> >
> > If we do find a major user who's backported this to their own branch-2
> > fork, I agree that's motivation to get it in an upstream release quicker.
> > I haven't heard anything along these lines though.
> >
> > On Mon, Nov 2, 2015 at 11:49 AM, Vinod Vavilapalli
> > <vino...@hortonworks.com> wrote:
> >
> > > Forking the thread. Started looking at the 2.8 list, various features’
> > > status and arrived here.
> > >
> > > While I understand the pervasive nature of EC and a need for a
> > > significant bake-in, moving this to a 3.x release is not a good idea.
> > > We will surely get a 2.8 out this year and, as needed, I can even spend
> > > time getting started on a 2.9. OTOH, 3.x is a long way off, and given
> > > all the incompatibilities there, it would be a while before users can
> > > get their hands on EC if it were to be only on 3.x. At best, this may
> > > force sites that want EC to backport the entire EC feature to older
> > > releases; at worst, this will repeat the mess of the 0.20 security
> > > release forks.
> > >
> > > If we think adding this to 2.8 (even if it is switched off) is too much
> > > risk per our original plan, let’s move this to 2.9, thereby leaving
> > > enough time for stability, integration testing and bake-in, and a
> > > realistic chance of having it end up on users’ clusters soonish.
> > >
> > > +Vinod
> > >
> > > > On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com>
> > > > wrote:
> > > >
> > > > I think our plan thus far has been to target this for 3.0. I'm okay
> > > > with putting it in branch-2 if we've given a hard look at
> > > > compatibility, but I'll note though that 2.8 is already looking like
> > > > quite a large release, and our release bandwidth has been focused on
> > > > the 2.6 and 2.7 maintenance releases. Adding another multi-hundred
> > > > JIRAs to 2.8 might make it too unwieldy to get out the door. If we
> > > > bump EC past that, 3.0 might very well be our next release vehicle. I
> > > > do plan to revive the 3.0 schedule some time next year. With EC and
> > > > JDK8 in a good spot, the only big feature remaining is classpath
> > > > isolation.
> > > >
> > > > EC is also a pretty fundamental change to HDFS. Even if it's
> > > > compatible, in terms of size and impact it might best belong in a new
> > > > major release.
> > > >
> > > > Best,
> > > > Andrew
> > > >
> > > > On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B
> > > > <vinayakumarb.apa...@gmail.com> wrote:
> > > >
> > > >> Does anyone else also think that the feature is ready to go to
> > > >> branch-2 as well?
> > > >>
> > > >> It's been more than 2 weeks since EC landed on trunk. IMO it looks
> > > >> quite stable since then and ready to go in branch-2.
> > > >>
> > > >> -Vinay
> > > >> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
> > > >>
> > > >>> Thanks Vinay for capturing the issue and Uma for offering the help.
> > > >>>
> > > >>> ---
> > > >>> Zhe Zhang
> > > >>>
> > > >>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma
> > > >>> <uma.ganguma...@intel.com> wrote:
> > > >>>
> > > >>>> Vinay,
> > > >>>>
> > > >>>>
> > > >>>> I would merge them as part of HDFS-9182.
> > > >>>>
> > > >>>> Thanks,
> > > >>>> Uma
> > > >>>>
> > > >>>>
> > > >>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> Hi Andrew,
> > > >>>>> I see CHANGES.txt entries not yet merged from
> > > >>>>> CHANGES-HDFS-EC-7285.txt.
> > > >>>>>
> > > >>>>> Was this intentional?
> > > >>>>>
> > > >>>>> Regards,
> > > >>>>> Vinay
> > > >>>>>
> > > >>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang
> > > >>>>> <andrew.w...@cloudera.com> wrote:
> > > >>>>>
> > > >>>>>> Branch has been merged to trunk, thanks again to everyone who
> > > >>>>>> worked on the feature!
> > > >>>>>>
> > > >>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang
> > > >>>>>> <zhezh...@cloudera.com> wrote:
> > > >>>>>>
> > > >>>>>>> Thanks everyone who has participated in this discussion.
> > > >>>>>>>
> > > >>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote
> > > >>>>>>> has passed. I will do a final 'git merge' with trunk and work
> > > >>>>>>> with Andrew to merge the branch to trunk. I'll update on this
> > > >>>>>>> thread when the merge is done.
> > > >>>>>>>
> > > >>>>>>> ---
> > > >>>>>>> Zhe Zhang
> > > >>>>>>>
> > > >>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com>
> > > >>>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> (Change it to binding.)
> > > >>>>>>>>
> > > >>>>>>>> +1
> > > >>>>>>>> I have been involved in the development and code review on the
> > > >>>>>>>> feature branch. It's a great feature and I think it's ready to
> > > >>>>>>>> merge it into trunk.
> > > >>>>>>>>
> > > >>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Yi Liu
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> -----Original Message-----
> > > >>>>>>>> From: Liu, Yi A
> > > >>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> > > >>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > > >>>>>>>> trunk
> > > >>>>>>>>
> > > >>>>>>>> +1 (non-binding)
> > > >>>>>>>> I have been involved in the development and code review on the
> > > >>>>>>>> feature branch. It's a great feature and I think it's ready to
> > > >>>>>>>> merge it into trunk.
> > > >>>>>>>>
> > > >>>>>>>> Thanks all for the contribution.
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Yi Liu
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> -----Original Message-----
> > > >>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
> > > >>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> > > >>>>>>>> To: hdfs-dev@hadoop.apache.org
> > > >>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to
> > > >>>>>>>> trunk
> > > >>>>>>>>
> > > >>>>>>>> +1,
> > > >>>>>>>>
> > > >>>>>>>> I've been involved starting from design and development of
> > > >>>>>>>> ErasureCoding. I think phase 1 of this development is ready to
> > > >>>>>>>> be merged to trunk. It had come a long way to the current state
> > > >>>>>>>> with significant effort of many Contributors and Reviewers for
> > > >>>>>>>> both design and code.
> > > >>>>>>>>
> > > >>>>>>>> Thanks Everyone for the efforts.
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Vinay
> > > >>>>>>>>
> > > >>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> +1
> > > >>>>>>>>>
> > > >>>>>>>>> I've been involved in both development and review on the
> > > >>>>>>>>> branch, and I believe it's now ready to get merged into trunk.
> > > >>>>>>>>> Many thanks to all the contributors and reviewers!
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>> -Jing
> > > >>>>>>>>>
> > > >>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai
> > > >>>>>>>>> <kai.zh...@intel.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Non-binding +1
> > > >>>>>>>>>>
> > > >>>>>>>>>> According to our extensive performance tests, striping +
> > > >>>>>>>>>> ISA-L coder based erasure coding can not only save storage,
> > > >>>>>>>>>> but can also increase the throughput of a client or a
> > > >>>>>>>>>> cluster. It will be a great addition to HDFS and its users.
> > > >>>>>>>>>> Based on the latest branch code, we also observed that it's
> > > >>>>>>>>>> very reliable in the concurrent tests. We'll provide the perf
> > > >>>>>>>>>> test report after it's sorted out and hope it helps.
> > > >>>>>>>>>> Thanks!
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regards,
> > > >>>>>>>>>> Kai
> > > >>>>>>>>>>
> > > >>>>>>>>>> -----Original Message-----
> > > >>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
> > > >>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> > > >>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
> > > >>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch
> > > >>>>>>>>>> to trunk
> > > >>>>>>>>>>
> > > >>>>>>>>>> +1
> > > >>>>>>>>>>
> > > >>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice
> > > >>>>>>>>>> work.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regards,
> > > >>>>>>>>>> Uma
> > > >>>>>>>>>>
> > > >>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature
> > > >>>>>>>>>>> branch back to trunk.
> > > >>>>>>>>>>> Since November 2014 we have been designing and
> > > >>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285
> > > >>>>>>>>>>> and HADOOP-11264, and have committed approximately 210
> > > >>>>>>>>>>> patches.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> > > >>>>>>>>>>> first phase of HDFS erasure coding (HDFS-EC). The objective
> > > >>>>>>>>>>> of HDFS-EC is to significantly reduce storage space usage in
> > > >>>>>>>>>>> HDFS clusters. Instead of always creating 3 replicas of each
> > > >>>>>>>>>>> block with 200% storage space overhead, HDFS-EC provides
> > > >>>>>>>>>>> data durability through parity data blocks. With most EC
> > > >>>>>>>>>>> configurations, the storage overhead is no more than 50%.
> > > >>>>>>>>>>> Based on profiling results of production clusters, we
> > > >>>>>>>>>>> decided to support EC with the striped block layout in the
> > > >>>>>>>>>>> first phase, so that small files can be better handled. This
> > > >>>>>>>>>>> means dividing each logical HDFS file block into smaller
> > > >>>>>>>>>>> units (striping cells) and spreading them on a set of
> > > >>>>>>>>>>> DataNodes in round-robin fashion. Parity cells are generated
> > > >>>>>>>>>>> for each stripe of original data cells. We have made changes
> > > >>>>>>>>>>> to NameNode, client, and DataNode to generalize the block
> > > >>>>>>>>>>> concept and handle the mapping between a logical file block
> > > >>>>>>>>>>> and its internal storage blocks. For further details please
> > > >>>>>>>>>>> see the design doc on HDFS-7285.
> > > >>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> > > >>>>>>>>>>> high-performance codec calculation support.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The nightly Jenkins job of the branch has reported several
> > > >>>>>>>>>>> successful runs, and doesn't show new flaky tests compared
> > > >>>>>>>>>>> with trunk. We have posted several versions of the test plan
> > > >>>>>>>>>>> including both unit testing and cluster testing, and have
> > > >>>>>>>>>>> executed most tests in the plan. The most basic
> > > >>>>>>>>>>> functionalities have been extensively tested and verified in
> > > >>>>>>>>>>> several real clusters with different hardware
> > > >>>>>>>>>>> configurations; results have been very stable. We have
> > > >>>>>>>>>>> created follow-on tasks for more advanced error handling and
> > > >>>>>>>>>>> optimization under the umbrella HDFS-8031.
> > > >>>>>>>>>>> We also plan to implement or harden the integration of EC
> > > >>>>>>>>>>> with existing features such as WebHDFS, snapshot, append,
> > > >>>>>>>>>>> truncate, hflush, hsync, and so forth.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Development of this feature has been a collaboration across
> > > >>>>>>>>>>> many companies and institutions. I'd like to thank J.
> > > >>>>>>>>>>> Andreina, Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya
> > > >>>>>>>>>>> Fukudome, Uma Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe,
> > > >>>>>>>>>>> Xinwei Qin, Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo
> > > >>>>>>>>>>> Nicholas Sze, Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng
> > > >>>>>>>>>>> and Kai Zheng for their code contributions and reviews.
> > > >>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to
> > > >>>>>>>>>>> the initial design.
> > > >>>>>>>>>>> Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
> > > >>>>>>>>>>> other contributors have made great efforts in system
> > > >>>>>>>>>>> testing. Many thanks go to Weihua Jiang for proposing the
> > > >>>>>>>>>>> JIRA, and ATM, Todd Lipcon, Silvius Rus, Suresh, as well as
> > > >>>>>>>>>>> many others for providing helpful feedback.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Following the community convention, this vote will last for
> > > >>>>>>>>>>> 7 days (ending September 29th). Votes from Hadoop committers
> > > >>>>>>>>>>> are binding but non-binding votes are very welcome as well.
> > > >>>>>>>>>>> And here's my non-binding +1.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks,
> > > >>>>>>>>>>> ---
> > > >>>>>>>>>>> Zhe Zhang
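
For readers following the storage numbers and the striped layout described in the original vote proposal, here is a rough Python sketch. It is only an illustration, not the HDFS implementation: the function names are hypothetical, and the 6 data + 3 parity split is an assumed example of a configuration that lands at the 50% figure quoted in the proposal (the thread itself does not name a specific schema).

```python
def storage_overhead(data_units, extra_units):
    """Extra storage as a fraction of the original data size."""
    return extra_units / data_units


def stripe_layout(num_cells, data_nodes):
    """Place data cell i on node index i % data_nodes (round-robin).

    Returns a list of stripes; each stripe is a list of
    (cell_index, node_index) pairs. Parity cells, which would be
    computed once per stripe, are not modeled here.
    """
    stripes = []
    for start in range(0, num_cells, data_nodes):
        end = min(start + data_nodes, num_cells)
        stripes.append([(i, i % data_nodes) for i in range(start, end)])
    return stripes


# 3-way replication: 1 original copy plus 2 extra copies.
replication_overhead = storage_overhead(1, 2)   # 2.0, i.e. 200%

# Assumed example schema: 6 data cells + 3 parity cells per stripe.
ec_overhead = storage_overhead(6, 3)            # 0.5, i.e. 50%

# 8 data cells spread over 6 nodes: one full stripe, one partial stripe.
layout = stripe_layout(8, 6)
```

The partial second stripe in the last call hints at why striping helps small files: a file much smaller than a full block still spreads its cells across several nodes instead of occupying one large, mostly empty block.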