To add to the above discussion on umbrella JIRA: the COMMON side changes of EC have been tracked under HADOOP-11264 (before merge) and HADOOP-11842 (after merge).
---
Zhe Zhang

On Tue, Nov 3, 2015 at 4:40 PM, Vinod Vavilapalli <vino...@hortonworks.com> wrote:

> That makes sense.
>
> Thanks for the discussion everyone, let’s stick to this tentative plan
> of EC for 2.9.
>
> I just updated the Roadmap wiki to reflect the same.
>
> +Vinod
>
> > On Nov 2, 2015, at 4:26 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
> >
> > Yeah, so for the issues we recently resolved on trunk and are
> > addressing as follow-on tasks in Phase I, we would label them with
> > "erasure coding" and maybe also set the target version as "2.9" for
> > convenience?
> >
> > -----Original Message-----
> > From: Jing Zhao [mailto:ji...@apache.org]
> > Sent: Tuesday, November 03, 2015 8:04 AM
> > To: hdfs-dev@hadoop.apache.org
> > Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
> >
> > +1 for the plan about Phase I & II.
> >
> > BTW, maybe out of the scope of this thread, just want to mention we
> > should either move the jira under HDFS-8031 or update the jira
> > component as "erasure-coding" when making further improvements or
> > fixing bugs in EC. In this way it will be easier to backport EC to
> > 2.9 later.
> >
> > On Mon, Nov 2, 2015 at 3:48 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com> wrote:
> >
> >> +1 for the idea.
> >> On Nov 3, 2015 07:22, "Zheng, Kai" <kai.zh...@intel.com> wrote:
> >>
> >>> Sounds good to me. When it's determined to include EC in the 2.9
> >>> release, it may be good to have a rough release date as Zhe asked,
> >>> so the scope of EC can be discussed accordingly. We still have
> >>> quite a few things to do as Phase I follow-on tasks before EC can
> >>> be deployed in a production system. Phase II, to develop
> >>> non-striping EC for cold data, would possibly be started after
> >>> that. We might consider including only Phase I and leaving
> >>> Phase II for the next release, according to the rough release date.
> >>>
> >>> Regards,
> >>> Kai
> >>>
> >>> -----Original Message-----
> >>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
> >>> Sent: Tuesday, November 03, 2015 5:41 AM
> >>> To: hdfs-dev@hadoop.apache.org
> >>> Subject: Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]
> >>>
> >>> +1 for EC to go into 2.9. Yes, 3.x would be a long way to go when
> >>> we plan to have 2.8 and 2.9 releases.
> >>>
> >>> Regards,
> >>> Uma
> >>>
> >>> On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vino...@hortonworks.com> wrote:
> >>>
> >>>> Forking the thread. Started looking at the 2.8 list, various
> >>>> features' status and arrived here.
> >>>>
> >>>> While I understand the pervasive nature of EC and a need for a
> >>>> significant bake-in, moving this to a 3.x release is not a good
> >>>> idea. We will surely get a 2.8 out this year and, as needed, I can
> >>>> even spend time getting started on a 2.9. OTOH, 3.x is a long way
> >>>> off, and given all the incompatibilities there, it would be a
> >>>> while before users can get their hands on EC if it were to be only
> >>>> on 3.x. At best, this may force sites that want EC to backport the
> >>>> entire EC feature to older releases; at worst, this will repeat
> >>>> the mess of the 0.20 security release forks.
> >>>>
> >>>> If we think adding this to 2.8 (even if it is switched off) is too
> >>>> much risk per our original plan, let's move this to 2.9, thereby
> >>>> leaving enough time for stability, integration testing and
> >>>> bake-in, and a realistic chance of having it end up on users'
> >>>> clusters soonish.
> >>>>
> >>>> +Vinod
> >>>>
> >>>>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> >>>>>
> >>>>> I think our plan thus far has been to target this for 3.0.
> >>>>> I'm okay with putting it in branch-2 if we've given a hard look
> >>>>> at compatibility, but I'll note that 2.8 is already looking like
> >>>>> quite a large release, and our release bandwidth has been focused
> >>>>> on the 2.6 and 2.7 maintenance releases. Adding another
> >>>>> multi-hundred JIRAs to 2.8 might make it too unwieldy to get out
> >>>>> the door. If we bump EC past that, 3.0 might very well be our
> >>>>> next release vehicle. I do plan to revive the 3.0 schedule some
> >>>>> time next year. With EC and JDK8 in a good spot, the only big
> >>>>> feature remaining is classpath isolation.
> >>>>>
> >>>>> EC is also a pretty fundamental change to HDFS. Even if it's
> >>>>> compatible, in terms of size and impact it might best belong in a
> >>>>> new major release.
> >>>>>
> >>>>> Best,
> >>>>> Andrew
> >>>>>
> >>>>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com> wrote:
> >>>>>
> >>>>>> Does anyone else also think that the feature is ready to go to
> >>>>>> branch-2 as well?
> >>>>>>
> >>>>>> It's been more than 2 weeks since EC landed on trunk. IMO it
> >>>>>> looks quite stable since then and ready to go in branch-2.
> >>>>>>
> >>>>>> -Vinay
> >>>>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
> >>>>>>
> >>>>>>> Thanks Vinay for capturing the issue and Uma for offering the
> >>>>>>> help.
> >>>>>>>
> >>>>>>> ---
> >>>>>>> Zhe Zhang
> >>>>>>>
> >>>>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
> >>>>>>>
> >>>>>>>> Vinay,
> >>>>>>>>
> >>>>>>>> I would merge them as part of HDFS-9182.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Uma
> >>>>>>>>
> >>>>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Andrew,
> >>>>>>>>> I see CHANGES.txt entries not yet merged from
> >>>>>>>>> CHANGES-HDFS-EC-7285.txt.
> >>>>>>>>>
> >>>>>>>>> Was this intentional?
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Vinay
> >>>>>>>>>
> >>>>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Branch has been merged to trunk, thanks again to everyone
> >>>>>>>>>> who worked on the feature!
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezh...@cloudera.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks everyone who has participated in this discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this
> >>>>>>>>>>> vote has passed. I will do a final 'git merge' with trunk
> >>>>>>>>>>> and work with Andrew to merge the branch to trunk. I'll
> >>>>>>>>>>> update on this thread when the merge is done.
> >>>>>>>>>>>
> >>>>>>>>>>> ---
> >>>>>>>>>>> Zhe Zhang
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> (Change it to binding.)
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1
> >>>>>>>>>>>> I have been involved in the development and code review on
> >>>>>>>>>>>> the feature branch. It's a great feature and I think it's
> >>>>>>>>>>>> ready to merge it into trunk.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Yi Liu
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Liu, Yi A
> >>>>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
> >>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1 (non-binding)
> >>>>>>>>>>>> I have been involved in the development and code review on
> >>>>>>>>>>>> the feature branch. It's a great feature and I think it's
> >>>>>>>>>>>> ready to merge it into trunk.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks all for the contribution.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Yi Liu
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
> >>>>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
> >>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org
> >>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >>>>>>>>>>>>
> >>>>>>>>>>>> +1,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I've been involved starting from design and development of
> >>>>>>>>>>>> ErasureCoding. I think phase 1 of this development is
> >>>>>>>>>>>> ready to be merged to trunk. It had come a long way to the
> >>>>>>>>>>>> current state with significant effort of many Contributors
> >>>>>>>>>>>> and Reviewers for both design and code.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks Everyone for the efforts.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Vinay
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> +1
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I've been involved in both development and review on the
> >>>>>>>>>>>>> branch, and I believe it's now ready to get merged into
> >>>>>>>>>>>>> trunk. Many thanks to all the contributors and reviewers!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> -Jing
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Non-binding +1
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> According to our extensive performance tests, striping +
> >>>>>>>>>>>>>> ISA-L coder based erasure coding not only can save
> >>>>>>>>>>>>>> storage, but also can increase the throughput of a
> >>>>>>>>>>>>>> client or a cluster. It will be a great addition to HDFS
> >>>>>>>>>>>>>> and its users. Based on the latest branch codes, we also
> >>>>>>>>>>>>>> observed it's very reliable in the concurrent tests.
> >>>>>>>>>>>>>> We'll provide the perf test report after it's sorted out
> >>>>>>>>>>>>>> and hope it helps.
> >>>>>>>>>>>>>> Thanks!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Kai
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
> >>>>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
> >>>>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
> >>>>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the
> >>>>>>>>>>>>>> nice work.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>> Uma
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285
> >>>>>>>>>>>>>>> feature branch back to trunk. Since November 2014 we
> >>>>>>>>>>>>>>> have been designing and developing this feature under
> >>>>>>>>>>>>>>> the umbrella JIRAs HDFS-7285 and HADOOP-11264, and have
> >>>>>>>>>>>>>>> committed approximately 210 patches.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the
> >>>>>>>>>>>>>>> first phase of HDFS erasure coding (HDFS-EC). The
> >>>>>>>>>>>>>>> objective of HDFS-EC is to significantly reduce storage
> >>>>>>>>>>>>>>> space usage in HDFS clusters. Instead of always
> >>>>>>>>>>>>>>> creating 3 replicas of each block with 200% storage
> >>>>>>>>>>>>>>> space overhead, HDFS-EC provides data durability
> >>>>>>>>>>>>>>> through parity data blocks. With most EC
> >>>>>>>>>>>>>>> configurations, the storage overhead is no more than
> >>>>>>>>>>>>>>> 50%.
> >>>>>>>>>>>>>>> Based on profiling results of production clusters, we
> >>>>>>>>>>>>>>> decided to support EC with the striped block layout in
> >>>>>>>>>>>>>>> the first phase, so that small files can be better
> >>>>>>>>>>>>>>> handled. This means dividing each logical HDFS file
> >>>>>>>>>>>>>>> block into smaller units (striping cells) and spreading
> >>>>>>>>>>>>>>> them on a set of DataNodes in round-robin fashion.
> >>>>>>>>>>>>>>> Parity cells are generated for each stripe of original
> >>>>>>>>>>>>>>> data cells. We have made changes to NameNode, client,
> >>>>>>>>>>>>>>> and DataNode to generalize the block concept and handle
> >>>>>>>>>>>>>>> the mapping between a logical file block and its
> >>>>>>>>>>>>>>> internal storage blocks. For further details please see
> >>>>>>>>>>>>>>> the design doc on HDFS-7285.
> >>>>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and
> >>>>>>>>>>>>>>> high-performance codec calculation support.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The nightly Jenkins job of the branch has reported
> >>>>>>>>>>>>>>> several successful runs, and doesn't show new flaky
> >>>>>>>>>>>>>>> tests compared with trunk. We have posted several
> >>>>>>>>>>>>>>> versions of the test plan including both unit testing
> >>>>>>>>>>>>>>> and cluster testing, and have executed most tests in
> >>>>>>>>>>>>>>> the plan. The most basic functionalities have been
> >>>>>>>>>>>>>>> extensively tested and verified in several real
> >>>>>>>>>>>>>>> clusters with different hardware configurations;
> >>>>>>>>>>>>>>> results have been very stable.
> >>>>>>>>>>>>>>> We have created follow-on tasks for more advanced error
> >>>>>>>>>>>>>>> handling and optimization under the umbrella HDFS-8031.
> >>>>>>>>>>>>>>> We also plan to implement or harden the integration of
> >>>>>>>>>>>>>>> EC with existing features such as WebHDFS, snapshot,
> >>>>>>>>>>>>>>> append, truncate, hflush, hsync, and so forth.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Development of this feature has been a collaboration
> >>>>>>>>>>>>>>> across many companies and institutions. I'd like to
> >>>>>>>>>>>>>>> thank J. Andreina, Takanobu Asanuma, Vinayakumar B, Li
> >>>>>>>>>>>>>>> Bo, Takuya Fukudome, Uma Maheswara Rao G, Rui Li, Yi
> >>>>>>>>>>>>>>> Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
> >>>>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang,
> >>>>>>>>>>>>>>> Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for
> >>>>>>>>>>>>>>> their code contributions and reviews.
> >>>>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental
> >>>>>>>>>>>>>>> contributions to the initial design. Rui Li, Gao Rui,
> >>>>>>>>>>>>>>> Kai Sasaki, Kai Zheng and many other contributors have
> >>>>>>>>>>>>>>> made great efforts in system testing. Many thanks go to
> >>>>>>>>>>>>>>> Weihua Jiang for proposing the JIRA, and ATM, Todd
> >>>>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for
> >>>>>>>>>>>>>>> providing helpful feedback.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Following the community convention, this vote will last
> >>>>>>>>>>>>>>> for 7 days (ending September 29th).
> >>>>>>>>>>>>>>> Votes from Hadoop committers are binding but
> >>>>>>>>>>>>>>> non-binding votes are very welcome as well. And here's
> >>>>>>>>>>>>>>> my non-binding +1.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>> ---
> >>>>>>>>>>>>>>> Zhe Zhang
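
For readers skimming the thread, the storage figures and the striped layout described in the original vote message can be illustrated with a small sketch. This is only an illustration under stated assumptions: the (6 data, 3 parity) and (10 data, 4 parity) schemas and the 64 KB cell size are example values that do not appear in this thread, and the function names are hypothetical helpers, not Hadoop APIs.

```python
# Illustrative sketch only; names and example schemas are assumptions,
# not taken from the HDFS-7285 code or from this thread.

def replication_overhead(replicas):
    """Extra storage, as a fraction of the data size, for N-way replication."""
    return float(replicas - 1)


def ec_overhead(data_units, parity_units):
    """Extra storage, as a fraction of the data size, for an erasure code
    that adds `parity_units` parity blocks per `data_units` data blocks."""
    return parity_units / data_units


def cell_location(offset, cell_size, data_units):
    """Map a logical byte offset to (stripe index, data block index,
    offset inside the cell), assuming cells are placed round-robin
    across the data blocks of a block group."""
    cell_index = offset // cell_size
    stripe = cell_index // data_units
    block = cell_index % data_units
    return stripe, block, offset % cell_size


if __name__ == "__main__":
    print("3x replication overhead:", replication_overhead(3))  # 2.0, i.e. 200%
    print("(6, 3) schema overhead: ", ec_overhead(6, 3))        # 0.5, i.e. 50%
    print("(10, 4) schema overhead:", ec_overhead(10, 4))       # 0.4, i.e. 40%
    # The eighth cell (index 7) of a (6, 3) layout with 64 KB cells lands
    # in the second stripe, on the second data block.
    print(cell_location(7 * 65536 + 5, 65536, 6))               # (1, 1, 5)
```

Parity cell generation and the NameNode-side block group mapping are deliberately left out; the design doc on HDFS-7285 is the authoritative description.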