+1 for EC to go into 2.9. Yes, 3.x would be a long way off when we plan to have 2.8 and 2.9 releases.
Regards,
Uma

On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vino...@hortonworks.com> wrote:

>Forking the thread. Started looking at the 2.8 list, various features'
>status and arrived here.
>
>While I understand the pervasive nature of EC and a need for a
>significant bake-in, moving this to a 3.x release is not a good idea. We
>will surely get a 2.8 out this year and, as needed, I can even spend time
>getting started on a 2.9. OTOH, 3.x is a long way off, and given all the
>incompatibilities there, it would be a while before users can get their
>hands on EC if it were to be only on 3.x. At best, this may force sites
>that want EC to backport the entire EC feature to older releases; at
>worst, this will repeat the mess of the 0.20 security release forks.
>
>If we think adding this to 2.8 (even if it is switched off) is too much risk
>per our original plan, let's move this to 2.9, thereby leaving enough
>time for stability, integration testing and bake-in, and a realistic
>chance of having it end up on users' clusters soonish.
>
>+Vinod
>
>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>
>> I think our plan thus far has been to target this for 3.0. I'm okay with
>> putting it in branch-2 if we've given a hard look at compatibility, but
>> I'll note though that 2.8 is already looking like quite a large release,
>> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
>> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
>> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
>> be our next release vehicle. I do plan to revive the 3.0 schedule some time
>> next year. With EC and JDK8 in a good spot, the only big feature remaining
>> is classpath isolation.
>>
>> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
>> terms of size and impact it might best belong in a new major release.
>>
>> Best,
>> Andrew
>>
>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <vinayakumarb.apa...@gmail.com> wrote:
>>
>>> Does anyone else also think that the feature is ready to go to branch-2 as
>>> well?
>>>
>>> It has been more than 2 weeks since EC landed on trunk. IMO it looks quite
>>> stable since then and ready to go into branch-2.
>>>
>>> -Vinay
>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>
>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>>
>>>> ---
>>>> Zhe Zhang
>>>>
>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma <uma.ganguma...@intel.com> wrote:
>>>>
>>>>> Vinay,
>>>>>
>>>>> I would merge them as part of HDFS-9182.
>>>>>
>>>>> Thanks,
>>>>> Uma
>>>>>
>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org> wrote:
>>>>>
>>>>>> Hi Andrew,
>>>>>> I see CHANGES.txt entries not yet merged from CHANGES-HDFS-EC-7285.txt.
>>>>>>
>>>>>> Was this intentional?
>>>>>>
>>>>>> Regards,
>>>>>> Vinay
>>>>>>
>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>>>>>>
>>>>>>> Branch has been merged to trunk, thanks again to everyone who worked on
>>>>>>> the feature!
>>>>>>>
>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezh...@cloudera.com> wrote:
>>>>>>>
>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>>
>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has passed.
>>>>>>>> I will do a final 'git merge' with trunk and work with Andrew to merge the
>>>>>>>> branch to trunk. I'll update on this thread when the merge is done.
>>>>>>>>
>>>>>>>> ---
>>>>>>>> Zhe Zhang
>>>>>>>>
>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com> wrote:
>>>>>>>>
>>>>>>>>> (Change it to binding.)
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>> I have been involved in the development and code review on the feature
>>>>>>>>> branch. It's a great feature and I think it's ready to be merged into trunk.
>>>>>>>>>
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Liu, Yi A
>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>
>>>>>>>>> +1 (non-binding)
>>>>>>>>> I have been involved in the development and code review on the feature
>>>>>>>>> branch. It's a great feature and I think it's ready to be merged into trunk.
>>>>>>>>>
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>
>>>>>>>>> +1,
>>>>>>>>>
>>>>>>>>> I've been involved starting from design and development of ErasureCoding.
>>>>>>>>> I think phase 1 of this development is ready to be merged to trunk.
>>>>>>>>> It has come a long way to the current state with significant effort from
>>>>>>>>> many Contributors and Reviewers for both design and code.
>>>>>>>>>
>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Vinay
>>>>>>>>>
>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>> I've been involved in both development and review on the branch, and I
>>>>>>>>>> believe it's now ready to get merged into trunk.
>>>>>>>>>> Many thanks to all
>>>>>>>>>> the contributors and reviewers!
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> -Jing
>>>>>>>>>>
>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Non-binding +1
>>>>>>>>>>>
>>>>>>>>>>> According to our extensive performance tests, striping + ISA-L coder based
>>>>>>>>>>> erasure coding not only can save storage, but also can increase the
>>>>>>>>>>> throughput of a client or a cluster. It will be a great addition to
>>>>>>>>>>> HDFS and its users. Based on the latest branch code, we also
>>>>>>>>>>> observed it's very reliable in the concurrent tests. We'll provide the
>>>>>>>>>>> perf test report after it's sorted out and hope it helps.
>>>>>>>>>>> Thanks!
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Kai
>>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>>
>>>>>>>>>>> +1
>>>>>>>>>>>
>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Uma
>>>>>>>>>>>
>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature branch
>>>>>>>>>>>> back to trunk. Since November 2014 we have been designing and
>>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285 and
>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>>
>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first phase
>>>>>>>>>>>> of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC is to
>>>>>>>>>>>> significantly reduce storage space usage in HDFS clusters. Instead
>>>>>>>>>>>> of always creating 3 replicas of each block with 200% storage space
>>>>>>>>>>>> overhead, HDFS-EC provides data durability through parity data blocks.
>>>>>>>>>>>> With most EC configurations, the storage overhead is no more than 50%.
>>>>>>>>>>>> Based on profiling results of production clusters, we decided to
>>>>>>>>>>>> support EC with the striped block layout in the first phase, so
>>>>>>>>>>>> that small files can be better handled. This means dividing each
>>>>>>>>>>>> logical HDFS file block into smaller units (striping cells) and
>>>>>>>>>>>> spreading them on a set of DataNodes in round-robin fashion. Parity
>>>>>>>>>>>> cells are generated for each stripe of original data cells. We have
>>>>>>>>>>>> made changes to NameNode, client, and DataNode to generalize the
>>>>>>>>>>>> block concept and handle the mapping between a logical file block
>>>>>>>>>>>> and its internal storage blocks. For further details please see the
>>>>>>>>>>>> design doc on HDFS-7285.
>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and high-performance
>>>>>>>>>>>> codec calculation support.
>>>>>>>>>>>>
>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared with
>>>>>>>>>>>> trunk. We have posted several versions of the test plan including
>>>>>>>>>>>> both unit testing and cluster testing, and have executed most tests
>>>>>>>>>>>> in the plan.
>>>>>>>>>>>> The most basic functionalities have been extensively
>>>>>>>>>>>> tested and verified in several real clusters with different
>>>>>>>>>>>> hardware configurations; results have been very stable. We have
>>>>>>>>>>>> created follow-on tasks for more advanced error handling and
>>>>>>>>>>>> optimization under the umbrella HDFS-8031.
>>>>>>>>>>>> We also plan to implement or harden the integration of EC with
>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append, truncate,
>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>>
>>>>>>>>>>>> Development of this feature has been a collaboration across many
>>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina, Takanobu
>>>>>>>>>>>> Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma Maheswara Rao
>>>>>>>>>>>> G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin, Rakesh R, Gao Rui, Kai
>>>>>>>>>>>> Sasaki, Walter Su, Tsz Wo Nicholas Sze, Andrew Wang, Yong Zhang,
>>>>>>>>>>>> Jing Zhao, Hui Zheng and Kai Zheng for their code contributions and
>>>>>>>>>>>> reviews.
>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to the
>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
>>>>>>>>>>>> other contributors have made great efforts in system testing. Many
>>>>>>>>>>>> thanks go to Weihua Jiang for proposing the JIRA, and ATM, Todd
>>>>>>>>>>>> Lipcon, Silvius Rus, Suresh, as well as many others for providing
>>>>>>>>>>>> helpful feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> Following the community convention, this vote will last for 7 days
>>>>>>>>>>>> (ending September 29th). Votes from Hadoop committers are binding
>>>>>>>>>>>> but non-binding votes are very welcome as well. And here's my
>>>>>>>>>>>> non-binding +1.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> ---
>>>>>>>>>>>> Zhe Zhang
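
The storage arithmetic and the round-robin striping described in the original vote proposal can be sketched roughly as below. This is only an illustration, not HDFS code: the function names, the 6 data + 3 parity schema, and the tiny cell size are assumptions made for the example (the proposal only says that most EC configurations stay at or below 50% overhead), and parity generation itself is left out.

```python
def storage_overhead(data_units: int, redundant_units: int) -> float:
    """Extra storage as a fraction of the original data size."""
    return redundant_units / data_units


def stripe_cells(data: bytes, cell_size: int, num_data_nodes: int) -> list[list[bytes]]:
    """Split data into striping cells and deal them round-robin to data nodes.

    Hypothetical helper for illustration; real HDFS-EC also computes parity
    cells for each stripe, which is not modelled here.
    """
    nodes: list[list[bytes]] = [[] for _ in range(num_data_nodes)]
    for offset in range(0, len(data), cell_size):
        cell_index = offset // cell_size
        nodes[cell_index % num_data_nodes].append(data[offset:offset + cell_size])
    return nodes


# 3-way replication: 1 original copy plus 2 extra copies.
replication_overhead = storage_overhead(data_units=1, redundant_units=2)

# Assumed schema for illustration: 6 data cells + 3 parity cells per stripe.
ec_overhead = storage_overhead(data_units=6, redundant_units=3)

# Five 2-byte cells spread over 3 (hypothetical) DataNodes.
layout = stripe_cells(b"abcdefghij", cell_size=2, num_data_nodes=3)

print(f"replication overhead: {replication_overhead:.0%}")
print(f"erasure coding overhead: {ec_overhead:.0%}")
print(layout)
```

Under these assumptions the replication case works out to 200% overhead and the 6+3 case to 50%, matching the figures quoted in the proposal.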