Re: Erasure coding in branch-2 [Was Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk]

Gangumalla, Uma Mon, 02 Nov 2015 13:41:48 -0800

+1 for EC to go into 2.9. Yes, 3.x would be a long way off when we plan to
have 2.8 and 2.9 releases.


Regards,
Uma

On 11/2/15, 11:49 AM, "Vinod Vavilapalli" <vino...@hortonworks.com> wrote:

>Forking the thread. Started looking at the 2.8 list, various features'
>status and arrived here.
>
>While I understand the pervasive nature of EC and a need for a
>significant bake-in, moving this to a 3.x release is not a good idea. We
>will surely get a 2.8 out this year and, as needed, I can even spend time
>getting started on a 2.9. OTOH, 3.x is a long way off, and given all the
>incompatibilities there, it would be a while before users can get their
>hands on EC if it were to be only on 3.x. At best, this may force sites
>that want EC to backport the entire EC feature to older releases, at
>worst this will repeat the mess of 0.20 security release forks.
>
>If we think adding this to 2.8 (even if it is switched off) is too much risk
>per our original plan, let's move this to 2.9, thereby leaving enough
>time for stability, integration testing and bake-in, and a realistic
>chance of having it end up on users' clusters soonish.
>
>+Vinod
>
>> On Oct 19, 2015, at 1:44 PM, Andrew Wang <andrew.w...@cloudera.com> wrote:
>> 
>> I think our plan thus far has been to target this for 3.0. I'm okay with
>> putting it in branch-2 if we've given a hard look at compatibility, but
>> I'll note though that 2.8 is already looking like quite a large release,
>> and our release bandwidth has been focused on the 2.6 and 2.7 maintenance
>> releases. Adding another multi-hundred JIRAs to 2.8 might make it too
>> unwieldy to get out the door. If we bump EC past that, 3.0 might very well
>> be our next release vehicle. I do plan to revive the 3.0 schedule some time
>> next year. With EC and JDK8 in a good spot, the only big feature remaining
>> is classpath isolation.
>> 
>> EC is also a pretty fundamental change to HDFS. Even if it's compatible, in
>> terms of size and impact it might best belong in a new major release.
>> 
>> Best,
>> Andrew
>> 
>> On Fri, Oct 16, 2015 at 7:04 PM, Vinayakumar B <
>> vinayakumarb.apa...@gmail.com> wrote:
>> 
>>> Does anyone else also think that the feature is ready to go to branch-2
>>> as well?
>>> 
>>> It has been more than 2 weeks since EC landed on trunk. IMO it looks
>>> quite stable since then and ready to go into branch-2.
>>> 
>>> -Vinay
>>> On Oct 6, 2015 12:51 AM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>> 
>>>> Thanks Vinay for capturing the issue and Uma for offering the help.
>>>> 
>>>> ---
>>>> Zhe Zhang
>>>> 
>>>> On Mon, Oct 5, 2015 at 12:19 PM, Gangumalla, Uma
>>>> <uma.ganguma...@intel.com> wrote:
>>>> 
>>>>> Vinay,
>>>>> 
>>>>> 
>>>>> I would merge them as part of HDFS-9182.
>>>>> 
>>>>> Thanks,
>>>>> Uma
>>>>> 
>>>>> 
>>>>> 
>>>>> On 10/5/15, 12:48 AM, "Vinayakumar B" <vinayakum...@apache.org> wrote:
>>>>> 
>>>>>> Hi Andrew,
>>>>>> I see CHANGES.txt entries not yet merged from
>>>>>> CHANGES-HDFS-EC-7285.txt.
>>>>>> 
>>>>>> Was this intentional?
>>>>>> 
>>>>>> Regards,
>>>>>> Vinay
>>>>>> 
>>>>>> On Wed, Sep 30, 2015 at 9:15 PM, Andrew Wang
>>>>>> <andrew.w...@cloudera.com> wrote:
>>>>>> 
>>>>>>> Branch has been merged to trunk, thanks again to everyone who worked
>>>>>>> on the feature!
>>>>>>> 
>>>>>>> On Tue, Sep 29, 2015 at 10:44 PM, Zhe Zhang <zhezh...@cloudera.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Thanks everyone who has participated in this discussion.
>>>>>>>> 
>>>>>>>> With 7 +1's (5 binding and 2 non-binding), and no -1, this vote has
>>>>>>>> passed. I will do a final 'git merge' with trunk and work with Andrew
>>>>>>>> to merge the branch to trunk. I'll update on this thread when the
>>>>>>>> merge is done.
>>>>>>>> 
>>>>>>>> ---
>>>>>>>> Zhe Zhang
>>>>>>>> 
>>>>>>>> On Thu, Sep 24, 2015 at 11:08 PM, Liu, Yi A <yi.a....@intel.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> (Change it to binding.)
>>>>>>>>> 
>>>>>>>>> +1
>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>>> feature branch. It's a great feature and I think it's ready to be
>>>>>>>>> merged into trunk.
>>>>>>>>> 
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Liu, Yi A
>>>>>>>>> Sent: Friday, September 25, 2015 1:51 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: RE: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>> 
>>>>>>>>> +1 (non-binding)
>>>>>>>>> I have been involved in the development and code review on the
>>>>>>>>> feature branch. It's a great feature and I think it's ready to be
>>>>>>>>> merged into trunk.
>>>>>>>>> 
>>>>>>>>> Thanks all for the contribution.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Yi Liu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Vinayakumar B [mailto:vinayakum...@apache.org]
>>>>>>>>> Sent: Friday, September 25, 2015 12:21 PM
>>>>>>>>> To: hdfs-dev@hadoop.apache.org
>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>> 
>>>>>>>>> +1,
>>>>>>>>> 
>>>>>>>>> I've been involved starting from design and development of
>>>>>>>>> ErasureCoding. I think phase 1 of this development is ready to be
>>>>>>>>> merged to trunk. It has come a long way to the current state with
>>>>>>>>> significant effort of many Contributors and Reviewers for both
>>>>>>>>> design and code.
>>>>>>>>> 
>>>>>>>>> Thanks Everyone for the efforts.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Vinay
>>>>>>>>> 
>>>>>>>>> On Wed, Sep 23, 2015 at 10:53 PM, Jing Zhao <ji...@apache.org> wrote:
>>>>>>>>> 
>>>>>>>>>> +1
>>>>>>>>>> 
>>>>>>>>>> I've been involved in both development and review on the branch,
>>>>>>>>>> and I believe it's now ready to get merged into trunk. Many thanks
>>>>>>>>>> to all the contributors and reviewers!
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> -Jing
>>>>>>>>>> 
>>>>>>>>>> On Tue, Sep 22, 2015 at 6:17 PM, Zheng, Kai <kai.zh...@intel.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Non-binding +1
>>>>>>>>>>> 
>>>>>>>>>>> According to our extensive performance tests, striping + ISA-L
>>>>>>>>>>> coder based erasure coding not only can save storage, but also can
>>>>>>>>>>> increase the throughput of a client or a cluster. It will be a
>>>>>>>>>>> great addition to HDFS and its users. Based on the latest branch
>>>>>>>>>>> code, we also observed it's very reliable in the concurrent tests.
>>>>>>>>>>> We'll provide the perf test report after it's sorted out and hope
>>>>>>>>>>> it helps.
>>>>>>>>>>> Thanks!
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Kai
>>>>>>>>>>> 
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: Gangumalla, Uma [mailto:uma.ganguma...@intel.com]
>>>>>>>>>>> Sent: Wednesday, September 23, 2015 8:50 AM
>>>>>>>>>>> To: hdfs-dev@hadoop.apache.org; common-...@hadoop.apache.org
>>>>>>>>>>> Subject: Re: [VOTE] Merge HDFS-7285 (erasure coding) branch to trunk
>>>>>>>>>>> 
>>>>>>>>>>> +1
>>>>>>>>>>> 
>>>>>>>>>>> Great addition to HDFS. Thanks all contributors for the nice work.
>>>>>>>>>>> 
>>>>>>>>>>> Regards,
>>>>>>>>>>> Uma
>>>>>>>>>>> 
>>>>>>>>>>> On 9/22/15, 3:40 PM, "Zhe Zhang" <zhezh...@cloudera.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> I'd like to propose a vote to merge the HDFS-7285 feature branch
>>>>>>>>>>>> back to trunk. Since November 2014 we have been designing and
>>>>>>>>>>>> developing this feature under the umbrella JIRAs HDFS-7285 and
>>>>>>>>>>>> HADOOP-11264, and have committed approximately 210 patches.
>>>>>>>>>>>> 
>>>>>>>>>>>> The HDFS-7285 feature branch was created to support the first
>>>>>>>>>>>> phase of HDFS erasure coding (HDFS-EC). The objective of HDFS-EC
>>>>>>>>>>>> is to significantly reduce storage space usage in HDFS clusters.
>>>>>>>>>>>> Instead of always creating 3 replicas of each block with 200%
>>>>>>>>>>>> storage space overhead, HDFS-EC provides data durability through
>>>>>>>>>>>> parity data blocks. With most EC configurations, the storage
>>>>>>>>>>>> overhead is no more than 50%.
>>>>>>>>>>>> Based on profiling results of production clusters, we decided to
>>>>>>>>>>>> support EC with the striped block layout in the first phase, so
>>>>>>>>>>>> that small files can be better handled. This means dividing each
>>>>>>>>>>>> logical HDFS file block into smaller units (striping cells) and
>>>>>>>>>>>> spreading them on a set of DataNodes in round-robin fashion.
>>>>>>>>>>>> Parity cells are generated for each stripe of original data
>>>>>>>>>>>> cells. We have made changes to NameNode, client, and DataNode to
>>>>>>>>>>>> generalize the block concept and handle the mapping between a
>>>>>>>>>>>> logical file block and its internal storage blocks. For further
>>>>>>>>>>>> details please see the design doc on HDFS-7285.
>>>>>>>>>>>> HADOOP-11264 focuses on providing flexible and high-performance
>>>>>>>>>>>> codec calculation support.
>>>>>>>>>>>> 
>>>>>>>>>>>> The nightly Jenkins job of the branch has reported several
>>>>>>>>>>>> successful runs, and doesn't show new flaky tests compared with
>>>>>>>>>>>> trunk. We have posted several versions of the test plan including
>>>>>>>>>>>> both unit testing and cluster testing, and have executed most
>>>>>>>>>>>> tests in the plan. The most basic functionalities have been
>>>>>>>>>>>> extensively tested and verified in several real clusters with
>>>>>>>>>>>> different hardware configurations; results have been very stable.
>>>>>>>>>>>> We have created follow-on tasks for more advanced error handling
>>>>>>>>>>>> and optimization under the umbrella HDFS-8031.
>>>>>>>>>>>> We also plan to implement or harden the integration of EC with
>>>>>>>>>>>> existing features such as WebHDFS, snapshot, append, truncate,
>>>>>>>>>>>> hflush, hsync, and so forth.
>>>>>>>>>>>> 
>>>>>>>>>>>> Development of this feature has been a collaboration across many
>>>>>>>>>>>> companies and institutions. I'd like to thank J. Andreina,
>>>>>>>>>>>> Takanobu Asanuma, Vinayakumar B, Li Bo, Takuya Fukudome, Uma
>>>>>>>>>>>> Maheswara Rao G, Rui Li, Yi Liu, Colin McCabe, Xinwei Qin,
>>>>>>>>>>>> Rakesh R, Gao Rui, Kai Sasaki, Walter Su, Tsz Wo Nicholas Sze,
>>>>>>>>>>>> Andrew Wang, Yong Zhang, Jing Zhao, Hui Zheng and Kai Zheng for
>>>>>>>>>>>> their code contributions and reviews.
>>>>>>>>>>>> Andrew and Kai Zheng also made fundamental contributions to the
>>>>>>>>>>>> initial design. Rui Li, Gao Rui, Kai Sasaki, Kai Zheng and many
>>>>>>>>>>>> other contributors have made great efforts in system testing.
>>>>>>>>>>>> Many thanks go to Weihua Jiang for proposing the JIRA, and ATM,
>>>>>>>>>>>> Todd Lipcon, Silvius Rus, Suresh, as well as many others for
>>>>>>>>>>>> providing helpful feedback.
>>>>>>>>>>>> 
>>>>>>>>>>>> Following the community convention, this vote will last for 7
>>>>>>>>>>>> days (ending September 29th). Votes from Hadoop committers are
>>>>>>>>>>>> binding but non-binding votes are very welcome as well. And
>>>>>>>>>>>> here's my non-binding +1.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> ---
>>>>>>>>>>>> Zhe Zhang
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>
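
For reference, the storage arithmetic and the striped layout described in
the original vote proposal quoted above can be sketched roughly as follows.
This is only an illustration, not code from the HDFS-7285 branch: the
6 data + 3 parity schema, the 64 KB cell size, and all function names are
assumptions picked for the example.

```python
from typing import Tuple


def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of the original data, for n-way replication."""
    return float(replicas - 1)


def ec_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a fraction of the original data, for an erasure code
    that adds `parity_units` parity blocks per `data_units` data blocks."""
    return parity_units / data_units


def cell_location(offset: int, cell_size: int, data_units: int) -> Tuple[int, int]:
    """Map a byte offset in a logical block to (data block index, stripe index),
    assuming cells are placed on the data blocks in round-robin order."""
    cell = offset // cell_size
    return cell % data_units, cell // data_units


if __name__ == "__main__":
    # Hypothetical schema and cell size, chosen only for illustration.
    DATA_UNITS, PARITY_UNITS, CELL_SIZE = 6, 3, 64 * 1024

    print("3x replication overhead:", replication_overhead(3))
    print("EC overhead:", ec_overhead(DATA_UNITS, PARITY_UNITS))
    print("location of byte 500000:", cell_location(500000, CELL_SIZE, DATA_UNITS))
```

With these assumed parameters, 3 parity units per 6 data units matches the
"no more than 50%" figure in the proposal, compared with 200% for three
replicas.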
