Thanks Chris.

+1 for reverting from 2.7. This is the least we should do. Can you help with
doing the needful?

I personally am not completely sold on a release with *only* the layout 
changes. Like I was saying before, we can let specific users backport this into 
specific 2.x branches they need and leave it only on trunk / branch-2. That 
said, I would love to hear others’ thoughts on this, but let’s fork that 
discussion off from this 2.7.3 thread. Re a fresh 2.8, I have renewed my 
efforts on 2.8 with a view to cutting an RC within weeks. Not sure if that does or 
doesn’t help this discussion.

Thanks
+Vinod

> On Apr 5, 2016, at 2:03 PM, Chris Trezzo <ctre...@gmail.com> wrote:
> 
> In light of the additional conversation on HDFS-8791, I would like to
> re-propose the following:
> 
> 1. Revert the new datanode layout (HDFS-8791) from the 2.7 branch. The
> layout change currently does not support downgrades, which breaks our
> upgrade/downgrade policies for dot releases.
> 
> 2. Cut a 2.8 release off of the 2.7.3 release with the addition of
> HDFS-8791. This would give customers a stable release that they could
> deploy with the new layout. As discussed on the jira, this is still in line
> with user expectation for minor releases as we have done layout changes in
> a number of 2.x minor releases already. The current 2.8 would become 2.9
> and continue its current release schedule.
> 
> What does everyone think? If unsupported downgrades between minor releases
> is still not agreeable, then as stated by Vinod, we would need to either
> add support for downgrades with dn layout changes or revert the layout
> change from branch-2. If we are OK with the layout change in a minor
> release, but think that the issue does not affect enough customers to
> warrant a separate release, we could simply leave it in branch-2 and let it
> be released with the current 2.8.
> 
> 
> On Mon, Apr 4, 2016 at 1:48 PM, Vinod Kumar Vavilapalli <vino...@apache.org>
> wrote:
> 
>> I commented on the JIRA way back (see
>> https://issues.apache.org/jira/browse/HDFS-8791?focusedCommentId=15036666&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15036666),
>> saying what I said below. Unfortunately, I haven’t followed the patch along
>> after my initial comment.
>> 
>> This isn’t about any specific release - starting 2.6 we declared support
>> for rolling upgrades and downgrades. Any patch that breaks this should not
>> be in branch-2.
>> 
>> Two options from where I stand
>> (1) For folks who worked on the patch: Is there a way to (a) make the
>> upgrade-downgrade seamless for people who don’t care about this and (b)
>> have explicit documentation for people who care to switch this behavior on
>> and are willing to risk not having downgrades? If this means a new
>> configuration property, so be it. It’s a necessary evil.
>> (2) Just let specific users backport this into specific 2.x branches they
>> need and leave it only on trunk.
>> 
>> Unless this behavior stops breaking rolling upgrades/downgrades, I think
>> we should just revert it from branch-2 and definitely 2.7.3 as it stands
>> today.
>> 
>> +Vinod
>> 
>> 
>>> On Apr 1, 2016, at 2:54 PM, Chris Trezzo <ctre...@gmail.com> wrote:
>>> 
>>> A few thoughts:
>>> 
>>> 1. To echo Andrew Wang, HDFS-8578 (parallel upgrades) should be a
>>> prerequisite for HDFS-8791. Without that patch, upgrades can be very slow
>>> for data nodes depending on your setup.
>>> 
>>> 2. We have already deployed this patch internally so, with my Twitter hat
>>> on, I would be perfectly happy as long as it makes it into trunk and 2.8.
>>> That being said, I would be hesitant to deploy the current 2.7.x or 2.6.x
>>> releases on a large production cluster that has a diverse set of block ids
>>> without this patch, especially if your data nodes have a large number of
>>> disks or you are using federation. To be clear though: this highly depends
>>> on your setup and at a minimum you should verify that this regression will
>>> not affect you. The current block-id based layout in 2.6.x and 2.7.2 has a
>>> performance regression that gets worse over time. When you see it happening
>>> on a live cluster, it is one of the harder issues to identify a root cause
>>> and debug. I do understand that this is currently only affecting a smaller
>>> number of users, but I also think this number has potential to increase as
>>> time goes on. Maybe we can issue a warning in the release notes for future
>>> 2.7.x and 2.6.x releases?
>>> 
>>> 3. One option (this was suggested on HDFS-8791 and I think Sean alluded to
>>> this proposal on this thread) would be to cut a 2.8 release off of the
>>> 2.7.3 release with the new layout. What people currently think of as 2.8
>>> would then become 2.9. This would give customers a stable release that they
>>> could deploy with the new layout and would not break upgrade and downgrade
>>> expectations.
>>> 
>>> On Fri, Apr 1, 2016 at 11:32 AM, Andrew Purtell <apurt...@apache.org>
>>> wrote:
>>> 
>>>> As a downstream consumer of Apache Hadoop 2.7.x releases, I expect we would
>>>> patch the release to revert HDFS-8791 before pushing it out to production.
>>>> For what it's worth.
>>>> 
>>>> 
>>>> On Fri, Apr 1, 2016 at 11:23 AM, Andrew Wang <andrew.w...@cloudera.com>
>>>> wrote:
>>>> 
>>>>> One other thing I wanted to bring up regarding HDFS-8791, we haven't
>>>>> backported the parallel DN upgrade improvement (HDFS-8578) to branch-2.6.
>>>>> HDFS-8578 is a very important related fix since otherwise upgrade will be
>>>>> very slow.
>>>>> 
>>>>> On Thu, Mar 31, 2016 at 10:35 AM, Andrew Wang <andrew.w...@cloudera.com>
>>>>> wrote:
>>>>> 
>>>>>> As I expressed on HDFS-8791, I do not want to include this JIRA in a
>>>>>> maintenance release. I've only seen it crop up on a handful of our
>>>>>> customers' clusters, and large users like Twitter and Yahoo that seem to be
>>>>>> more affected are also the most able to patch this change in themselves.
>>>>>> 
>>>>>> Layout upgrades are quite disruptive, and I don't think it's worth
>>>>>> breaking upgrade and downgrade expectations when it doesn't affect the (in
>>>>>> my experience) vast majority of users.
>>>>>> 
>>>>>> Vinod seemed to have a similar opinion in his comment on HDFS-8791, but
>>>>>> will let him elaborate.
>>>>>> 
>>>>>> Best,
>>>>>> Andrew
>>>>>> 
>>>>>> On Thu, Mar 31, 2016 at 9:11 AM, Sean Busbey <bus...@cloudera.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> As of 2 days ago, there were already 135 jiras associated with 2.7.3.
>>>>>>> If *any* of them end up introducing a regression, the inclusion of
>>>>>>> HDFS-8791 means that folks will have cluster downtime in order to back
>>>>>>> things out. If that happens to any substantial number of downstream
>>>>>>> folks, or any particularly vocal downstream folks, then it is very
>>>>>>> likely we'll lose the remaining trust of operators for rolling out
>>>>>>> maintenance releases. That's a pretty steep cost.
>>>>>>> 
>>>>>>> Please do not include HDFS-8791 in any 2.6.z release. Folks having to
>>>>>>> be aware that an upgrade from e.g. 2.6.5 to 2.7.2 will fail is an
>>>>>>> unreasonable burden.
>>>>>>> 
>>>>>>> I agree that this fix is important, I just think we should either cut
>>>>>>> a version of 2.8 that includes it or find a way to do it that gives an
>>>>>>> operational path for rolling downgrade.
>>>>>>> 
>>>>>>> On Thu, Mar 31, 2016 at 10:10 AM, Junping Du <j...@hortonworks.com>
>>>>>>> wrote:
>>>>>>>> Thanks for bringing up this topic, Sean.
>>>>>>>> When I released our latest Hadoop release, 2.6.4, the HDFS-8791 patch
>>>>>>>> had not been committed yet, which is why we didn't discuss this earlier.
>>>>>>>> I remember that in the JIRA discussion we treated this layout change as a
>>>>>>>> Blocker bug fixing a significant performance regression, not as a
>>>>>>>> normal performance improvement. And I believe the HDFS community already
>>>>>>>> did their best, with care and patience, to deliver the fix and other
>>>>>>>> related patches (like the upgrade fix in HDFS-8578). Take HDFS-8578 as an
>>>>>>>> example: you can see 30+ rounds of patch review back and forth by senior
>>>>>>>> committers, not to mention the outstanding performance test data in
>>>>>>>> HDFS-8791.
>>>>>>>> I would trust our HDFS committers' judgement to land HDFS-8791 on
>>>>>>>> 2.7.3. However, that needs final confirmation from Vinod, who serves as
>>>>>>>> RM for branch-2.7. In addition, I don't see any blocker to bringing it
>>>>>>>> into 2.6.5 now.
>>>>>>>> Just my 2 cents.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Junping
>>>>>>>> 
>>>>>>>> ________________________________________
>>>>>>>> From: Sean Busbey <bus...@cloudera.com>
>>>>>>>> Sent: Thursday, March 31, 2016 2:57 PM
>>>>>>>> To: hdfs-...@hadoop.apache.org
>>>>>>>> Cc: Hadoop Common; yarn-...@hadoop.apache.org;
>>>>>>> mapreduce-...@hadoop.apache.org
>>>>>>>> Subject: Re: 2.7.3 release plan
>>>>>>>> 
>>>>>>>> A layout change in a maintenance release sounds very risky. I saw some
>>>>>>>> discussion on the JIRA about those risks, but the consensus seemed to
>>>>>>>> be "we'll leave it up to the 2.6 and 2.7 release managers." I thought
>>>>>>>> we did RMs per release rather than per branch? No one claiming to be a
>>>>>>>> release manager ever spoke up AFAICT.
>>>>>>>> 
>>>>>>>> Should this change be included? Should it go into a special 2.8
>>>>>>>> release as mentioned in the ticket?
>>>>>>>> 
>>>>>>>> On Thu, Mar 31, 2016 at 1:45 AM, Akira AJISAKA
>>>>>>>> <ajisa...@oss.nttdata.co.jp> wrote:
>>>>>>>>> Thank you Vinod!
>>>>>>>>> 
>>>>>>>>> FYI: 2.7.3 will be a bit special release.
>>>>>>>>> 
>>>>>>>>> HDFS-8791 bumped up the datanode layout version,
>>>>>>>>> so rolling downgrade from 2.7.3 to 2.7.[0-2]
>>>>>>>>> is impossible. We can rollback instead.
>>>>>>>>> 
>>>>>>>>> https://issues.apache.org/jira/browse/HDFS-8791
>>>>>>>>> 
>>>>>>>>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Akira
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On 3/31/16 08:18, Vinod Kumar Vavilapalli wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi all,
>>>>>>>>>> 
>>>>>>>>>> Got nudged about 2.7.3. Was previously waiting for 2.6.4 to go out (which
>>>>>>>>>> did go out mid February). Got a little busy since.
>>>>>>>>>> 
>>>>>>>>>> Following up the 2.7.2 maintenance release, we should work towards a
>>>>>>>>>> 2.7.3. The focus obviously is to have blocker issues [1], bug-fixes and *no*
>>>>>>>>>> features / improvements.
>>>>>>>>>> 
>>>>>>>>>> I hope to cut an RC in a week - giving enough time for outstanding blocker
>>>>>>>>>> / critical issues. Will start moving out any tickets that are not blockers
>>>>>>>>>> and/or won’t fit the timeline - there are 3 blockers and 15 critical tickets
>>>>>>>>>> outstanding as of now.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> +Vinod
>>>>>>>>>> 
>>>>>>>>>> [1] 2.7.3 release blockers:
>>>>>>>>>> https://issues.apache.org/jira/issues/?filter=12335343
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> busbey
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> busbey
>>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> 
>>>>  - Andy
>>>> 
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>> (via Tom White)
>>>> 
>> 
>> 
