A few thoughts:

1. To echo Andrew Wang, HDFS-8578 (parallel upgrades) should be a
prerequisite for HDFS-8791. Without that patch, data node upgrades can be
very slow, depending on your setup.

2. We have already deployed this patch internally so, with my Twitter hat
on, I would be perfectly happy as long as it makes it into trunk and 2.8.
That being said, I would be hesitant to deploy the current 2.7.x or 2.6.x
releases on a large production cluster that has a diverse set of block ids
without this patch, especially if your data nodes have a large number of
disks or you are using federation. To be clear though: this depends heavily
on your setup, and at a minimum you should verify that this regression will
not affect you. The current block-id based layout in 2.6.x and 2.7.2 has a
performance regression that gets worse over time. When you see it happening
on a live cluster, it is one of the harder issues to root-cause and debug.
I do understand that this currently affects only a small number of users,
but I also think this number has the potential to increase as time goes on.
Maybe we can issue a warning in the release notes for future 2.7.x and
2.6.x releases?

3. One option (this was suggested on HDFS-8791 and I think Sean alluded to
this proposal on this thread) would be to cut a 2.8 release off of the
2.7.3 release with the new layout. What people currently think of as 2.8
would then become 2.9. This would give customers a stable release that they
could deploy with the new layout and would not break upgrade and downgrade
expectations.
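
To illustrate point 2, here is a rough sketch of why a diverse set of block
ids matters under a block-id based layout. It is not the actual DataNode
code. The bit shifts and the two masks (0xFF for the current layout, 0x1F
for the layout after HDFS-8791) are assumptions based on my understanding of
those layouts and are not stated in this thread; the function names and the
sample block ids are made up for illustration.

```python
# Sketch only: maps block ids to a two-level subdirectory and counts how many
# distinct leaf directories a set of block ids touches.
# ASSUMPTION: the directory indices come from bits 16+ and 8+ of the block id,
# masked with 0xFF (current layout) or 0x1F (layout after HDFS-8791).

OLD_MASK = 0xFF  # assumed: up to 256 x 256 leaf directories
NEW_MASK = 0x1F  # assumed: up to 32 x 32 leaf directories


def block_dir(block_id: int, mask: int) -> str:
    """Return the hypothetical two-level subdirectory for a block id."""
    d1 = (block_id >> 16) & mask
    d2 = (block_id >> 8) & mask
    return f"subdir{d1}/subdir{d2}"


def count_leaf_dirs(block_ids, mask: int) -> int:
    """Count the distinct leaf directories used by the given block ids."""
    return len({block_dir(b, mask) for b in block_ids})


# Synthetic, deliberately spread-out block ids (hypothetical data).
sample_ids = [1073741825 + i * 256 for i in range(5000)]

old_dirs = count_leaf_dirs(sample_ids, OLD_MASK)
new_dirs = count_leaf_dirs(sample_ids, NEW_MASK)
print(f"leaf directories, old mask: {old_dirs}, new mask: {new_dirs}")
```

Running something like this against the block ids actually present on one of
your data node volumes is one way to start checking whether the regression
could affect you: the more leaf directories per disk, the more directory
entries the data node has to walk.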

On Fri, Apr 1, 2016 at 11:32 AM, Andrew Purtell <apurt...@apache.org> wrote:

> As a downstream consumer of Apache Hadoop 2.7.x releases, I expect we would
> patch the release to revert HDFS-8791 before pushing it out to production.
> For what it's worth.
>
>
> On Fri, Apr 1, 2016 at 11:23 AM, Andrew Wang <andrew.w...@cloudera.com>
> wrote:
>
> > One other thing I wanted to bring up regarding HDFS-8791: we haven't
> > backported the parallel DN upgrade improvement (HDFS-8578) to branch-2.6.
> > HDFS-8578 is a very important related fix since otherwise upgrade will be
> > very slow.
> >
> > On Thu, Mar 31, 2016 at 10:35 AM, Andrew Wang <andrew.w...@cloudera.com>
> > wrote:
> >
> > > As I expressed on HDFS-8791, I do not want to include this JIRA in a
> > > maintenance release. I've only seen it crop up on a handful of our
> > > customers' clusters, and large users like Twitter and Yahoo that seem
> > > to be more affected are also the most able to patch this change in
> > > themselves.
> > >
> > > Layout upgrades are quite disruptive, and I don't think it's worth
> > > breaking upgrade and downgrade expectations when it doesn't affect the
> > > (in my experience) vast majority of users.
> > >
> > > Vinod seemed to have a similar opinion in his comment on HDFS-8791, but
> > > will let him elaborate.
> > >
> > > Best,
> > > Andrew
> > >
> > > On Thu, Mar 31, 2016 at 9:11 AM, Sean Busbey <bus...@cloudera.com> wrote:
> > >
> > >> As of 2 days ago, there were already 135 jiras associated with 2.7.3.
> > >> If *any* of them end up introducing a regression, the inclusion of
> > >> HDFS-8791 means that folks will have cluster downtime in order to back
> > >> things out. If that happens to any substantial number of downstream
> > >> folks, or any particularly vocal downstream folks, then it is very
> > >> likely we'll lose the remaining trust of operators for rolling out
> > >> maintenance releases. That's a pretty steep cost.
> > >>
> > >> Please do not include HDFS-8791 in any 2.6.z release. Folks having to
> > >> be aware that an upgrade from e.g. 2.6.5 to 2.7.2 will fail is an
> > >> unreasonable burden.
> > >>
> > >> I agree that this fix is important, I just think we should either cut
> > >> a version of 2.8 that includes it or find a way to do it that gives an
> > >> operational path for rolling downgrade.
> > >>
> > >> On Thu, Mar 31, 2016 at 10:10 AM, Junping Du <j...@hortonworks.com> wrote:
> > >> > Thanks for bringing up this topic, Sean.
> > >> > When I released our latest Hadoop release, 2.6.4, the HDFS-8791 patch
> > >> > hadn't been committed yet, which is why we didn't discuss this earlier.
> > >> > I remember that in the JIRA discussion we treated this layout change as
> > >> > a Blocker bug fixing a significant performance regression, not as a
> > >> > normal performance improvement. And I believe the HDFS community has
> > >> > already done its best, carefully and patiently, to deliver the fix and
> > >> > other related patches (like the upgrade fix in HDFS-8578). Taking
> > >> > HDFS-8578 as an example, you can see 30+ rounds of patch review back
> > >> > and forth by senior committers, not to mention the outstanding
> > >> > performance test data in HDFS-8791.
> > >> > I would trust our HDFS committers' judgement to land HDFS-8791 on
> > >> > 2.7.3. However, that needs final confirmation from Vinod, who serves as
> > >> > RM for branch-2.7. In addition, I don't see any blocker issue with
> > >> > bringing it into 2.6.5 now.
> > >> > Just my 2 cents.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Junping
> > >> >
> > >> > ________________________________________
> > >> > From: Sean Busbey <bus...@cloudera.com>
> > >> > Sent: Thursday, March 31, 2016 2:57 PM
> > >> > To: hdfs-...@hadoop.apache.org
> > >> > Cc: Hadoop Common; yarn-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org
> > >> > Subject: Re: 2.7.3 release plan
> > >> >
> > >> > A layout change in a maintenance release sounds very risky. I saw some
> > >> > discussion on the JIRA about those risks, but the consensus seemed to
> > >> > be "we'll leave it up to the 2.6 and 2.7 release managers." I thought
> > >> > we did RMs per release rather than per branch? No one claiming to be a
> > >> > release manager ever spoke up AFAICT.
> > >> >
> > >> > Should this change be included? Should it go into a special 2.8
> > >> > release as mentioned in the ticket?
> > >> >
> > >> > On Thu, Mar 31, 2016 at 1:45 AM, Akira AJISAKA
> > >> > <ajisa...@oss.nttdata.co.jp> wrote:
> > >> >> Thank you Vinod!
> > >> >>
> > >> >> FYI: 2.7.3 will be a somewhat special release.
> > >> >>
> > >> >> HDFS-8791 bumped up the datanode layout version,
> > >> >> so rolling downgrade from 2.7.3 to 2.7.[0-2]
> > >> >> is impossible. We can roll back instead.
> > >> >>
> > >> >> https://issues.apache.org/jira/browse/HDFS-8791
> > >> >>
> > >> >> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
> > >> >>
> > >> >> Regards,
> > >> >> Akira
> > >> >>
> > >> >>
> > >> >> On 3/31/16 08:18, Vinod Kumar Vavilapalli wrote:
> > >> >>>
> > >> >>> Hi all,
> > >> >>>
> > >> >>> Got nudged about 2.7.3. Was previously waiting for 2.6.4 to go out
> > >> >>> (which did go out mid February). Got a little busy since.
> > >> >>>
> > >> >>> Following up the 2.7.2 maintenance release, we should work towards a
> > >> >>> 2.7.3. The focus obviously is to have blocker issues [1], bug-fixes
> > >> >>> and *no* features / improvements.
> > >> >>>
> > >> >>> I hope to cut an RC in a week - giving enough time for outstanding
> > >> >>> blocker / critical issues. Will start moving out any tickets that are
> > >> >>> not blockers and/or won’t fit the timeline - there are 3 blockers and
> > >> >>> 15 critical tickets outstanding as of now.
> > >> >>>
> > >> >>> Thanks,
> > >> >>> +Vinod
> > >> >>>
> > >> >>> [1] 2.7.3 release blockers:
> > >> >>> https://issues.apache.org/jira/issues/?filter=12335343
> > >> >>>
> > >> >>
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > busbey
> > >>
> > >>
> > >>
> > >> --
> > >> busbey
> > >>
> > >
> > >
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
>
