Thichphuoctien

On Mar 9, 2015 3:35 PM, "Andrew Wang" <andrew.w...@cloudera.com> wrote:
> Hi Mayank,
>
> Note that Hadoop 3 does not mean the end of updates for Hadoop 2.x, which
> will keep supporting JDK7 for a while yet. Someone on the original thread
> also proposed keeping Hadoop 3 JDK7-source compatible to make backports to
> 2.x easier. I support this.
>
> Note also that the jump from Hadoop 1 to Hadoop 2 (which is what I assume
> was your previous migration) is a far, far more impactful change than what
> is being proposed for Hadoop 3. Hadoop 3 will look basically like a 2.x
> release except for the JDK8 bump and classpath isolation. The intent is to
> otherwise maintain wire and API compatibility.
>
> Overall your timeline sounds like it fits the schedule I proposed. If we
> release a 3.0 GA this year, it means you can upgrade to a baked 3.1 or 3.2
> next year. Seems like a sound upgrade procedure for a large cluster.
>
> Best,
> Andrew
>
> On Mon, Mar 9, 2015 at 2:24 PM, Mayank Bansal <maban...@gmail.com> wrote:
>
> > Hi Guys,
> >
> > From my perspective at eBay, we are not going to upgrade to JDK 8 any
> > time soon. We just upgraded to 7 and do not want to move further, at
> > least this year, so I request that you not drop support for JDK 7, as
> > that would be crucial for us to move forward.
> >
> > We also just completed our Hadoop 2 migration for all clusters this year,
> > which we started early last year, so I don't think we can do another
> > major upgrade this year. Stabilizing major releases takes a lot of effort
> > and time; I think Hadoop 3.x makes sense, at least for us, next year.
> >
> > Thanks,
> >
> > Mayank
> >
> > On Mon, Mar 9, 2015 at 12:29 AM, Arun Murthy <a...@hortonworks.com> wrote:
> >
> > > Over the last few days, we have had lots of discussions that have
> > > intertwined several major themes:
> > >
> > > # When/why do we make major Hadoop releases?
> > >
> > > # When/how do we move to major JDK versions?
> > >
> > > # To a lesser extent, we have debated another theme: what do we do
> > > about trunk?
> > >
> > > For now, let's park JDK & trunk and treat them in separate thread(s).
> > >
> > > For a while now, I've had a couple of lampposts in my head which I used
> > > for guidance - apologies for not sharing this broadly prior to this
> > > discussion, maybe putting it out here will help - certainly hope so.
> > >
> > >
> > > Major Releases
> > >
> > > Hadoop continues to benefit tremendously from the investment in
> > > stability, validation etc. put in by its *anchor* users: Yahoo,
> > > Facebook, Twitter, eBay, LinkedIn etc.
> > >
> > > A historical perspective...
> > >
> > > In its lifetime, Apache Hadoop went from monthly to quarterly releases
> > > because, as Hadoop became more and more of a production system
> > > (starting with hadoop-0.16 and more so with hadoop-0.18), users could
> > > not absorb the torrid pace of change.
> > >
> > > IMHO, we didn't go far enough in addressing the competing pressures of
> > > stability v/s rapid innovation. We paid for it by losing one of our
> > > anchor users - Facebook - around the time of hadoop-0.19 - they just
> > > forked.
> > >
> > > Around the same time, Yahoo hit the same problem (I know, I lived
> > > through it painfully) and got stuck with hadoop-0.20 for a *very* long
> > > time and forked to add Security rather than deal with the next major
> > > release (hadoop-0.21). Later on, Facebook did the same, and,
> > > unfortunately for the community, is stuck - probably forever - on their
> > > fork of hadoop-0.20.
> > >
> > > Overall, these were dark days for the community: every anchor user was
> > > on their own fork, and it took a toll on the project.
> > >
> > > Recently, thankfully for Hadoop, we have had a period of relative
> > > stability with hadoop-1.x and hadoop-2.x. Even so, there were close
> > > shaves: Yahoo was on hadoop-0.23 for a *very* long time - in fact, they
> > > are only just now finishing their migration to hadoop-2.x.
> > >
> > > I think the major lessons here are the obvious ones:
> > >
> > > # Compatibility matters
> > >
> > > # Maintaining multiple major releases, in parallel, is a big problem -
> > > it leads to an unproductive, and risky, split in community investment
> > > along different lines.
> > >
> > >
> > > Looking Ahead
> > >
> > > Given the above, here are some thoughts for looking ahead:
> > >
> > > # Be very conservative about major releases - a major benefit
> > > (features) is required to justify the cost. Let's not compel our anchor
> > > users like Yahoo, Twitter, eBay, and LinkedIn to invest in previous
> > > releases rather than the latest one. Let's hear more from them - and
> > > let's be very accommodating to them - for they play a key role in
> > > keeping Hadoop healthy & stable.
> > >
> > > # Be conservative about dropping support for JDKs. In particular, let's
> > > hear from our anchor users on their plans for adopting jdk-1.8.
> > > LinkedIn has already moved to jdk-1.8, which is great for the
> > > validation, but let's wait for the rest of our anchor users to move
> > > before we drop jdk-1.7. We did the same thing with jdk-1.6 - waited for
> > > them to move before we dropped support for jdk-1.6.
> > >
> > > Overall, I'd love to hear more from Twitter, Yahoo, eBay and other
> > > anchor users on their plans for jdk-1.8 specifically, and on their
> > > overall appetite for hadoop-3. Let's not finalize our plans for moving
> > > forward until this input has been considered.
> > >
> > > Thoughts?
> > >
> > > thanks,
> > > Arun
> > >
> > > Unfortunate that they're necessary, but some disclaimers:
> > >
> > > # Before people point out vendor affiliations to lend unnecessary color
> > > to my opinions, let me state that hadoop-2 v/s hadoop-3 is a non-issue
> > > for us. For major HDP versions the key is just compatibility... e.g. we
> > > ship major, but compatible, community releases such as
> > > hive-0.13/hive-0.14 in HDP-2.x/HDP-2.x+1 etc.
> > >
> > > # Also, release management is a similar non-issue - we have already had
> > > several individuals step up in the hadoop-2.x line. Expect more of the
> > > same from folks like Andrew, Karthik, Vinod, Steve etc.
> >
> > --
> > Thanks and Regards,
> > Mayank
> > Cell: 408-718-9370