I agree with Sandesh on the following: the official branch from which releases are cut should continue to take EOL into consideration. However, we also need to be prepared for future releases of Hadoop.
--prad

On Wed, Jul 20, 2016 at 10:43 AM, Sandesh Hegde <[email protected]> wrote:

> @Amol
>
> EOL is important for master branch. To start the work on next version of
> Hadoop on different branch ( let us call that master++ ), we should not
> worry about the EOL. Eventually, master++ becomes master and the master++
> will continue on the later version of the Hadoop.
>
> On Wed, Jul 20, 2016 at 10:30 AM Siyuan Hua <[email protected]> wrote:
>
> > Ok, whether branches or forks. I still think we should have at least some
> > materialized version of malhar/core for the big influencer like java,
> > hadoop or even kafka. Java 8, for example, is actually not new. We don't
> > have to be aggressive to try out new features from those right now. But we
> > can at least have some CI run build/test periodically and make sure our
> > current code is future-proof and avoid some future-deprecated code when we
> > add new features. Also if people ask for it, we can have a link to point
> > them to. BTW, High-level API can definitely benefit from java 8. :)
> >
> > Regards,
> > Siyuan
> >
> > On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde <[email protected]> wrote:
> >
> > > Our current model of supporting the oldest supported Hadoop, penalizes the
> > > users of latest Hadoop versions by favoring the slow movers.
> > > Also, we won't benefit from the increased maturity of the Hadoop platform,
> > > as we will be working on the many years old version of Hadoop.
> > > We also need to incentivize our customers to upgrade their Hadoop version,
> > > by making use of new features.
> > >
> > > My vote goes to start the work on the Hadoop 2.6 ( or any other version )
> > > in a different branch, without waiting for the EOL policies.
> > >
> > > On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise <[email protected]> wrote:
> > >
> > > > -0
> > > >
> > > > I read the thread twice, it is not clear to me what benefit Apex users
> > > > derive from this exercise.
> > > > A branch normally contains development work that
> > > > is eventually brought back to the main line and into a release. Here, the
> > > > suggestion seems to be an open ended effort to play with latest tech, isn't
> > > > that something anyone (including a group of folks) can do in a fork? I
> > > > don't see value in a permanent branch for that, who is going to maintain
> > > > such code and who will ever use it?
> > > >
> > > > There was a point that we can find out about potential problems with later
> > > > versions. The way to find such issues is to take the releases and run them
> > > > on these later versions (that's what users do), not by changing the code!
> > > >
> > > > Regarding Java version: Our users don't use Apex in a vacuum. Please have a
> > > > look at ASF Hadoop and the distros EOL policies. That will answer the
> > > > question what Java version is appropriate. I would be surprised if
> > > > something that works on Java 7 falls flat on the face with Java 8 as a lot
> > > > of diligence goes into backward compatibility. Again the way to test this
> > > > is to run verification with existing Apex releases on Java 8 based stack.
> > > >
> > > > Regarding Hadoop version: This has been discussed off record several times
> > > > and there are actual JIRA tickets marked accordingly so that the work is
> > > > done when we move. It is a separate discussion, no need to mix Java
> > > > versions and branching with it. I agree with what David said, if someone
> > > > can show that we can move up to 2.6 based on EOL policies and what known
> > > > Apex users have in production, then we should work on that upgrade.
> > > > The way
> > > > I imagine it would work is that we have a Hadoop-2.6 (or whatever version)
> > > > branch, make all the upgrade related changes there (which should be a list
> > > > of JIRAs) and then merge it back to master when we are satisfied. After
> > > > that, the branch can be deleted.
> > > >
> > > > Thomas
> > > >
> > > > On Tue, Jul 12, 2016 at 8:36 AM, Chinmay Kolhatkar <[email protected]> wrote:
> > > >
> > > > > I'm -0 on this idea.
> > > > >
> > > > > Here is the reason:
> > > > > Unless we see a real case where users want to see everything on latest,
> > > > > this branch might quickly become low hanging fruit and eventually get
> > > > > obsolete because it's anyway a "no guarantee" branch.
> > > > >
> > > > > We have a bunch of dependencies which we'll have to take care of to really
> > > > > make it bleeding edge. Especially for Malhar, it's a long list. That looks
> > > > > like quite significant work.
> > > > > Moreover, if this branch is going to be in "may or may not work" state; I,
> > > > > as a user or developer, would bank on what certainly works.
> > > > >
> > > > > I also think that, if it's going to be "no guarantee" then it's worth
> > > > > spending the time on contributions towards master rather than the
> > > > > bleeding-edge branch.
> > > > >
> > > > > If a question of "should we upgrade?" comes, the community is mature enough
> > > > > to take that call then and work accordingly.
> > > > >
> > > > > -Chinmay.
> > > > >
> > > > > On Tue, Jul 12, 2016 at 11:42 AM, Priyanka Gugale <[email protected]> wrote:
> > > > >
> > > > > > +1 for creating such branch.
> > > > > > One of us will have to rebase it with master branch at intervals. I don't
> > > > > > think everyone will cherry-pick their commits here. We can make it once in
> > > > > > a month activity.
> > > > > > Are we considering updating all dependency library
> > > > > > versions as well?
> > > > > >
> > > > > > -Priyanka
> > > > > >
> > > > > > On Tue, Jul 12, 2016 at 2:34 AM, Munagala Ramanath <[email protected]> wrote:
> > > > > >
> > > > > > > Following up on some comments, wanted to clarify what I have in mind for
> > > > > > > this branch:
> > > > > > >
> > > > > > > 1. The main goal is to stay up-to-date with new releases, so if a question
> > > > > > >    of the form "A new release of X is available, should we upgrade ?"
> > > > > > >    comes up, the answer is *always* an *emphatic* yes; otherwise it
> > > > > > >    doesn't bleed enough (:-) as Sanjay points out.
> > > > > > > 2. Pull requests are submitted as always; there is no requirement to
> > > > > > >    generate an additional pull request against this branch. It may get
> > > > > > >    merged/cherry-picked depending on who has the time and inclination
> > > > > > >    to do it.
> > > > > > > 3. There is no expectation of dedication of any additional resources, so
> > > > > > >    people work on it as and when time is available. ("No guarantee" means
> > > > > > >    exactly that). So there is no question of "maintaining" this branch.
> > > > > > > 4. This branch is not to be encumbered with legacy and/or backward
> > > > > > >    compatibility issues.
> > > > > > > 5. This branch is not an experimental sandbox to try out new algorithms,
> > > > > > >    architectural changes and other such changes.
> > > > > > >
> > > > > > > As always, I'm open to other ideas, but that is what I had in mind when I
> > > > > > > made the suggestion.
> > > > > > >
> > > > > > > Ram
> > > > > > >
> > > > > > > On Mon, Jul 11, 2016 at 1:45 PM, Sanjay Pujare <[email protected]> wrote:
> > > > > > >
> > > > > > > > As the name suggests the "bleeding-edge" branch ideally should use bleeding
> > > > > > > > edge versions so I would like to see Java 8 used there (and Hadoop 3 when
> > > > > > > > it does eventually come out) to make the maintenance effort worthwhile...
> > > > > > > >
> > > > > > > > On Mon, Jul 11, 2016 at 12:05 PM, David Yan <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > I'm -0 on Java 8, but I'm +1 on the rest, and I'm especially strong +1 for
> > > > > > > > > upgrading the Hadoop dependency version.
> > > > > > > > >
> > > > > > > > > Here are my reasons:
> > > > > > > > >
> > > > > > > > > - Hadoop 3 will require Java 8, but Hadoop 2.7.2 still supports Java 7 and
> > > > > > > > > there will probably be some time (I'm guessing more than one year) for
> > > > > > > > > Hadoop 3 to become GA and for major distros to support Hadoop 3. The
> > > > > > > > > maintenance effort for having two branches, one for Java 7 and one for Java
> > > > > > > > > 8 is not worth it at this time.
> > > > > > > > >
> > > > > > > > > - Apex currently uses Hadoop 2.2 dependencies, marked "provided". And
> > > > > > > > > Hadoop 2.4 has been released more than two years ago, and it added a lot of
> > > > > > > > > features in the API that Apex can make use of. Most distros already bundle
> > > > > > > > > Hadoop 2.6 or later.
> > > > > > > > > Although some old versions of Cloudera that include
> > > > > > > > > hadoop version earlier than 2.4 still have not reached end-of-life yet, the
> > > > > > > > > number of users using those old versions is probably very small.
> > > > > > > > >
> > > > > > > > > David
> > > > > > > > >
> > > > > > > > > On Mon, Jul 11, 2016 at 8:59 AM, Munagala Ramanath <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > We've had a number of issues recently related to dependencies on old
> > > > > > > > > > versions of various packages/libraries such as Hadoop itself, Google
> > > > > > > > > > guava, HTTPClient, mbassador, etc.
> > > > > > > > > >
> > > > > > > > > > How about we create a "bleeding-edge" branch in both Core and Malhar which
> > > > > > > > > > will use the latest versions of these various dependencies, upgrade to Java
> > > > > > > > > > 8 so we can use the new Java features, etc. ?
> > > > > > > > > >
> > > > > > > > > > This will give us an opportunity to discover these sorts of problems early
> > > > > > > > > > and, when we are ready to pull the trigger for a major version, we have a
> > > > > > > > > > branch ready for merge with, hopefully, minimal additional effort.
> > > > > > > > > >
> > > > > > > > > > There will be no guarantees w.r.t. this branch so people using it use it at
> > > > > > > > > > their own risk.
> > > > > > > > > >
> > > > > > > > > > Ram
