w.r.t. hadoop-2 release, see this thread: http://search-hadoop.com/m/YSTny19y1Ha1/hadoop+2.2.0
Looks like 2.2.0-beta would pass votes.

Cheers

On Mon, Oct 14, 2013 at 7:24 PM, Mike Drob <[email protected]> wrote:

> Responses Inline.
>
> - Mike
>
> On Mon, Oct 14, 2013 at 12:55 PM, Sean Busbey <[email protected]> wrote:
>
> > Hey All,
> >
> > I'd like to restart the conversation from end July / start August about
> > Hadoop 2 support on the 1.4 branch.
> >
> > Specifically, I'd like to get some requirements ironed out so I can file
> > one or more jiras. I'd also like to get a plan for application.
> >
> > =requirements
> >
> > Here's the requirements I have from the last thread:
> >
> > 1) Maintain existing 1.4 compatibility
> >
> > The only thing I see listed in the pom is Apache release 0.20.203.0.
> > (1.4.4 tag)[1]
> >
> > I don't see anything in the README[2] nor the user manual[3] on other
> > versions being supported.
> >
>
> Yep.
>
> >
> > 2) Gain Hadoop 2 support
> >
> > At the moment, I'm presuming this means Apache release 2.0.4-alpha since
> > that's what 1.5.0 builds against for Hadoop 2.
> >
>
> I haven't been following the Hadoop 2 release schedule that closely, but I
> think the latest is a 2.1.0-beta? Pretty sure it was released after we
> finished Accumulo 1.5, so there's no reason not to support it in my mind.
> Depending on an "alpha" of something strikes me as either unstable or lazy,
> although I fully understand that it may be neither.
>
> >
> > 3) Test for correctness on given versions, with >= 5 node cluster
> >
> > * Unit Tests
> > * Functional Tests
> > * 24hr continuous + verification
> > * 24hr continuous + verification + agitation
> > * 24hr random walk
> > * 24hr random walk + agitation
> >
> > Keith mentioned running these against a CDH4 cluster, but I presume that
> > since Apache Releases are our stated compatibilities it would actually be
> > against whatever versions we list. Based on #1 and #2 above, I would
> > expect that to be Apache Hadoop 0.20.203.0 and Apache Hadoop 2.0.4-alpha.
> >
>
> Hadoop 2 introduces some neat new things like NN HA, which I think it
> might be worthwhile to test with. At that level it might be more of a
> verification of the Hadoop code, but I'd like to be comfortable that our
> DFS Clients switch correctly. This is in addition to the standard release
> suite that we run. [1]
>
> [1]: http://accumulo.apache.org/governance/releasing.html#testing
>
> >
> > 4) Binary packaging
> > 4a) Either source produces a single binary for all accepted versions
> >
> > or
> >
> > 4b) Instructions for building from source for each versions and somehow
> > flag what (if any) convenience binaries are made for the release.
> >
>
> Having run the binary packaging for 1.4.4, I can tell you that it is not in
> great shape. Christopher cleaned up a lot of the issues in the 1.5 line, so
> I didn't bother spending a ton of time on them here, but I think RPM and
> DEB are both broken. It would be nice to be able to specify a Hadoop 2
> version for compilation, similar to what happens in the newer code base,
> which could be back ported, I suppose. 4b seems easier.
>
> > =application
> >
> > There will be many back-ported patches. Not much active development
> > happens on 1.4.x now, but I presume this should still all go onto a
> > feature branch?
> >
> > Is the community preference that eventually all the changes become a
> > single commit (or one-per-subtask if there are multiple jiras) on the
> > active 1.4 development branch, or that the original patches remain
> > broken out?
> >
>
> Not sure what you mean by this.
>
> >
> > For what it's worth, I'd recommend keeping them broken out. (And that's
> > how the initial development against CDH4 has been done.)
> >
> >
> > [1] http://bit.ly/1fxucMe
> > [2] http://bit.ly/192zUAJ
> > [3]
> > http://accumulo.apache.org/1.4/user_manual/Administration.html#Dependencies
> >
> > --
> > Sean
