Please don't add more Mavenization work on us (eventually I want to go back to coding).
Given that Hadoop is already Mavenized, the patch should be Mavenized too.

The extra work (besides Mavenizing distcp) is to create a hadoop-tools module at root level and, within it, a hadoop-distcp module.

The hadoop-tools POM will look pretty much like the hadoop-common-project POM. The hadoop-distcp POM should follow the hadoop-common POM patterns.

Thanks.

Alejandro

On Fri, Aug 26, 2011 at 9:37 AM, Amareshwari Sri Ramadasu <amar...@yahoo-inc.com> wrote:

> Agree with Mithun and Robert. DistCp and Tools restructuring are separate
> tasks. Since DistCp code is ready to be committed, it need not wait for the
> Tools separation from MR/HDFS.
> I would say it can go into contrib as the patch is now, and when the tools
> restructuring happens it would be just an svn mv. If there are no issues
> with this proposal I can commit the code tomorrow.
>
> Thanks
> Amareshwari
>
> On 8/26/11 7:45 PM, "Robert Evans" <ev...@yahoo-inc.com> wrote:
>
> I agree with Mithun. They are related but this goes beyond distcpv2 and
> should not block distcpv2 from going in. It would be very nice, however, to
> get the layout settled soon so that we all know where to find something when
> we want to work on it.
>
> Also +1 for Alejandro's. I also prefer to keep tools at the trunk level.
>
> Even though HDFS, Common, and Mapreduce and perhaps soon tools are separate
> modules right now, there is still tight coupling between the different
> pieces, especially with tests. IMO until we can reduce that coupling we
> should treat building and testing Hadoop as a single project instead of
> trying to keep them separate.
>
> --Bobby
>
> On 8/26/11 7:45 AM, "Mithun Radhakrishnan" <mithun.radhakrish...@yahoo.com>
> wrote:
>
> Would it be acceptable if retooling of tools/ were taken up separately? It
> sounds to me like this might be a distinct (albeit related) task.
>
> Mithun
>
> ________________________________
> From: Giridharan Kesavan <gkesa...@hortonworks.com>
> To: mapreduce-dev@hadoop.apache.org
> Sent: Friday, August 26, 2011 12:04 PM
> Subject: Re: DistCpV2 in 0.23
>
> +1 to Alejandro's
>
> I prefer to keep the hadoop-tools at trunk level.
>
> -Giri
>
> On Thu, Aug 25, 2011 at 9:15 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> > I'd suggest putting hadoop-tools either at trunk/ level or having a tools
> > aggregator module for hdfs and another for common.
> >
> > I personally would prefer at trunk/.
> >
> > Thanks.
> >
> > Alejandro
> >
> > On Thu, Aug 25, 2011 at 9:06 PM, Amareshwari Sri Ramadasu <amar...@yahoo-inc.com> wrote:
> >
> >> Agree. It should be a separate maven module (and the patch puts it as a
> >> separate maven module now). And top level for hadoop tools is nice to
> >> have, but it becomes hard to maintain until patch automation tests run
> >> the tests under tools. Currently we see many times the changes in HDFS
> >> affecting RAID tests in MapReduce. So, I'm fine putting the tools under
> >> hadoop-mapreduce.
> >>
> >> I propose we can have something like the following:
> >>
> >> trunk/
> >>   - hadoop-mapreduce
> >>       - hadoop-mr-client
> >>       - hadoop-yarn
> >>       - hadoop-tools
> >>           - hadoop-streaming
> >>           - hadoop-archives
> >>           - hadoop-distcp
> >>
> >> Thoughts?
> >>
> >> @Eli and @JD, we did not replace the old legacy distcp because this is
> >> really a complete rewrite and we did not want to remove it until users
> >> are familiarized with the new one.
> >>
> >> On 8/26/11 12:51 AM, "Todd Lipcon" <t...@cloudera.com> wrote:
> >>
> >> Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go
> >> in there as well - ie tools that are downstream of MR and/or HDFS.
> >>
> >> On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <maha...@hortonworks.com> wrote:
> >> > +1 for a separate module in hadoop-mapreduce-project. I think
> >> > hadoop-mapreduce-client might not be the right place for it.
> >> > We might have to pick a new maven module under
> >> > hadoop-mapreduce-project that could host
> >> > streaming/distcp/hadoop archives.
> >> >
> >> > thanks
> >> > mahadev
> >> >
> >> > On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> >> >> Agree, it should be a separate maven module.
> >> >>
> >> >> And it should be under hadoop-mapreduce-client, right?
> >> >>
> >> >> And now that we are on the topic, the same should go for streaming, no?
> >> >>
> >> >> Thanks.
> >> >>
> >> >> Alejandro
> >> >>
> >> >> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <t...@cloudera.com> wrote:
> >> >>
> >> >>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <e...@cloudera.com> wrote:
> >> >>> > Nice work! I definitely think this should go in 23 and 20x.
> >> >>> >
> >> >>> > Agree with JD that it should be in the core code, not contrib. If
> >> >>> > it's going to be maintained then we should put it in the core code.
> >> >>>
> >> >>> Now that we're all mavenized, though, a separate maven module and
> >> >>> artifact does make sense IMO - ie "hadoop jar
> >> >>> hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp"
> >> >>>
> >> >>> -Todd
> >> >>> --
> >> >>> Todd Lipcon
> >> >>> Software Engineer, Cloudera
> >>
> >> --
> >> Todd Lipcon
> >> Software Engineer, Cloudera
>
> --
> -Giri
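
For illustration only, here is a rough sketch of what the hadoop-tools aggregator POM described at the top of this message might look like. It is not taken from the actual patch. The parent artifact (hadoop-project), its relativePath, the version string, and the <name> value are assumptions made for the example; the real values should be copied from the hadoop-common-project POM.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical hadoop-tools/pom.xml: an aggregator with "pom" packaging
     that builds nothing itself and only lists its child modules. -->
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <!-- Assumption: parent coordinates, version and relativePath are
       placeholders; mirror whatever hadoop-common-project uses. -->
  <parent>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-project</artifactId>
    <version>0.23.0-SNAPSHOT</version>
    <relativePath>../hadoop-project</relativePath>
  </parent>

  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-tools</artifactId>
  <version>0.23.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <!-- Assumption: display name chosen for the example. -->
  <name>Apache Hadoop Tools</name>

  <!-- Child modules; streaming and archives could be added here later
       if the tools restructuring discussed in this thread happens. -->
  <modules>
    <module>hadoop-distcp</module>
  </modules>
</project>
```

With a layout like this, the root POM would also need a <module>hadoop-tools</module> entry so that the new module is picked up by the top-level build, and hadoop-distcp/pom.xml would be an ordinary jar module following the hadoop-common POM.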