Would it be acceptable if retooling of tools/ were taken up separately? It sounds to me like this might be a distinct (albeit related) task.
Mithun

________________________________
From: Giridharan Kesavan <gkesa...@hortonworks.com>
To: mapreduce-dev@hadoop.apache.org
Sent: Friday, August 26, 2011 12:04 PM
Subject: Re: DistCpV2 in 0.23

+1 to Alejandro's

I prefer to keep the hadoop-tools at trunk level.

-Giri

On Thu, Aug 25, 2011 at 9:15 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> I'd suggest putting hadoop-tools either at trunk/ level or having a tools
> aggregator module for hdfs and another for common.
>
> I personally would prefer at trunk/.
>
> Thanks.
>
> Alejandro
>
> On Thu, Aug 25, 2011 at 9:06 PM, Amareshwari Sri Ramadasu <amar...@yahoo-inc.com> wrote:
>
>> Agree. It should be separate maven module (and patch puts it as separate
>> maven module now). And top level for hadoop tools is nice to have, but it
>> becomes hard to maintain until patch automation tests run the tests under
>> tools. Currently we see many times the changes in HDFS affecting RAID tests
>> in MapReduce. So, I'm fine putting the tools under hadoop-mapreduce.
>>
>> I propose we can have something like the following:
>>
>> trunk/
>> - hadoop-mapreduce
>>     - hadoop-mr-client
>>     - hadoop-yarn
>>     - hadoop-tools
>>         - hadoop-streaming
>>         - hadoop-archives
>>         - hadoop-distcp
>>
>> Thoughts?
>>
>> @Eli and @JD, we did not replace old legacy distcp because this is really a
>> complete rewrite and did not want to remove it until users are familiarized
>> with new one.
>>
>> On 8/26/11 12:51 AM, "Todd Lipcon" <t...@cloudera.com> wrote:
>>
>> Maybe a separate toplevel for hadoop-tools? Stuff like RAID could go
>> in there as well - ie tools that are downstream of MR and/or HDFS.
>>
>> On Thu, Aug 25, 2011 at 12:09 PM, Mahadev Konar <maha...@hortonworks.com> wrote:
>> > +1 for a separate module in hadoop-mapreduce-project. I think
>> > hadoop-mapreduce-client might not be right place for it. We might have
>> > to pick a new maven module under hadoop-mapreduce-project that could
>> > host streaming/distcp/hadoop archives.
>> >
>> > thanks
>> > mahadev
>> >
>> > On Thu, Aug 25, 2011 at 11:04 AM, Alejandro Abdelnur <t...@cloudera.com> wrote:
>> >> Agree, it should be a separate maven module.
>> >>
>> >> And it should be under hadoop-mapreduce-client, right?
>> >>
>> >> And now that we are on the topic, the same should go for streaming, no?
>> >>
>> >> Thanks.
>> >>
>> >> Alejandro
>> >>
>> >> On Thu, Aug 25, 2011 at 10:58 AM, Todd Lipcon <t...@cloudera.com> wrote:
>> >>
>> >>> On Thu, Aug 25, 2011 at 10:36 AM, Eli Collins <e...@cloudera.com> wrote:
>> >>> > Nice work! I definitely think this should go in 23 and 20x.
>> >>> >
>> >>> > Agree with JD that it should be in the core code, not contrib. If
>> >>> > it's going to be maintained then we should put it in the core code.
>> >>>
>> >>> Now that we're all mavenized, though, a separate maven module and
>> >>> artifact does make sense IMO - ie "hadoop jar
>> >>> hadoop-distcp-0.23.0-SNAPSHOT" rather than "hadoop distcp"
>> >>>
>> >>> -Todd
>> >>> --
>> >>> Todd Lipcon
>> >>> Software Engineer, Cloudera
>> >>>
>> >>
>> >
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>
--
-Giri