I like the idea of having tools as a separate module, and I don't think it will be a dumping ground unless we choose to make it one.
+1 for hadoop tools module under trunk.

thanks
mahadev

On Wed, Sep 7, 2011 at 11:18 AM, Alejandro Abdelnur <t...@cloudera.com> wrote:
> Agreed, we should not have a dumping ground. IMO, what would go into
> hadoop-tools (i.e. distcp, streaming and someone could argue for FsShell as
> well) are effectively hadoop CLI utilities. Having them in a separate module
> rather than in the core modules (common, hdfs, mapreduce) does not mean
> that they are secondary things, just modularization. Also it will help to
> get those tools to use public interfaces of the core modules, and when we
> finally have a clean hadoop-client layer, those tools should only depend on
> that.
>
> Finally, the fact that tools would end up under trunk/hadoop-tools does
> not prevent the packaging from HDFS and MAPREDUCE from bundling the
> same/different tools.
>
> +1 for hadoop-tools/ (not binding)
>
> Thanks.
>
>
> On Wed, Sep 7, 2011 at 10:50 AM, Eric Yang <eric...@gmail.com> wrote:
>
>> Mapreduce and HDFS are distinct functions of Hadoop. They are loosely
>> coupled. If we have a tools aggregator module, it will not have as
>> clear and distinct a function as other Hadoop modules. Hence, it is
>> possible for a tool to depend on both HDFS and map reduce. If
>> something breaks in the tools module, it is unclear which subproject is
>> responsible for maintaining the tools function. Therefore, it is safer to
>> send tools to incubator or apache extra rather than deposit the
>> utility tools in a tools subcategory. There are many short-lived
>> projects that attempt to associate themselves with Hadoop but are not
>> being maintained. It would be better to spin off those utility
>> projects than use Hadoop as a dumping ground.
>>
>> In the previous discussion about removing contrib, most people were in
>> favor of doing so, and only a few contrib owners were reluctant to
>> remove contrib. Fewer people have participated in restoring the
>> functionality of broken contrib projects. History speaks for itself.
>> -1 (non-binding) for hadoop-tools.
>>
>> regards,
>> Eric
>>
>> On Tue, Sep 6, 2011 at 6:55 PM, Alejandro Abdelnur <t...@cloudera.com> wrote:
>> > Eric,
>> >
>> > Personally I'm fine either way.
>> >
>> > Still, I fail to see why a generic/categorized tools module increases/reduces
>> > the risk of dead code and how it makes packaging and deployment
>> > more difficult/easier.
>> >
>> > Would you please explain this?
>> >
>> > Thanks.
>> >
>> > Alejandro
>> >
>> > On Tue, Sep 6, 2011 at 6:38 PM, Eric Yang <eric...@gmail.com> wrote:
>> >
>> >> Option #2, proposed by Amareshwari, seems like a better proposal. We don't
>> >> want to repeat the history of contrib again with hadoop-tools. Having a
>> >> generic module like hadoop-tools increases the risk of accumulating dead code.
>> >> It would be better to categorize the hdfs or mapreduce specific tools in
>> >> their respective subcategories. It is also easier to manage from a
>> >> package/deployment perspective.
>> >>
>> >> regards,
>> >> Eric
>> >>
>> >> On Sep 6, 2011, at 4:32 PM, Eli Collins wrote:
>> >>
>> >> > On Tue, Sep 6, 2011 at 10:11 AM, Allen Wittenauer <a...@apache.org> wrote:
>> >> >>
>> >> >> On Sep 6, 2011, at 9:30 AM, Vinod Kumar Vavilapalli wrote:
>> >> >>> We still need to answer Amareshwari's question (2) she asked some time back
>> >> >>> about the automated code compilation and test execution of the tools module.
>> >> >>
>> >> >>
>> >> >>>>> My #1 question is if tools is basically contrib reborn. If not, what makes
>> >> >>>>> it different?
>> >> >>
>> >> >>
>> >> >> I'm still waiting for this answer as well.
>> >> >>
>> >> >> Until then, I would be pretty much against a tools module.
>> >> >> Changing the name of the dumping ground doesn't make it any less of a
>> >> >> dumping ground.
>> >> >
>> >> > IMO if the tools module only gets stuff like distcp that's maintained
>> >> > then it's not contrib, if it contains all the stuff from the current
>> >> > MR contrib then tools is just a re-labeling of contrib. Given that
>> >> > this proposal only covers moving distcp to tools it doesn't sound like
>> >> > contrib to me.
>> >> >
>> >> > Thanks,
>> >> > Eli
>> >>
>> >>
>> >
>> >