LOL@Chris!!!

On Apr 1, 2011, at 1:57 AM, Chris Douglas wrote:

> Experience developing Hadoop has shown that we not only need to
> partition our projects for more active releases, but we also should
> explore speculative project splits. For this, a Hadoop.next() project
> should track the development of a project scheduler that can partition
> the Hadoop subprojects, possibly running a second version of a
> subproject in parallel. Downstream subprojects and TLPs automatically
> accept whichever releases first as a dependency. Implementation should
> combine ant, ivy, maven, and at least one legacy Hadoop build tool (to
> be written).
>
> Of course, not all of these subprojects will succeed. When one fails
> (or is too slow with its project reports), the project scheduler will
> be responsible for respawning it in the Incubator.
>
> The project scheduler will, of course, be pluggable. -C
>
> On Fri, Apr 1, 2011 at 1:19 AM, Aaron T. Myers <[email protected]> wrote:
>> Hello Hadoop Community,
>>
>> Given the tremendous positive feedback we've all had regarding the HDFS,
>> MapReduce, and Common project split, I'd like to propose we take the next
>> step and further separate the existing projects.
>>
>> I propose we begin by splitting the MapReduce project into separate "Map"
>> and "Reduce" sub-projects. This will provide us the opportunity to tease out
>> the complex interdependencies between "map" and "reduce" that exist today,
>> to encourage us to write more modular and isolated code, which should speed
>> releases. This will also aid our users who exclusively run map-only or
>> reduce-only jobs. These are important use-cases, and so should be given high
>> priority.
>>
>> Given that these two portions of the existing MapReduce project share a
>> great deal of code, we will likely need to release these two new projects
>> concurrently at first, but the eventual goal should certainly be to be able
>> to release "Map" and "Reduce" independently. This seems intuitive to me,
>> given the remarkable recent advancements in the academic community regarding
>> "reduce," while the research coming out of the "map" academics has largely
>> stagnated of late.
>>
>> If this proposal is accepted, and it has the success I think it will, then
>> we should strongly consider splitting the other two projects as well. My gut
>> instinct is that we should split "HDFS" into "HD" and "FS" sub-projects, and
>> simply rename the "Common" project to "C'Mon." We can think about the
>> details of what exactly these project splits mean later.
>>
>> Please let me know what you think.
>>
>> Best,
>> Aaron
>>

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
