+1 This will allow Hadoop to better compete with GoDaddy's "Hadoop Killer" skunkworks project.
On Fri, Apr 1, 2011 at 11:26 AM, Nigel Daley <[email protected]> wrote:

> -1+2. This could potentially allow us to replace Jenkins with Hadoop for
> our build and test infrastructure. That would be awesome!
>
> n.
>
> On Apr 1, 2011, at 1:57 AM, Chris Douglas wrote:
>
> > Experience developing Hadoop has shown that we not only need to
> > partition our projects for more active releases, but we also should
> > explore speculative project splits. For this, a Hadoop.next() project
> > should track the development of a project scheduler that can partition
> > the Hadoop subprojects, possibly running a second version of a
> > subproject in parallel. Downstream subprojects and TLPs automatically
> > accept whichever releases first as a dependency. Implementation should
> > combine ant, ivy, maven, and at least one legacy Hadoop build tool (to
> > be written).
> >
> > Of course, not all of these subprojects will succeed. When one fails
> > (or is too slow with its project reports), the project scheduler will
> > be responsible for respawning it in the Incubator.
> >
> > The project scheduler will, of course, be pluggable. -C
> >
> > On Fri, Apr 1, 2011 at 1:19 AM, Aaron T. Myers <[email protected]> wrote:
> >> Hello Hadoop Community,
> >>
> >> Given the tremendous positive feedback we've all had regarding the
> >> HDFS, MapReduce, and Common project split, I'd like to propose we take
> >> the next step and further separate the existing projects.
> >>
> >> I propose we begin by splitting the MapReduce project into separate
> >> "Map" and "Reduce" sub-projects. This will provide us the opportunity
> >> to tease out the complex interdependencies between "map" and "reduce"
> >> that exist today, to encourage us to write more modular and isolated
> >> code, which should speed releases. This will also aid our users who
> >> exclusively run map-only or reduce-only jobs. These are important
> >> use-cases, and so should be given high priority.
> >>
> >> Given that these two portions of the existing MapReduce project share
> >> a great deal of code, we will likely need to release these two new
> >> projects concurrently at first, but the eventual goal should certainly
> >> be to be able to release "Map" and "Reduce" independently. This seems
> >> intuitive to me, given the remarkable recent advancements in the
> >> academic community regarding "reduce," while the research coming out
> >> of the "map" academics has largely stagnated of late.
> >>
> >> If this proposal is accepted, and it has the success I think it will,
> >> then we should strongly consider splitting the other two projects as
> >> well. My gut instinct is that we should split "HDFS" into "HD" and
> >> "FS" sub-projects, and simply rename the "Common" project to "C'Mon."
> >> We can think about the details of what exactly these project splits
> >> mean later.
> >>
> >> Please let me know what you think.
> >>
> >> Best,
> >> Aaron
