Hi everyone,

I have been fleshing out a slightly more detailed plan for the upcoming weeks, and would like to share it with the community and get some feedback.

For the first iteration I am planning to focus on MapReduce (MR) applications implemented in Java. These apps interface directly with Hadoop's MR framework[1], as opposed to going through Hadoop Streaming[2] or Pipes[3].

I think the first priority should be to get a basic MR data flow working, and the three necessary entities of a basic MR application seem to be the Mapper, the Reducer, and a job configuration. I am planning on getting the functionality for these three parts implemented first.
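
To make the three pieces concrete, here is a minimal in-memory sketch of that data flow, using word count as the example. It is only an illustration under stated assumptions: the Mapper, Reducer, and JobConfig types below are hypothetical stand-ins that I made up for this mail. They are not Hadoop's org.apache.hadoop.mapred interfaces and not a proposed Tuscany API, and everything runs in a single process with no cluster involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Illustrative only: every type here is hypothetical. None of them is
// Hadoop's org.apache.hadoop.mapred API or a Tuscany implementation type.
public class MiniMapReduce {

    /** Hypothetical mapper: turns one input record into zero or more (key, value) pairs. */
    interface Mapper<I, K, V> {
        void map(I record, BiConsumer<K, V> emit);
    }

    /** Hypothetical reducer: folds all values collected for one key into a single result. */
    interface Reducer<K, V, R> {
        R reduce(K key, List<V> values);
    }

    /** Hypothetical job configuration: names the mapper and reducer to wire together. */
    static final class JobConfig<I, K extends Comparable<K>, V, R> {
        final Mapper<I, K, V> mapper;
        final Reducer<K, V, R> reducer;

        JobConfig(Mapper<I, K, V> mapper, Reducer<K, V, R> reducer) {
            this.mapper = mapper;
            this.reducer = reducer;
        }
    }

    /** Runs map, groups the emitted values by key, then runs reduce once per key. */
    static <I, K extends Comparable<K>, V, R> Map<K, R> run(JobConfig<I, K, V, R> job, List<I> input) {
        // Map phase plus grouping: collect every emitted value under its key.
        Map<K, List<V>> grouped = new TreeMap<>();
        for (I record : input) {
            job.mapper.map(record, (k, v) -> grouped.computeIfAbsent(k, x -> new ArrayList<>()).add(v));
        }
        // Reduce phase: one call per distinct key, in sorted key order.
        Map<K, R> output = new TreeMap<>();
        for (Map.Entry<K, List<V>> entry : grouped.entrySet()) {
            output.put(entry.getKey(), job.reducer.reduce(entry.getKey(), entry.getValue()));
        }
        return output;
    }

    /** Word count expressed with the three pieces above. */
    public static Map<String, Integer> wordCount(List<String> lines) {
        Mapper<String, String, Integer> mapper = (line, emit) -> {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit.accept(word, 1);
                }
            }
        };
        Reducer<String, Integer, Integer> reducer = (word, counts) -> {
            int sum = 0;
            for (int count : counts) {
                sum += count;
            }
            return sum;
        };
        JobConfig<String, String, Integer, Integer> job = new JobConfig<>(mapper, reducer);
        return run(job, lines);
    }

    public static void main(String[] args) {
        // prints {a=2, b=2, c=1}
        System.out.println(wordCount(Arrays.asList("a b a", "b c")));
    }
}
```

In this sketch the job configuration does nothing more than pair a mapper with a reducer. In a real Hadoop job it would also carry input and output locations and formats, which is part of why I see it as belonging to a separate layer.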

Following the original design in the proposal, I am planning to treat the Mapper and the Reducer as implementation types, and the job configuration as part of a management layer in charge of the assembly and deployment of MR applications. Initially the management layer would be responsible for configuring MR jobs and integrating with Hadoop's MR framework. The longer-term goal is to extend it into something closer to what was described by Robert Donkin[4] and Jean-Sebastian (referred to as item 3)[5]. At that point, the layer could be used to manage the deployment of components over a Hadoop cluster itself.

For the Mapper and Reducer, in the next couple of weeks I would like to outline the definition of these types and hopefully start implementing them.

For the management layer, I could use some guidance on how to best fit it into the Tuscany architectural framework.

Thoughts/Suggestions on any part of my plans are always greatly appreciated.

Congrats to everyone on graduation!

Thanks,
Chris Trezzo

[1] http://hadoop.apache.org/core/docs/r0.15.3/api/org/apache/hadoop/mapred/package-summary.html
[2] http://hadoop.apache.org/core/docs/current/streaming.html
[3] http://hadoop.apache.org/core/docs/r0.15.3/api/org/apache/hadoop/mapred/pipes/package-summary.html
[4] http://www.mail-archive.com/tuscany-dev@ws.apache.org/msg29711.html
[5] http://www.mail-archive.com/tuscany-dev@ws.apache.org/msg29720.html