Hi everyone,
I have been fleshing out a slightly more detailed plan for the
upcoming weeks, and would like to share it with the community and get
some feedback.
For the first iteration I am planning to focus on MapReduce (MR)
applications implemented in Java. These apps interface directly with
Hadoop's MR framework[1], as opposed to going through Hadoop
Streaming[2] or Pipes[3].
I think the first priority should be to get a basic MR data flow
working, and the three necessary entities of a basic MR application
seem to be the Mapper, the Reducer, and a job configuration. I am
planning on getting the functionality for these three parts
implemented first.
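To make the three parts concrete, here is a rough plain-Java sketch of
the data flow I have in mind (map, group by key, reduce), using word
count as the example. Note that it deliberately does not use Hadoop's
API: SimpleMapper, SimpleReducer, JobConfig, and run() are all
hypothetical names made up for illustration, and everything runs in
memory in a single JVM.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MapReduceSketch {

    /** Hypothetical stand-in for a Mapper: emits key/value pairs for one input record. */
    interface SimpleMapper {
        void map(String record, List<Map.Entry<String, Integer>> output);
    }

    /** Hypothetical stand-in for a Reducer: folds all values of one key into one result. */
    interface SimpleReducer {
        int reduce(String key, List<Integer> values);
    }

    /** Hypothetical stand-in for a job configuration: names the mapper and reducer to run. */
    static final class JobConfig {
        final SimpleMapper mapper;
        final SimpleReducer reducer;

        JobConfig(SimpleMapper mapper, SimpleReducer reducer) {
            this.mapper = mapper;
            this.reducer = reducer;
        }
    }

    /** Runs the whole flow in memory: map every record, group by key, reduce each group. */
    static Map<String, Integer> run(JobConfig job, List<String> input) {
        // Map phase: collect every emitted key/value pair.
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String record : input) {
            job.mapper.map(record, pairs);
        }

        // Grouping phase: gather values per key (TreeMap keeps keys sorted).
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }

        // Reduce phase: one call per distinct key.
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : grouped.entrySet()) {
            result.put(group.getKey(), job.reducer.reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    /** Word count expressed with the three hypothetical parts above. */
    static JobConfig wordCountJob() {
        SimpleMapper mapper = (record, out) -> {
            for (String word : record.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(new AbstractMap.SimpleEntry<>(word, 1));
                }
            }
        };
        SimpleReducer reducer = (key, values) -> {
            int sum = 0;
            for (int v : values) {
                sum += v;
            }
            return sum;
        };
        return new JobConfig(mapper, reducer);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a b a", "b c");
        System.out.println(run(wordCountJob(), input)); // prints {a=2, b=2, c=1}
    }
}
```

In the real thing, the mapper and reducer would be written against
Hadoop's MR framework[1], and the framework (not a local loop like
run()) would handle the grouping and distribution across the cluster.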
Going along with the original design in the proposal, I am planning to
view the Mapper and the Reducer as implementation types, and the job
configuration as part of a management layer in charge of the assembly
and deployment of MR applications. Initially the management layer
would be responsible for the configuration of MR jobs and the
integration with Hadoop's MR framework, with the overall goal of
eventually extending it into something more along the lines of what
was described by Robert Donkin[4] and Jean-Sebastien (referred to as
item 3)[5]. In that case, the layer could be used to manage the
deployment of components over a Hadoop cluster itself.
For the Mapper and Reducer, in the next couple of weeks I would like
to outline the definition of these types and hopefully start
implementing them.
For the management layer, I could use some guidance on how to best fit
it into the Tuscany architectural framework.
Thoughts and suggestions on any part of this plan are greatly
appreciated.
Congrats to everyone on graduation!
Thanks,
Chris Trezzo
[1] http://hadoop.apache.org/core/docs/r0.15.3/api/org/apache/hadoop/mapred/package-summary.html
[2] http://hadoop.apache.org/core/docs/current/streaming.html
[3] http://hadoop.apache.org/core/docs/r0.15.3/api/org/apache/hadoop/mapred/pipes/package-summary.html
[4] http://www.mail-archive.com/tuscany-dev@ws.apache.org/msg29711.html
[5] http://www.mail-archive.com/tuscany-dev@ws.apache.org/msg29720.html