On Apr 22, 2009, at 7:17 PM, Samprita Hegde wrote:
Hi,
I am trying to see the feasibility of using shared spaces for the
communication of the Task Completion Events in Hadoop. For this I am
trying to replace the InterTracker Protocol with a co-ordination space,
so that one thread in a TaskTracker puts the MapTask Completion Events
onto the space and another thread receives these events and sends them
to the Reduce Tasks launched by that TaskTracker. Even the JobTracker
can subscribe to these events to make decisions regarding scheduling,
restarting sluggish tasks, etc.
It's an interesting experiment, please keep us posted.
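To make the put/subscribe idea above concrete, here is a minimal
single-process sketch in Java. Everything in it is an assumption for
illustration: EventSpace and CompletionEvent are hypothetical names, not
Hadoop's classes and not the API of the co-ordination space mentioned in
this thread. Delivery here is synchronous on the putting thread; a real
space would hand events to a separate receiving thread or a remote node.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a coordination space (illustration only).
final class EventSpace {

    // Hypothetical event: which map attempt finished and on which tracker.
    static final class CompletionEvent {
        final String attemptId;
        final String trackerHost;

        CompletionEvent(String attemptId, String trackerHost) {
            this.attemptId = attemptId;
            this.trackerHost = trackerHost;
        }
    }

    // Thread-safe list so subscribers can register while events are being put.
    private final List<Consumer<CompletionEvent>> subscribers =
            new CopyOnWriteArrayList<>();

    // A reduce-side thread or the JobTracker registers interest here.
    void subscribe(Consumer<CompletionEvent> subscriber) {
        subscribers.add(subscriber);
    }

    // A TaskTracker thread publishes a completion event to every subscriber.
    void put(CompletionEvent event) {
        for (Consumer<CompletionEvent> subscriber : subscribers) {
            subscriber.accept(event);
        }
    }
}
```

In this sketch both a reduce-side listener and a scheduler-side listener
would call subscribe() once and then see every event passed to put().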
Currently all the information seems to be sent via the heartbeat
message in the InterTracker Protocol. Is there a way I can decouple
only some part of the heartbeat message and put it onto the space
(especially the TaskCompletionEvent and TaskStatus)? That way the task
completion events can be exchanged directly among the TaskTrackers and
not through the JobTracker.
You would need to fix the TaskTrackers to update the shared space
rather than send the events to the JobTracker.
One pertinent point to remember is that you do need some global
arbitration, e.g. for deciding which among the concurrently successful
speculative tasks is to be declared 'successful'.
Arun
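The arbitration point above can be sketched as a first-writer-wins
claim. This is only an illustration under stated assumptions:
SpeculativeArbiter is a hypothetical class, not part of Hadoop, and it
runs in a single JVM. A distributed space would need to offer an
equivalent atomic operation, and whether a given space does is something
to check rather than assume.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical arbiter: the first attempt to report success for a task wins.
final class SpeculativeArbiter {

    // taskId -> attemptId that was declared successful.
    private final ConcurrentMap<String, String> winners = new ConcurrentHashMap<>();

    // Returns true if attemptId is the declared winner for taskId.
    // putIfAbsent is atomic, so two concurrent claims cannot both win.
    boolean claimSuccess(String taskId, String attemptId) {
        String previous = winners.putIfAbsent(taskId, attemptId);
        return previous == null || previous.equals(attemptId);
    }

    // Returns the winning attempt for taskId, or null if none has claimed yet.
    String winnerOf(String taskId) {
        return winners.get(taskId);
    }
}
```

A losing speculative attempt gets false back from claimSuccess() and
would then be treated as killed rather than successful.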
I am not sure if this strategy is good for large-scale MapReduce
applications, but it might work well for small-scale MapReduce jobs.
More information on the co-ordination space that I want to use can be
found here: http://www.caip.rutgers.edu/~zhljenny/comet.htm
I am still going through Hadoop's code and trying to understand the
various protocols between the processes. If you have any good
documentation regarding Hadoop's architecture, it would be really
helpful for me.
Thanks a lot in advance,
Samprita Hegde