Saurabh Agarwal wrote:
Hemanth,


Thanks!!
Saurabh Agarwal


On Fri, May 14, 2010 at 9:49 AM, Hemanth Yamijala <yhema...@gmail.com> wrote:

Saurabh,

 Let me reframe my question. I wanted to know how the JobTracker decides the
assignment of input splits to TaskTrackers based on each TaskTracker's data
locality. Where is this policy defined? Is it pluggable?
Sorry, I misunderstood your question then. This code is in
o.a.h.mapred.JobInProgress. It is likely spread across many methods in
the class. But a good starting point could be from methods like
obtainNewMapTask or obtainNewReduceTask.
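
To make the idea of a locality-based policy concrete, here is a minimal, self-contained sketch. It is not the code from JobInProgress; the class LocalitySketch, the Split type, the rackOf map, and pickSplit are all hypothetical names invented for illustration. It assumes a simple preference order: a split whose data is on the requesting tracker's host, then one on the same rack, then any pending split.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class LocalitySketch {
    // Hypothetical stand-in for an input split: an id plus the hosts holding its data.
    static final class Split {
        final String id;
        final Set<String> hosts;

        Split(String id, String... hosts) {
            this.id = id;
            this.hosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    // Picks a pending split for the tracker on trackerHost.
    // Preference (an assumption of this sketch): node-local, then rack-local, then any.
    // Returns null when nothing is pending. The pending list is not modified.
    static Split pickSplit(String trackerHost, Map<String, String> rackOf, List<Split> pending) {
        String trackerRack = rackOf.get(trackerHost);
        Split rackLocal = null;
        for (Split s : pending) {
            if (s.hosts.contains(trackerHost)) {
                return s; // data is on the tracker's own host
            }
            if (rackLocal == null && trackerRack != null) {
                for (String h : s.hosts) {
                    if (trackerRack.equals(rackOf.get(h))) {
                        rackLocal = s; // remember the first same-rack candidate
                        break;
                    }
                }
            }
        }
        if (rackLocal != null) {
            return rackLocal;
        }
        // No local candidate: fall back to the first pending split, if any.
        return pending.isEmpty() ? null : pending.get(0);
    }
}
```

A pluggable version would presumably put something like pickSplit behind an interface, so a job could substitute a different preference order.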

At the moment, this policy is not pluggable. But I know there have
been discussions (possibly even a JIRA, though I can't locate any now)
asking for this capability.


+1 to having some plugin interface in 0.22+ to give you control.

My former colleague Russ Perry did some rendering with Hadoop where he wanted the work done not where the input data was, but where the output data was needed. There was no way to do this:
http://www.hpl.hp.com/techreports/2009/HPL-2009-345.pdf
