Re: Task allocation to TaskTrackers

Nigel Daley Wed, 14 Feb 2007 14:10:14 -0800

Yup, that's correct. A rack, then, can be defined however you like. One possibility is that a rack is defined by hosts on the same subnet.


Cheers,
Nige
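
To make the "same subnet" idea concrete, here is a minimal sketch that derives a rack name from a host's IPv4 address and a prefix length. The class name SubnetRack, the method rackFor, and the "/<network address>" naming scheme are hypothetical and are not taken from the patch; they only illustrate one way a site might map hosts to racks.

```java
// Hypothetical helper: treats every host in the same IPv4 subnet as one rack.
class SubnetRack {
    // Returns a rack name of the form "/a.b.c.d", where a.b.c.d is the
    // network address of the given host under the given prefix length.
    static String rackFor(String ipv4, int prefixLen) {
        String[] parts = ipv4.split("\\.");
        if (parts.length != 4 || prefixLen < 0 || prefixLen > 32) {
            throw new IllegalArgumentException("bad address or prefix length");
        }
        int addr = 0;
        for (String p : parts) {
            int octet = Integer.parseInt(p);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("bad octet: " + p);
            }
            addr = (addr << 8) | octet;
        }
        // Java masks shift counts to 5 bits, so a /0 prefix needs special handling.
        int mask = (prefixLen == 0) ? 0 : (-1 << (32 - prefixLen));
        int net = addr & mask;
        return "/" + ((net >>> 24) & 0xFF) + "." + ((net >>> 16) & 0xFF)
                + "." + ((net >>> 8) & 0xFF) + "." + (net & 0xFF);
    }
}
```

With a /24 prefix, 10.1.2.37 and 10.1.2.200 land on the same rack, while 10.1.3.5 lands on a different one.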


On Feb 14, 2007, at 1:47 PM, Vasiliy Baranov wrote:

Hi Nigel,

It is so nice to hear from you again!
Thank you for clarifying these. I have a follow-up question. How are racks configured, that is, how does the system know which rack a machine is in? I went through the proposal (https://issues.apache.org/jira/secure/attachment/12345251/Rack_aware_HDFS_proposal.pdf) and patch source code (https://issues.apache.org/jira/secure/attachment/12350262/rack.patch), and it looks like the DataNode is implemented so that it receives its rack's "network location" via the new -r/--rack option or, if the latter is not specified, by Runtime.exec'ing the "dfs.network.script" script. If neither is specified, the DataNode belongs to the default rack. Correct?
Thank you,
Vasiliy
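
The lookup order described above (the -r/--rack option first, then the "dfs.network.script" script, then the default rack) can be sketched roughly as follows. This is not the code from rack.patch: the names RackResolver, ScriptRunner and resolve, the "/default-rack" placeholder, and the choice to fall back to the default when the script prints nothing are all assumptions made for illustration. Running the script is hidden behind an interface so the ordering logic stands on its own.

```java
// Hypothetical stand-in for "run the configured script and return its output".
interface ScriptRunner {
    String run(String scriptPath);
}

class RackResolver {
    // Placeholder name for the default rack; the real value is an assumption here.
    static final String DEFAULT_RACK = "/default-rack";

    // rackOption: value of -r/--rack, or null if not given.
    // scriptPath: value of dfs.network.script, or null if not configured.
    static String resolve(String rackOption, String scriptPath, ScriptRunner runner) {
        // 1. An explicit command-line option wins.
        if (rackOption != null && !rackOption.isEmpty()) {
            return rackOption;
        }
        // 2. Otherwise ask the configured script, if there is one.
        if (scriptPath != null && !scriptPath.isEmpty()) {
            String out = runner.run(scriptPath);
            if (out != null && !out.trim().isEmpty()) {
                return out.trim();
            }
        }
        // 3. Nothing specified (or, by assumption, empty script output).
        return DEFAULT_RACK;
    }
}
```

In a real DataNode the ScriptRunner would wrap something like Runtime.exec and read the script's standard output; a lambda returning a fixed string is enough to exercise the ordering.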

Nigel Daley wrote:
Hi Vasiliy :)
I have a question regarding task allocation to TaskTrackers (could not find an answer in the docs). When a MapReduce job is run, does the system attempt to schedule a Map task on a machine that contains a replica of the task's input data, or not?
Yes, the JobTracker attempts to schedule the map on a node containing that map's input split.
If yes, how does the system know which TaskTracker corresponds to which DataNode (by IP address, by host name, or by something else)?
See InputSplit.getLocations() (http://svn.apache.org/viewvc/lucene/hadoop/trunk/src/java/org/apache/hadoop/mapred/InputSplit.java?view=markup). Currently, host names are used, but I believe it's moving to IP addresses (see https://issues.apache.org/jira/browse/HADOOP-985).
Also, what happens if that fails?
The task is scheduled elsewhere. However, now that DataNodes are aware of the rack they are on (as of 0.11.0), the JobTracker needs to be modified so that its fallback is to attempt to locate the map on a node "close" to its data (on the same rack).
Cheers,
Nige
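
The fallback order discussed above (a node holding the input split, then a node on the same rack as the data, then any node) might look something like the sketch below. It is not JobTracker code: LocalityPicker, pick, and the rackOf map are hypothetical names, hosts are matched by plain host-name strings as the thread says is currently done, and the rack-local step is the proposed behaviour rather than something the JobTracker is stated to do.

```java
import java.util.List;
import java.util.Map;

class LocalityPicker {
    // splitHosts: hosts holding a replica of the map's input split.
    // freeTrackers: hosts of TaskTrackers that can accept a task right now.
    // rackOf: host name -> rack name (hypothetical lookup table).
    // Returns the chosen host, or null if no tracker is free.
    static String pick(List<String> splitHosts, List<String> freeTrackers,
                       Map<String, String> rackOf) {
        // 1. Prefer a tracker on a host that holds the data.
        for (String tracker : freeTrackers) {
            if (splitHosts.contains(tracker)) {
                return tracker;
            }
        }
        // 2. Otherwise prefer a tracker on the same rack as some replica.
        for (String tracker : freeTrackers) {
            String rack = rackOf.get(tracker);
            if (rack == null) {
                continue; // unknown rack: cannot claim it is "close"
            }
            for (String host : splitHosts) {
                if (rack.equals(rackOf.get(host))) {
                    return tracker;
                }
            }
        }
        // 3. Otherwise take whatever is free.
        return freeTrackers.isEmpty() ? null : freeTrackers.get(0);
    }
}
```

Because matching is by string equality, the TaskTracker and DataNode must report the same form of name for step 1 to succeed, which is presumably why the host name versus IP address question matters.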
