[ https://issues.apache.org/jira/browse/HADOOP-6026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12718753#action_12718753 ]
dhruba borthakur commented on HADOOP-6026:
------------------------------------------

One drawback of the above approach is that the mapping of a hostname to its rack location would be permanent for the lifetime of a JobTracker. To accommodate a more rapidly changing network topology, we can expire items from the cache after an hour or so.

> Improve the performance efficiency of task initialization at the JobTracker
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-6026
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6026
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: dhruba borthakur
>            Assignee: Zheng Shao
>
> The JobTracker reads the splits for a job at Job Initialization time. Then,
> for each location in the split, it invokes DNSToSwitchMapping.resolve().
> This, in turn, typically invokes an external script that resolves the
> hostname to a network rack location. The time spent in invoking this external
> script can be reduced if the hostnames and their rack locations are inserted
> into a cache. JobTracker.resolveAndAddToTopology() can look up this cache
> first and avoid invoking the external "resolve" script in most cases.
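
As a rough illustration of the expiring cache suggested in the comment above, here is a minimal sketch in plain Java. It is not the JobTracker code and not part of any patch on this issue: the class name ExpiringRackCache, the injected resolver function (standing in for the call that ends up running the external script), the TTL parameter, and the injectable clock are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical hostname -> rack cache whose entries expire after a fixed TTL.
 * Illustrative only; none of these names come from Hadoop.
 */
class ExpiringRackCache {

    /** A cached rack location plus the time it was resolved. */
    private static final class Entry {
        final String rack;
        final long loadedAtMillis;

        Entry(String rack, long loadedAtMillis) {
            this.rack = rack;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final Function<String, String> resolver; // assumed stand-in for the slow script-backed lookup
    private final long ttlMillis;                    // e.g. one hour
    private final LongSupplier clock;                // injectable so expiry can be tested

    ExpiringRackCache(Function<String, String> resolver, long ttlMillis, LongSupplier clock) {
        this.resolver = resolver;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached rack if it is still fresh; otherwise re-resolves and caches it. */
    String resolve(String hostname) {
        long now = clock.getAsLong();
        Entry entry = cache.get(hostname);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            // Two threads may resolve the same host concurrently; the last
            // write wins, which costs an extra lookup but is otherwise harmless.
            entry = new Entry(resolver.apply(hostname), now);
            cache.put(hostname, entry);
        }
        return entry.rack;
    }
}
```

With a TTL of one hour and System::currentTimeMillis as the clock, repeated lookups for the same host would invoke the resolver at most about once per hour, so a topology change would be picked up within that window without restarting the JobTracker. Expired entries for hosts that are never queried again stay in the map in this sketch; a real implementation might also want to evict them.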