[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202685#comment-13202685
 ] 

Siddharth Seth commented on MAPREDUCE-3815:
-------------------------------------------

Looked at this a little more. 
This shows up when a split spans multiple blocks. 
{{getFileBlockLocations}} always returns hostnames. When a split covers 
multiple blocks, mapred.FileInputFormat ends up using 
{{BlockLocation.getTopologyPaths}}, which returns IP addresses, instead of the 
hostnames from {{getFileBlockLocations}}.
Will open a MR / HDFS jira once I can find out how this API behaves in the 1.0 
line. Does anyone happen to know?

Meanwhile, changing the description and posting a patch to have the AM resolve 
IPs if they show up.
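
For illustration only, here is a rough sketch of what "resolve IPs if they show 
up" could look like. This is not the attached patch (MR3815.txt). The class and 
method names are hypothetical, IPv6 is ignored, and the topology path is assumed 
to have the form "/rack/host:port". The reverse lookup is passed in as a 
parameter so the logic can be exercised without DNS.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: normalizes split locations to hostnames.
class HostResolverSketch {
    // Dotted-quad IPv4 literals only; IPv6 is out of scope for this sketch.
    private static final Pattern IPV4 =
        Pattern.compile("^(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})\\.(\\d{1,3})$");

    // True if the string is an IPv4 literal with every octet in 0..255.
    public static boolean looksLikeIpv4(String s) {
        Matcher m = IPV4.matcher(s);
        if (!m.matches()) {
            return false;
        }
        for (int i = 1; i <= 4; i++) {
            if (Integer.parseInt(m.group(i)) > 255) {
                return false;
            }
        }
        return true;
    }

    // Assumes a path like "/default-rack/10.0.0.1:50010": takes the last
    // component and drops the ":port" suffix if one is present.
    public static String hostFromTopologyPath(String path) {
        String last = path.substring(path.lastIndexOf('/') + 1);
        int colon = last.lastIndexOf(':');
        return colon >= 0 ? last.substring(0, colon) : last;
    }

    // Leaves hostnames untouched; hands IPv4 literals to the supplied lookup.
    public static String toHostname(String host, UnaryOperator<String> reverseLookup) {
        return looksLikeIpv4(host) ? reverseLookup.apply(host) : host;
    }

    // DNS-backed lookup. getHostName() falls back to the textual IP when no
    // reverse mapping can be found.
    public static String dnsReverseLookup(String ip) {
        try {
            return InetAddress.getByName(ip).getHostName();
        } catch (UnknownHostException e) {
            return ip;
        }
    }
}
```

A caller would pass each location through 
toHostname(host, HostResolverSketch::dnsReverseLookup) before using it in a 
container request, so that entries already given as hostnames are left alone.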
                
> Data Locality suffers if HDFS returns IPs in getFileBlockLocations
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3815
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: MR3815.txt
>
>
> BlockLocation.getHosts() returns IP addresses occasionally. Data locality is 
> affected - since the RM requires hostnames.


        
