[ https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565919#action_12565919 ]

Owen O'Malley commented on HADOOP-2027:
---------------------------------------

Note that we also need Map/Reduce to use the new method so that it only does 
one call per file to get block sizes. This would require FileSplit to have a 
new constructor that takes an array of locations rather than computing them on 
demand. The locations do NOT need to be serialized in the readFields/write 
methods. FileInputFormat should use a single call to getFileLocations rather 
than the current getSize, getBlockSize, and getFileCacheHints (down in 
FileSplit).
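
A minimal sketch of what such a split class could look like. The class name 
LocatedFileSplit, the use of Text for serializing the path, and the choice to 
simply drop the hosts on deserialization are illustrative assumptions, not a 
committed design:

{code}
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;

/** Sketch: a file split that is handed its block locations up front. */
public class LocatedFileSplit implements Writable {
  private Path file;
  private long start;
  private long length;
  // Supplied by FileInputFormat from one getFileLocations() call per file;
  // deliberately NOT serialized in write()/readFields().
  private String[] hosts;

  public LocatedFileSplit() {}   // needed for Writable deserialization

  public LocatedFileSplit(Path file, long start, long length, String[] hosts) {
    this.file = file;
    this.start = start;
    this.length = length;
    this.hosts = hosts;
  }

  public long getLength() { return length; }

  public String[] getLocations() {
    // No call back into the FileSystem here; the hosts were filled in once
    // when the splits were built.
    return hosts == null ? new String[0] : hosts;
  }

  public void write(DataOutput out) throws IOException {
    Text.writeString(out, file.toString());
    out.writeLong(start);
    out.writeLong(length);
    // hosts intentionally omitted from the wire format
  }

  public void readFields(DataInput in) throws IOException {
    file = new Path(Text.readString(in));
    start = in.readLong();
    length = in.readLong();
    hosts = null;   // only the job scheduler needs the locations
  }
}
{code}

The key property is that getLocations() never goes back to the FileSystem; the 
hosts are computed once when the splits are constructed.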

> FileSystem should provide byte ranges for file locations
> --------------------------------------------------------
>
>                 Key: HADOOP-2027
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2027
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: fs
>            Reporter: Owen O'Malley
>            Assignee: lohit vijayarenu
>
> FileSystem's getFileCacheHints should be replaced with something more useful. 
> I'd suggest a new method:
> {code}
> BlockLocation[] getFileLocations(Path file, long offset, long range)
>     throws IOException;
> {code}
> and adding
> {code}
> class BlockLocation implements Writable {
>   String[] getHosts();
>   long getOffset();
>   long getLength();
> }
> {code}
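
For illustration, here is roughly how an InputFormat could consume the proposed 
call to build one host-annotated split per block with a single locations lookup 
per file. The split-per-block policy, the fileLen parameter, and the 
LocatedFileSplit class sketched above are assumptions; getFileLocations and 
BlockLocation are the API proposed in this issue, not an existing FileSystem 
method:

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SplitSketch {
  /** Builds one split per block, reusing the block's hosts directly. */
  static List<LocatedFileSplit> splitsFor(FileSystem fs, Path file, long fileLen)
      throws IOException {
    // A single call per file, replacing the current
    // getSize + getBlockSize + getFileCacheHints sequence.
    BlockLocation[] blocks = fs.getFileLocations(file, 0, fileLen);
    List<LocatedFileSplit> splits = new ArrayList<LocatedFileSplit>();
    for (BlockLocation b : blocks) {
      splits.add(new LocatedFileSplit(file, b.getOffset(), b.getLength(),
                                      b.getHosts()));
    }
    return splits;
  }
}
{code}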
