[
https://issues.apache.org/jira/browse/HADOOP-2027?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12565919#action_12565919
]
Owen O'Malley commented on HADOOP-2027:
---------------------------------------
Note that we also need Map/Reduce to use the new method so that it only does
one call per file to get block locations. This would require that FileSplit
have a new constructor that takes an array of locations rather than computing
them on demand. The locations do NOT need to be serialized in the
readFields/write methods. FileInputFormat should use a single call to
getFileLocations rather than the current getSize, getBlockSize, and
getFileCacheHints calls (down in FileSplit).
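A minimal sketch of what that FileSplit change could look like, assuming the
hosts array is handed in by FileInputFormat up front; field and accessor names
here are illustrative, not the committed patch, and the hosts are deliberately
left out of write()/readFields():
{code}
// Sketch only: illustrative, not the committed patch. Hosts are supplied up
// front by FileInputFormat and are never serialized.
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputSplit;

public class FileSplit implements InputSplit {
  private Path file;
  private long start;
  private long length;
  private String[] hosts;   // filled in by FileInputFormat, never serialized

  public FileSplit() {}     // needed for Writable deserialization

  public FileSplit(Path file, long start, long length, String[] hosts) {
    this.file = file;
    this.start = start;
    this.length = length;
    this.hosts = hosts;
  }

  public Path getPath() { return file; }
  public long getStart() { return start; }
  public long getLength() { return length; }

  public String[] getLocations() {
    // No FileSystem call here: the locations were fetched once, in getSplits().
    return hosts == null ? new String[0] : hosts;
  }

  public void write(DataOutput out) throws IOException {
    Text.writeString(out, file.toString());
    out.writeLong(start);
    out.writeLong(length);
    // hosts intentionally not written; tasks do not need them
  }

  public void readFields(DataInput in) throws IOException {
    file = new Path(Text.readString(in));
    start = in.readLong();
    length = in.readLong();
    hosts = null;
  }
}
{code}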
> FileSystem should provide byte ranges for file locations
> --------------------------------------------------------
>
> Key: HADOOP-2027
> URL: https://issues.apache.org/jira/browse/HADOOP-2027
> Project: Hadoop Core
> Issue Type: Bug
> Components: fs
> Reporter: Owen O'Malley
> Assignee: lohit vijayarenu
>
> FileSystem's getFileCacheHints should be replaced with something more useful.
> I'd suggest replacing getFileCacheHints with a new method:
> {code}
> BlockLocation[] getFileLocations(Path file, long offset, long range)
>     throws IOException;
> {code}
> and adding
> {code}
> class BlockLocation implements Writable {
>   String[] getHosts();
>   long getOffset();
>   long getLength();
> }
> {code}
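For illustration, a hedged sketch of how FileInputFormat's getSplits() might
consume the API proposed above, together with the FileSplit constructor from
the comment. Both getFileLocations() and BlockLocation are the proposed
additions, not existing FileSystem API, and the real getSplits() would still
apply its goal-size/min-size logic:
{code}
// Sketch only: assumes the proposed getFileLocations()/BlockLocation API and
// the FileSplit(path, offset, length, hosts) constructor described above.
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GetSplitsSketch {
  static List<FileSplit> splitsFor(FileSystem fs, Path path, long fileLength)
      throws IOException {
    // One call per file for all block offsets, lengths, and hosts.
    BlockLocation[] blocks = fs.getFileLocations(path, 0, fileLength);
    List<FileSplit> splits = new ArrayList<FileSplit>(blocks.length);
    for (BlockLocation blk : blocks) {
      // Hosts ride along in the split, so FileSplit never calls back
      // into the FileSystem to compute its locations.
      splits.add(new FileSplit(path, blk.getOffset(), blk.getLength(),
                               blk.getHosts()));
    }
    return splits;
  }
}
{code}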