[ https://issues.apache.org/jira/browse/HADOOP-7754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13136709#comment-13136709 ]
Eli Collins commented on HADOOP-7754:
-------------------------------------
This approach seems reasonable to me. The performance improvement requires some
implementation-level knowledge of the underlying stream, and this seems to be a
minimal way to expose it.
Thinking ahead, we'd like to perform these optimizations on a per-file basis, so
we'll need a way to map the flags passed when opening a file to its blocks and
plumb them through (as part of DataXferProtocol). In that case we could use a
particular flag, or the presence of particular optimization flags, to indicate
whether a stream has a file descriptor, but I think an explicit interface is
cleaner.
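To make the "explicit interface" option concrete, here is a minimal,
self-contained sketch. It is not taken from the attached patch: the interface
name HasFileDescriptor is only guessed from the attachment name (hasfd.txt),
and LocalFdInputStream and FdProbe are hypothetical stand-ins that use plain
java.io classes instead of the real Hadoop stream types.

```java
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Assumption: a small opt-in interface that a stream implements only when it
// is backed by a real OS-level file descriptor.
interface HasFileDescriptor {
    FileDescriptor getFileDescriptor() throws IOException;
}

// Hypothetical stand-in for a local-filesystem stream wrapper.
class LocalFdInputStream extends FilterInputStream implements HasFileDescriptor {
    private final FileInputStream fis;

    LocalFdInputStream(File file) throws IOException {
        this(new FileInputStream(file));
    }

    private LocalFdInputStream(FileInputStream fis) {
        super(fis);
        this.fis = fis;
    }

    @Override
    public FileDescriptor getFileDescriptor() throws IOException {
        // FileInputStream.getFD() returns the descriptor of the open file.
        return fis.getFD();
    }
}

// Hypothetical helper for FileSystem-agnostic callers.
class FdProbe {
    // Returns the stream's descriptor if it exposes one, otherwise null,
    // in which case the caller simply skips the fadvise-style optimization.
    static FileDescriptor descriptorOf(InputStream in) throws IOException {
        if (in instanceof HasFileDescriptor) {
            return ((HasFileDescriptor) in).getFileDescriptor();
        }
        return null;
    }
}
```

With something like this, shuffle code that only sees a generic stream can
test for the interface and fall back cleanly when it is absent, without adding
a method to every stream's public API.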
> Expose file descriptors from Hadoop-wrapped local FileSystems
> -------------------------------------------------------------
>
> Key: HADOOP-7754
> URL: https://issues.apache.org/jira/browse/HADOOP-7754
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: native
> Affects Versions: 0.23.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Attachments: hasfd.txt
>
>
> In HADOOP-7714, we determined that using fadvise inside of the MapReduce
> shuffle can yield very good performance improvements. But many parts of the
> shuffle are FileSystem-agnostic and thus operate on FSDataInputStreams and
> RawLocalFileSystems. This JIRA is to figure out how to allow
> RawLocalFileSystem to expose its FileDescriptor object without unnecessarily
> polluting the public APIs.