[ 
https://issues.apache.org/jira/browse/HADOOP-7754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13136709#comment-13136709
 ] 

Eli Collins commented on HADOOP-7754:
-------------------------------------

This approach seems reasonable to me. The performance improvement is going to 
require some implementation-level knowledge of the underlying stream, and this 
seems to be a minimal way to expose it.

Thinking ahead, we'd like to perform these optimizations on a per-file basis, so 
we'll need a way to map and plumb flags passed when opening the file down to 
blocks (as part of DataXferProtocol). In that case we could use a particular 
flag, or the presence of particular optimization flags, to indicate whether a 
stream has a file descriptor, but I think an explicit interface is cleaner.
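
To make the "explicit interface" option concrete, here is a minimal sketch of 
what it could look like. It is not taken from the attached patch: the names 
HasFileDescriptor, LocalFdInputStream, and FdSketch are hypothetical and used 
only for illustration, and the sketch relies on nothing beyond the standard 
java.io classes. FileSystem-agnostic code checks for the interface and falls 
back gracefully when a stream does not expose a descriptor.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical interface: implemented only by streams backed by a real fd.
interface HasFileDescriptor {
    FileDescriptor getFileDescriptor() throws IOException;
}

// Hypothetical local-file stream that opts in to exposing its descriptor.
class LocalFdInputStream extends FilterInputStream implements HasFileDescriptor {
    private final FileInputStream fis;

    LocalFdInputStream(File file) throws IOException {
        this(new FileInputStream(file));
    }

    private LocalFdInputStream(FileInputStream fis) {
        super(fis);
        this.fis = fis;
    }

    @Override
    public FileDescriptor getFileDescriptor() throws IOException {
        return fis.getFD();
    }
}

class FdSketch {
    // Returns the descriptor if the stream exposes one, otherwise null,
    // so callers can skip fd-based optimizations on other stream types.
    static FileDescriptor descriptorOrNull(InputStream in) throws IOException {
        if (in instanceof HasFileDescriptor) {
            return ((HasFileDescriptor) in).getFileDescriptor();
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("fd-sketch", ".bin");
        tmp.deleteOnExit();
        try (InputStream local = new LocalFdInputStream(tmp);
             InputStream generic = new ByteArrayInputStream(new byte[0])) {
            System.out.println("local has fd: " + (descriptorOrNull(local) != null));
            System.out.println("generic has fd: " + (descriptorOrNull(generic) != null));
        }
    }
}
```

With this shape, callers need no flag plumbing to discover whether a 
descriptor is available; the type of the stream answers the question.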
                
> Expose file descriptors from Hadoop-wrapped local FileSystems
> -------------------------------------------------------------
>
>                 Key: HADOOP-7754
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7754
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: native
>    Affects Versions: 0.23.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hasfd.txt
>
>
> In HADOOP-7714, we determined that using fadvise inside of the MapReduce 
> shuffle can yield very good performance improvements. But many parts of the 
> shuffle are FileSystem-agnostic and thus operate on FSDataInputStreams and 
> RawLocalFileSystems. This JIRA is to figure out how to allow 
> RawLocalFileSystem to expose its FileDescriptor object without unnecessarily 
> polluting the public APIs.


        
