[ https://issues.apache.org/jira/browse/HDFS-8765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14702143#comment-14702143 ]
James Clampffer commented on HDFS-8765:
---------------------------------------
Hi Li,
The short-circuit interface will be zero-copy, at least in terms of userspace
copies: you provide a buffer, and data is read directly into it through a
pread call or similar.
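
To make the "read directly into your buffer" idea concrete, here is a minimal
sketch of that read path. It is not libhdfspp code: read_block_range is a
hypothetical name, and it assumes a file descriptor for the local block file
is already open (in the real short-circuit protocol the client receives that
descriptor from the DataNode). It only uses POSIX pread.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, not part of libhdfspp: fill a caller-owned buffer
// from an already-open local block file, starting at `offset`.
// Returns the number of bytes read (less than `len` only at end of file),
// or -1 on error with errno set by pread.
ssize_t read_block_range(int fd, void* buf, size_t len, off_t offset) {
    assert(buf != nullptr || len == 0);
    char* out = static_cast<char*>(buf);
    size_t total = 0;
    while (total < len) {
        // pread copies from the kernel page cache straight into the
        // caller's buffer; there is no intermediate userspace buffer.
        ssize_t n = ::pread(fd, out + total, len - total,
                            offset + static_cast<off_t>(total));
        if (n < 0) {
            if (errno == EINTR) continue;  // interrupted by a signal, retry
            return -1;
        }
        if (n == 0) break;  // end of file
        total += static_cast<size_t>(n);
    }
    return static_cast<ssize_t>(total);
}
```

Because pread takes an explicit offset and does not move the file position,
several readers can share one descriptor without coordinating seeks.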
I'm not planning on supporting the HDFS centralized cache in my initial
implementation, for two reasons. The first is simplicity; I'd like to get this
up and running as soon as possible. The second is that short-circuit reads
will automatically benefit from the local machine's page cache. On modern
operating systems the page cache works very well, and because this is
implemented in C++ we don't have to worry about pinning memory or some of the
issues that come with the JVM heap.
Once the first iteration is finished, I'd be really interested in seeing some
benchmarks for your use case to see whether explicit cache management would
help. It's certainly something I've thought about adding later on.
> Implement local block reader in libhdfspp
> -----------------------------------------
>
> Key: HDFS-8765
> URL: https://issues.apache.org/jira/browse/HDFS-8765
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs-client
> Reporter: James Clampffer
> Assignee: James Clampffer
>
> Implement a block reader that uses the HDFS short-circuit protocol to read
> colocated data as efficiently as possible. Implementation will be based on
> BlockReaderLocal.java + the associated JNI bindings.