[ https://issues.apache.org/jira/browse/HDFS-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13733186#comment-13733186 ]
Lars Hofhansl commented on HDFS-2834:
-------------------------------------
Just for reference: with many open files, one can easily hit an OOM on direct
buffer memory. See HBASE-8143.
1MB seems to be a rather large default.
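
To put rough numbers on that concern, here is a minimal sketch of the arithmetic. It assumes that each open stream holds exactly one direct buffer of the default size, which is a simplification rather than something confirmed by the patch. The class name, the method names and the 256MB limit are hypothetical and chosen only for illustration.

```java
public class DirectBufferBudget {
    /** Direct memory needed if every open stream holds one buffer (assumption). */
    static long requiredBytes(int openStreams, int bufferBytes) {
        return (long) openStreams * bufferBytes;
    }

    /** Number of such streams that fit under a given direct-memory limit. */
    static long maxStreams(long directLimitBytes, int bufferBytes) {
        return directLimitBytes / bufferBytes;
    }

    public static void main(String[] args) {
        int bufferBytes = 1 << 20;   // the 1MB default discussed above
        long limit = 256L << 20;     // hypothetical limit, e.g. -XX:MaxDirectMemorySize=256m

        // Streams that fit under the hypothetical limit: 256
        System.out.println(maxStreams(limit, bufferBytes));
        // Bytes needed for 10,000 open files: 10485760000 (about 9.8GB)
        System.out.println(requiredBytes(10_000, bufferBytes));
    }
}
```

Under these assumptions a process with a few hundred open files already reaches a 256MB direct-memory limit, which is why a smaller default may be worth considering.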
> ByteBuffer-based read API for DFSInputStream
> --------------------------------------------
>
> Key: HDFS-2834
> URL: https://issues.apache.org/jira/browse/HDFS-2834
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs-client, performance
> Reporter: Henry Robinson
> Assignee: Henry Robinson
> Fix For: 2.0.2-alpha
>
> Attachments: HDFS-2834.10.patch, HDFS-2834.11.patch,
> HDFS-2834.3.patch, HDFS-2834.4.patch, HDFS-2834.5.patch, HDFS-2834.6.patch,
> HDFS-2834.7.patch, HDFS-2834.8.patch, HDFS-2834.9.patch,
> hdfs-2834-libhdfs-benchmark.png, HDFS-2834-no-common.patch, HDFS-2834.patch,
> HDFS-2834.patch
>
>
> The {{DFSInputStream}} read path always copies bytes into a JVM-allocated
> {{byte[]}}. Although for many clients this is desired behaviour, in certain
> situations, such as native reads through libhdfs, this imposes an extra copy
> penalty since the {{byte[]}} needs to be copied out again into a natively
> readable memory area.
> For these cases, it would be preferable to allow the client to supply its own
> buffer, wrapped in a {{ByteBuffer}}, to avoid that final copy overhead.
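
The idea in the description (the caller supplies the destination buffer, so no intermediate byte[] is needed) can be sketched with plain java.nio. This is only a stand-in: it reads a local temp file through FileChannel, not HDFS, and the class and helper names are hypothetical. It does not show the API added by the patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class CallerSuppliedBufferRead {
    /** Reads into buf until it is full or EOF is reached; returns bytes read. */
    static int readFully(FileChannel ch, ByteBuffer buf) throws IOException {
        int total = 0;
        while (buf.hasRemaining()) {
            int n = ch.read(buf);   // bytes land directly in the caller's buffer
            if (n < 0) {
                break;              // end of file
            }
            total += n;
        }
        return total;
    }

    /** Writes a short payload (at most 64 bytes) to a temp file and reads it back. */
    static String roundTrip(String payload) {
        try {
            Path tmp = Files.createTempFile("bb-read", ".bin");
            try {
                Files.write(tmp, payload.getBytes(StandardCharsets.UTF_8));
                // Caller-owned, off-heap buffer: natively readable without a further copy.
                ByteBuffer buf = ByteBuffer.allocateDirect(64);
                try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    readFully(ch, buf);
                }
                buf.flip();         // switch the buffer from filling to draining
                return StandardCharsets.UTF_8.decode(buf).toString();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Because the caller allocates the buffer, the caller also controls its size and lifetime, which matters for the direct-memory budget raised in the comment above.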