[ https://issues.apache.org/jira/browse/HDFS-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595334#comment-13595334 ]
Mark Wagner commented on HDFS-4551:
-----------------------------------
Hi Nicholas,
I have observed a significant performance increase (7-8x) when copying a 1GB
file if the server has io.file.buffer.size set to 64kB (and the buffer size
isn't specified in the request). You can of course manually set the buffer size
to 4096 bytes in the request, but that also affects the buffer size used to
open the file.
I think we may have gotten crossed up about what is changing. This patch only
changes the buffer size used to copy from the FileInputStream onto the network.
My understanding is that both WebHDFS and hftp on trunk eventually end up at:
{code:title=IOUtils.java|borderStyle=solid}
public static void copyBytes(InputStream in, OutputStream out, long count,
    boolean close) throws IOException {
  byte buf[] = new byte[4096];
  long bytesRemaining = count;
{code}
which is what this patch is trying to match. Is that your understanding also?
There's an argument to be made that this should be configurable, but I figured
it best to copy what trunk does.
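For illustration only, here is a minimal, self-contained sketch of the kind of fixed-buffer copy loop the snippet above implies (class name, exception message, and loop details are my own; this is not the actual Hadoop implementation):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch: copy a fixed count of bytes with a 4096-byte buffer. */
public class CopySketch {
    // Copies exactly 'count' bytes from in to out using a fixed 4KB buffer,
    // mirroring the buffer-size behavior this patch is trying to match.
    static void copyBytes(InputStream in, OutputStream out, long count)
            throws IOException {
        byte[] buf = new byte[4096];
        long bytesRemaining = count;
        while (bytesRemaining > 0) {
            int toRead = (int) Math.min(buf.length, bytesRemaining);
            int n = in.read(buf, 0, toRead);
            if (n == -1) {
                throw new IOException("Premature EOF copying bytes");
            }
            out.write(buf, 0, n);
            bytesRemaining -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyBytes(new ByteArrayInputStream(data), out, data.length);
        System.out.println(out.size());
    }
}
```

The point is just that the buffer allocation is a hard-coded 4096 bytes, independent of io.file.buffer.size.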
> Change WebHDFS buffersize behavior to improve default performance
> -----------------------------------------------------------------
>
> Key: HDFS-4551
> URL: https://issues.apache.org/jira/browse/HDFS-4551
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 1.1.2
> Reporter: Mark Wagner
> Assignee: Mark Wagner
> Attachments: HDFS-4551.1.patch
>
>
> Currently on 1.X branch, the buffer size used to copy bytes to network
> defaults to io.file.buffer.size. This causes performance problems if that
> buffersize is large.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira