[ https://issues.apache.org/jira/browse/HDFS-4551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595334#comment-13595334 ]
Mark Wagner commented on HDFS-4551:
-----------------------------------
Hi Nicholas,
I have observed a significant performance increase (7-8x) when copying a 1GB
file if the server has io.file.buffer.size set to 64kB (and the buffer size
isn't specified in the request). You can of course manually set the buffer size
to 4096 bytes in the request, but that also affects the buffer size used to
open the file.
I think we may have gotten crossed up about what is changing. This patch only
changes the buffer size used to copy from the FileInputStream onto the network.
My understanding is that both WebHDFS and hftp on trunk eventually end up at:
{code:title=IOUtils.java|borderStyle=solid}
public static void copyBytes(InputStream in, OutputStream out, long count,
    boolean close) throws IOException {
  byte buf[] = new byte[4096];
  long bytesRemaining = count;
{code}
which is what this patch is trying to match. Is that your understanding also?
There's an argument to be made that this should be configurable, but I figured
it best to copy what trunk does.
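For illustration only, here is a minimal, self-contained sketch of the kind of fixed-buffer copy loop the snippet above implies (class name, exception message, and loop details are my own; this is not the actual Hadoop implementation):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch: copy a fixed count of bytes with a 4096-byte buffer. */
public class CopySketch {
    // Copies exactly 'count' bytes from in to out using a fixed 4KB buffer,
    // mirroring the buffer-size behavior this patch is trying to match.
    static void copyBytes(InputStream in, OutputStream out, long count)
            throws IOException {
        byte[] buf = new byte[4096];
        long bytesRemaining = count;
        while (bytesRemaining > 0) {
            int toRead = (int) Math.min(buf.length, bytesRemaining);
            int n = in.read(buf, 0, toRead);
            if (n == -1) {
                throw new IOException("Premature EOF copying bytes");
            }
            out.write(buf, 0, n);
            bytesRemaining -= n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10000];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyBytes(new ByteArrayInputStream(data), out, data.length);
        System.out.println(out.size());
    }
}
```

The point is just that the buffer allocation is a hard-coded 4096 bytes, independent of io.file.buffer.size.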
> Change WebHDFS buffersize behavior to improve default performance
> -----------------------------------------------------------------
>
> Key: HDFS-4551
> URL: https://issues.apache.org/jira/browse/HDFS-4551
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: webhdfs
> Affects Versions: 1.1.2
> Reporter: Mark Wagner
> Assignee: Mark Wagner
> Attachments: HDFS-4551.1.patch
>
>
> Currently on 1.X branch, the buffer size used to copy bytes to network
> defaults to io.file.buffer.size. This causes performance problems if that
> buffersize is large.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira