[ https://issues.apache.org/jira/browse/HDFS-11234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15741243#comment-15741243 ]

ASF GitHub Bot commented on HDFS-11234:
---------------------------------------

GitHub user subahugu opened a pull request:

    https://github.com/apache/hadoop/pull/172

    HDFS-11234: Made the socket buffer size configurable with the config …

    …node fs.hdfs.data.socket.size to be set in core-site.xml. If the node is
    not found, the default value is -1, so socket buffer size would not be set.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/subahugu/hadoop HDFS-11234

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hadoop/pull/172.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #172
    
----
commit e08009d41a706d8171da3179c122114eefad46ec
Author: suresh.bahuguna
Date:   2016-12-12T07:43:33Z

    HDFS-11234: Made the socket buffer size configurable with the config node
    fs.hdfs.data.socket.size to be set in core-site.xml. If the node is not
    found, the default value is -1, so socket buffer size would not be set.

----
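
For readers who want to see the behaviour described in the commit message, here is a minimal sketch in plain Java. It is not the code from the pull request: the class and method names are hypothetical, `java.util.Properties` stands in for Hadoop's configuration object, and setting both the send and receive buffers is an assumption, since the message does not say which one the patch touches. Only the property name `fs.hdfs.data.socket.size` and the default of -1 come from the message above.

```java
import java.io.IOException;
import java.net.Socket;
import java.net.SocketException;
import java.util.Properties;

public class SocketBufferConfigSketch {
    // Property name taken from the commit message.
    public static final String KEY = "fs.hdfs.data.socket.size";
    // Default from the commit message: -1 means "leave the buffer size alone".
    public static final int DEFAULT_SIZE = -1;

    /** Returns the configured size, or -1 when the property is absent. */
    public static int configuredSize(Properties conf) {
        String raw = conf.getProperty(KEY, String.valueOf(DEFAULT_SIZE));
        return Integer.parseInt(raw.trim());
    }

    /**
     * Sets the socket buffers only when the size is positive.
     * Returns true if the buffers were set, false if they were left
     * for the operating system to manage.
     */
    public static boolean applyBufferSize(Socket socket, int size)
            throws SocketException {
        if (size <= 0) {
            return false; // property missing or disabled: do not touch the socket
        }
        // Assumption: both directions are set; the real patch may set only one.
        socket.setReceiveBufferSize(size);
        socket.setSendBufferSize(size);
        return true;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties(); // no property set, so -1 applies
        try (Socket socket = new Socket()) { // unconnected socket, no network I/O
            System.out.println("buffers set: "
                    + applyBufferSize(socket, configuredSize(conf)));
            conf.setProperty(KEY, "262144"); // hypothetical explicit value
            System.out.println("buffers set: "
                    + applyBufferSize(socket, configuredSize(conf)));
        }
    }
}
```

The buffer sizes are applied before the socket is connected, because the receive buffer generally has to be set before connecting for it to influence the advertised TCP window.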


> distcp performance is suboptimal for high bandwidth/high latency setups
> -----------------------------------------------------------------------
>
>                 Key: HDFS-11234
>                 URL: https://issues.apache.org/jira/browse/HDFS-11234
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 2.7.1
>            Reporter: Suresh Bahuguna
>
> Because distcp uses a TCP socket with the buffer size set to 128K,
> throughput is quite poor on a setup that has very high bandwidth but also
> very high latency. This is because TCP stops sending more data until it
> receives ACKs for the data already in flight. By not setting the socket
> buffer size and letting the Linux kernel manage it, we should be able to get
> optimal performance.
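
The effect described in the issue can be put into numbers with a small back-of-the-envelope calculation. The sketch below is illustrative only: the 128K window is from the description, while the 100 ms round-trip time and the 1 Gbit/s link speed are assumed values, and the class and method names are hypothetical. It uses the usual approximation that a sender with a fixed window can have at most one window of unacknowledged data per round trip.

```java
public class WindowLimitSketch {
    /**
     * Upper bound on throughput in bytes per second when at most
     * windowBytes may be unacknowledged at any time.
     */
    public static double maxThroughputBytesPerSec(long windowBytes,
                                                  double rttSeconds) {
        return windowBytes / rttSeconds;
    }

    /**
     * Bandwidth-delay product: the number of bytes that must be in
     * flight to keep a link of the given speed full.
     */
    public static double bandwidthDelayProductBytes(double linkBitsPerSec,
                                                    double rttSeconds) {
        return linkBitsPerSec / 8.0 * rttSeconds;
    }

    public static void main(String[] args) {
        long window = 128 * 1024; // 128K buffer from the issue description
        double rtt = 0.1;         // assumed round-trip time: 100 ms
        double link = 1e9;        // assumed link speed: 1 Gbit/s

        double mbitPerSec = maxThroughputBytesPerSec(window, rtt) * 8 / 1e6;
        double neededMb = bandwidthDelayProductBytes(link, rtt) / 1e6;

        System.out.printf("throughput ceiling with 128K window: %.1f Mbit/s%n",
                mbitPerSec);
        System.out.printf("buffer needed to fill the link: %.1f MB%n", neededMb);
    }
}
```

Under these assumed numbers the fixed 128K window caps a single connection at roughly 10.5 Mbit/s, while filling the link would need about 12.5 MB in flight, which is why a fixed small buffer hurts on high-latency paths.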



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
