[
https://issues.apache.org/jira/browse/HDFS-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kitti Nanasi updated HDFS-13719:
--------------------------------
Status: Patch Available (was: Open)
> Docs around dfs.image.transfer.timeout are misleading
> -----------------------------------------------------
>
> Key: HDFS-13719
> URL: https://issues.apache.org/jira/browse/HDFS-13719
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.1.0
> Reporter: Kitti Nanasi
> Assignee: Kitti Nanasi
> Priority: Major
> Labels: hdfs
> Attachments: HDFS-13719.001.patch
>
>
> The Jira https://issues.apache.org/jira/browse/HDFS-1490 added the parameter
> dfs.image.transfer.timeout to HDFS. From the patch (and checking the current
> code), we can see this parameter governs a socket timeout on a
> java.net.HttpURLConnection object:
> {code:java}
> + if (timeout <= 0) {
> + // Set the ping interval as timeout
> + Configuration conf = new HdfsConfiguration();
> + timeout = conf.getInt(DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_KEY,
> + DFSConfigKeys.DFS_IMAGE_TRANSFER_TIMEOUT_DEFAULT);
> + }
> +
> + if (timeout > 0) {
> + connection.setConnectTimeout(timeout);
> + connection.setReadTimeout(timeout);
> + }
> +
> {code}
> In the code above, 'connection' is a java.net.HttpURLConnection.
> There is a general belief in the community that dfs.image.transfer.timeout
> is the time within which the entire image must transfer, but that does not
> appear to be the case. The timeout is actually the maximum time the client
> will block on the socket waiting for data to read before giving up. The idea
> appears to be to protect the client from hanging forever if the server
> hangs.
> The docs in hdfs-site.xml are partly what causes this confusion, as they are
> very misleading:
> {code:xml}
> <property>
> <name>dfs.image.transfer.timeout</name>
> <value>60000</value>
> <description>
> Socket timeout for image transfer in milliseconds. This timeout and
> the related
> dfs.image.transfer.bandwidthPerSec parameter should be configured such
> that normal image transfer can complete successfully.
> This timeout prevents client hangs when the sender fails during
> image transfer. This is socket timeout during image tranfer.
> </description>
> </property>
> {code}
> The start and end of the description are accurate, but the part "This timeout
> and the related dfs.image.transfer.bandwidthPerSec parameter should be
> configured such that normal image transfer can complete successfully." is
> misleading. Because the timeout bounds a single blocking read rather than the
> whole transfer, there is almost never a reason to change it in conjunction
> with the bandwidth setting.
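
The per-read nature of the timeout can be illustrated with a small standalone
sketch. This is not Hadoop code: the class ReadTimeoutDemo, the method
transfer, the "/image" path and all timings are hypothetical, and a local
com.sun.net.httpserver.HttpServer stands in for the sending side. It assumes
only the documented behaviour of HttpURLConnection.setReadTimeout. A transfer
whose total duration exceeds the timeout should still complete as long as each
individual read returns in time, while a sender that stalls longer than the
timeout should produce a SocketTimeoutException.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.net.URI;

public class ReadTimeoutDemo {

  /**
   * Serves `chunks` 1 KB chunks from a local test server, sleeping `delayMs`
   * before each one, and reads them with connect/read timeouts of `timeoutMs`.
   * Returns true if the whole body arrived, false if a read timed out.
   */
  public static boolean transfer(int chunks, long delayMs, int timeoutMs) {
    HttpServer server;
    try {
      // Port 0 lets the OS pick a free port (hypothetical stand-in server).
      server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    server.createContext("/image", exchange -> {
      // A response length of 0 selects chunked transfer encoding.
      exchange.sendResponseHeaders(200, 0);
      try (OutputStream out = exchange.getResponseBody()) {
        for (int i = 0; i < chunks; i++) {
          Thread.sleep(delayMs);      // simulate a slow or stalled sender
          out.write(new byte[1024]);
          out.flush();
        }
      } catch (InterruptedException | IOException e) {
        // The client has most likely given up; nothing more to send.
      }
    });
    server.start();
    try {
      int port = server.getAddress().getPort();
      HttpURLConnection connection = (HttpURLConnection)
          URI.create("http://127.0.0.1:" + port + "/image").toURL()
              .openConnection();
      // Same two calls as in the HDFS-1490 patch quoted above.
      connection.setConnectTimeout(timeoutMs);
      connection.setReadTimeout(timeoutMs);
      try (InputStream in = connection.getInputStream()) {
        byte[] buf = new byte[4096];
        while (in.read(buf) != -1) {
          // Each read may block for up to timeoutMs; the total is unbounded.
        }
      }
      return true;
    } catch (SocketTimeoutException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      server.stop(0);
    }
  }

  public static void main(String[] args) {
    // About 800 ms in total with a 500 ms timeout, but every gap is 100 ms.
    System.out.println("slow but steady completed: " + transfer(8, 100, 500));
    // The sender stalls for 1500 ms, longer than the 500 ms timeout.
    System.out.println("stalled sender completed:  " + transfer(2, 1500, 500));
  }
}
```

Under these assumptions, raising the timeout only helps when the sender
pauses for a long time between writes. A throttled but steadily progressing
transfer does not need a larger value.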
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)