[
https://issues.apache.org/jira/browse/HADOOP-11697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354237#comment-14354237
]
Yongjun Zhang commented on HADOOP-11697:
----------------------------------------
Hi [~eddyxu],
Thanks for your work here. The patch looks good to me in terms of changing the
default value of {{fs.s3a.connection.timeout}}. I have two comments:
1. The original setting in com.amazonaws.ClientConfiguration has a default of
50 seconds:
{code}
DEFAULT_SOCKET_TIMEOUT = 50 * 1000
{code}
We are changing from 50 seconds to 1800 seconds, which is a big change. I bet
this number is the result of testing in a real cluster environment. Would you
please explain why it has to be this big in this context? It would probably be
worth adding that explanation as a comment in Constants.java.
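To make the size of the change concrete, here is a minimal sketch of the arithmetic. The class and constant names are hypothetical, and it assumes the new value is still expressed in milliseconds like the SDK default:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper class, not part of the patch.
public class SocketTimeoutComparison {
    // SDK default quoted above: 50 seconds, expressed in milliseconds.
    static final long SDK_DEFAULT_SOCKET_TIMEOUT_MS = 50 * 1000;

    // Value under discussion: 1800 seconds, converted to milliseconds
    // (assumption: the property keeps milliseconds as its unit).
    static final long PROPOSED_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(1800);

    // How many times larger the proposed value is than the SDK default.
    static long ratio() {
        return PROPOSED_TIMEOUT_MS / SDK_DEFAULT_SOCKET_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println("SDK default: " + SDK_DEFAULT_SOCKET_TIMEOUT_MS + " ms");
        System.out.println("Proposed:    " + PROPOSED_TIMEOUT_MS + " ms ("
            + TimeUnit.MILLISECONDS.toMinutes(PROPOSED_TIMEOUT_MS) + " minutes)");
        System.out.println("Ratio:       " + ratio() + "x");
    }
}
```

Spelling the value out with TimeUnit next to a comment would also document the unit in the code itself.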
2. I noticed that you corrected some comments from "seconds" to "milliseconds",
which is very good. I checked with [~cmccabe], and he pointed out that config
names for timeouts usually contain a segment indicating the time unit. For
example, if the unit is milliseconds, the config name would be "x.y.z.ms".
This can be done as future work by adding new config properties with the
expected names and keeping the old ones as supported but deprecated
properties.
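The rename-with-deprecation idea could look roughly like the sketch below. It uses plain java.util.Properties rather than Hadoop's own Configuration deprecation support, and the class name, key names, and default are placeholders taken from the "x.y.z.ms" example, not real S3A properties:

```java
import java.util.Properties;

// Hypothetical lookup helper illustrating "new key first, old key as fallback".
public class TimeoutLookup {
    static final String NEW_KEY = "x.y.z.ms"; // placeholder: name carries the unit
    static final String OLD_KEY = "x.y.z";    // placeholder: deprecated name, same unit
    static final long DEFAULT_MS = 50_000L;   // assumed default, in milliseconds

    // Returns the timeout in milliseconds, preferring the new key.
    static long timeoutMs(Properties conf) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            value = conf.getProperty(OLD_KEY);
            if (value != null) {
                // The old key still works, but the user is told to migrate.
                System.err.println(OLD_KEY + " is deprecated; use " + NEW_KEY);
            }
        }
        return value == null ? DEFAULT_MS : Long.parseLong(value.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(OLD_KEY, "1800000");
        System.out.println(timeoutMs(conf) + " ms");
    }
}
```

Existing configurations keep working through the old key, while new deployments can adopt the name that states the unit.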
Thanks.
> Use larger value for fs.s3a.connection.timeout.
> -----------------------------------------------
>
> Key: HADOOP-11697
> URL: https://issues.apache.org/jira/browse/HADOOP-11697
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 2.6.0
> Reporter: Lei (Eddy) Xu
> Assignee: Lei (Eddy) Xu
> Priority: Minor
> Labels: s3
> Attachments: HADOOP-11697.001.patch, HDFS-7908.000.patch
>
>
> The default value of {{fs.s3a.connection.timeout}} is {{50000}} milliseconds.
> It causes many {{SocketTimeoutException}}s when uploading large files using
> {{hadoop fs -put}}.
> Also, the units for {{fs.s3a.connection.timeout}} and
> {{fs.s3a.connection.establish.timeout}} are milliseconds. For s3
> connections, I think it is not necessary to have sub-second timeout values.
> Thus I suggest changing the time unit to seconds, to ease the sysadmin's job.