[ https://issues.apache.org/jira/browse/HADOOP-7397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Allen Wittenauer updated HADOOP-7397:
-------------------------------------
Labels: BB2015-05-TBR hadoop (was: hadoop)
> Allow configurable timeouts when connecting to HDFS via java FileSystem API
> ---------------------------------------------------------------------------
>
> Key: HADOOP-7397
> URL: https://issues.apache.org/jira/browse/HADOOP-7397
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Affects Versions: 0.23.0
> Reporter: Scott Fines
> Priority: Minor
> Labels: BB2015-05-TBR, hadoop
> Attachments: HADOOP-7397.patch, timeout.patch
>
>
> If the NameNode is not available (for example, during a network partition
> that separates the client from the NameNode) and an attempt is made to
> connect, the FileSystem API will *eventually* time out and throw an error.
> However, that timeout is currently hardcoded to 20 seconds per connection
> attempt, with 45 retries, for a total wait of 15 minutes before failure.
> In many circumstances this is fine, but there are also many circumstances
> (such as booting a service) where both the connection timeout and the
> number of retries should be significantly lower, so as not to harm the
> availability of other services.
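> Until a configurable connect timeout exists, a caller can bound the wait
> itself. The following is a minimal workaround sketch (not part of the
> attached patches): it runs FileSystem.get on a helper thread and gives up
> after a caller-supplied deadline. Note the blocked connect thread itself
> still runs through the full retry cycle in the background.
> {code:java}
> import java.net.URI;
> import java.util.concurrent.Callable;
> import java.util.concurrent.ExecutorService;
> import java.util.concurrent.Executors;
> import java.util.concurrent.Future;
> import java.util.concurrent.TimeUnit;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
>
> public class BoundedConnect {
>   // Returns a FileSystem or fails within deadlineMillis from the caller's
>   // point of view; the underlying connect attempt is abandoned, not killed.
>   public static FileSystem getWithDeadline(final URI uri,
>       final Configuration conf, long deadlineMillis) throws Exception {
>     ExecutorService pool = Executors.newSingleThreadExecutor();
>     try {
>       Future<FileSystem> future = pool.submit(new Callable<FileSystem>() {
>         public FileSystem call() throws Exception {
>           // May block for the full hardcoded 20s x 45 retry cycle.
>           return FileSystem.get(uri, conf);
>         }
>       });
>       // Throws TimeoutException if the NameNode is still unreachable.
>       return future.get(deadlineMillis, TimeUnit.MILLISECONDS);
>     } finally {
>       pool.shutdownNow();
>     }
>   }
> }
> {code}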
> Investigating Client.java, I see that there are two fields in Connection:
> maxRetries and rpcTimeout. I propose either re-using those fields when
> initiating the connection as well, or using the already existing
> dfs.socket.timeout parameter to set the connection timeout on
> initialization and adding a new parameter such as dfs.connection.retries,
> with a default of 45 to replicate the current behavior.
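> A configuration-driven version of the proposal might look like the sketch
> below. dfs.socket.timeout already exists; dfs.connection.retries is only
> the key name proposed above and does not exist yet, so its name and effect
> here are assumptions, not current Hadoop behavior.
> {code:java}
> import java.net.URI;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
>
> public class FastFailBoot {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new Configuration();
>     // Existing parameter: socket timeout in milliseconds, re-used here to
>     // bound each connection attempt during initialization.
>     conf.setInt("dfs.socket.timeout", 5000);
>     // Proposed parameter from this issue (hypothetical until a patch
>     // lands); a default of 45 would replicate the current behavior.
>     conf.setInt("dfs.connection.retries", 3);
>     // hdfs://namenode:8020/ is a placeholder URI for this example.
>     FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020/"), conf);
>     System.out.println("Connected to " + fs.getUri());
>   }
> }
> {code}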
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)