[jira] [Commented] (HDFS-15419) router retry with configurable time interval when cluster is unavailable

bhji123 (Jira) Wed, 17 Jun 2020 22:00:10 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17139085#comment-17139085
 ]


bhji123 commented on HDFS-15419:
--------------------------------

[https://github.com/apache/hadoop/pull/2082]

Here is the PR that fixes this problem.

> router retry with configurable time interval when cluster is unavailable
> ------------------------------------------------------------------------
>
>                 Key: HDFS-15419
>                 URL: https://issues.apache.org/jira/browse/HDFS-15419
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: configuration, hdfs-client, rbf
>            Reporter: bhji123
>            Priority: Major
>
> When the cluster is unavailable, router -> namenode communication retries
> only once, with no time interval between attempts, which is not reasonable.
> For example, my company runs several HDFS clusters with more than 1000
> nodes, and we have encountered this problem. In some cases, a cluster
> becomes unavailable briefly, for about 10 or 30 seconds; during that time,
> almost all RPC requests to the router fail because the router retries only
> once, with no time interval.
> It would be better to enhance the router retry strategy so that it retries
> with a configurable time interval and a configurable maximum number of
> retries.
>  
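
The retry strategy requested above (a fixed, configurable pause between attempts and a configurable cap on retries) could look roughly like the sketch below. It is a generic illustration in plain Java, not the code from the linked PR and not the router's actual implementation. The class name FixedIntervalRetry, the method callWithRetry, and the parameters maxRetries and intervalMs are all hypothetical names chosen for this example.

```java
import java.util.concurrent.Callable;

/**
 * Minimal sketch of "retry with a fixed interval and a maximum retry count".
 * All names here are hypothetical; this is not the HDFS router code.
 */
public class FixedIntervalRetry {

    /**
     * Runs op once and, if it throws, retries up to maxRetries more times,
     * sleeping intervalMs milliseconds before each retry.
     * Rethrows the last failure when all attempts are exhausted.
     */
    public static <T> T callWithRetry(Callable<T> op, int maxRetries, long intervalMs)
            throws Exception {
        if (maxRetries < 0 || intervalMs < 0) {
            throw new IllegalArgumentException("maxRetries and intervalMs must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                // Pause only if another attempt will follow.
                if (attempt < maxRetries) {
                    Thread.sleep(intervalMs);
                }
            }
        }
        throw last;
    }
}
```

With an interval of a few seconds and enough retries to span the outage, a brief period of unavailability like the 10 or 30 seconds described above would be absorbed by the retry loop instead of surfacing as a failed request. A production policy would likely retry only on specific exception types rather than on every Exception, as this sketch does.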



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

