[ https://issues.apache.org/jira/browse/HDFS-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuxuan Wang reassigned HDFS-15419:
----------------------------------

    Assignee: Yuxuan Wang

> Router should retry communicate with NN when cluster is unavailable using configurable time interval
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-15419
>                 URL: https://issues.apache.org/jira/browse/HDFS-15419
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: configuration, hdfs-client, rbf
>            Reporter: bhji123
>            Assignee: Yuxuan Wang
>            Priority: Major
>
> When the cluster is unavailable, router -> namenode communication retries
> only once, with no interval between attempts, which is not reasonable.
> For example, my company runs several HDFS clusters with more than 1000
> nodes, and we have encountered this problem. In some cases a cluster
> becomes unavailable briefly, for about 10 or 30 seconds, and during that
> window almost all RPC requests to the router fail because the router
> retries only once, with no time interval.
> It would be better to enhance the router's retry strategy so that it retries
> communication with the NN using a configurable time interval and a
> configurable maximum number of retries.
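
The requested behaviour (a bounded number of retries with a fixed pause between attempts) could be sketched roughly as follows. This is only an illustration in plain Java, not the router's actual code or the eventual patch: the class FixedIntervalRetry, the method call, and the parameters maxRetries and intervalMs are hypothetical names, and the sketch assumes that a NameNode outage surfaces as an IOException.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper: retry an operation with a fixed sleep between attempts. */
public class FixedIntervalRetry {

    /**
     * Runs op once, then retries up to maxRetries more times when it fails
     * with an IOException, sleeping intervalMs milliseconds before each retry.
     * Rethrows the last IOException once all attempts are exhausted.
     */
    public static <T> T call(Callable<T> op, int maxRetries, long intervalMs)
            throws Exception {
        if (maxRetries < 0 || intervalMs < 0) {
            throw new IllegalArgumentException("maxRetries and intervalMs must be >= 0");
        }
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                last = e;
                // Pause only if another attempt will follow.
                if (attempt < maxRetries) {
                    Thread.sleep(intervalMs);
                }
            }
        }
        throw last;
    }
}
```

With maxRetries = 3 and intervalMs = 10000, for instance, a request would be attempted over a window of roughly 30 seconds before failing, which is about the length of the brief outages described above. In a real deployment both values would presumably come from configuration rather than being hard-coded.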