[ https://issues.apache.org/jira/browse/HDFS-13119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361133#comment-16361133 ]
Íñigo Goiri commented on HDFS-13119:
------------------------------------

* I haven't been able to dig too deep, but it seems that killing the mini cluster interferes with the other unit tests.
* I think we could also test {{renewLease()}} to see if it succeeds when just one nameservice is down; in addition, we may want to add metrics for the number of retries and check them in the unit test.
* One thing I was considering was being able to try at least once if we don't have an ACTIVE NN; something like setting {{retryCount}} to the max value -1.
* I think {{DFS_ROUTER_CLIENT_THREADS_SIZE_DEFAULT}} in {{DFSConfigKeys}} fits in one line.

> RBF: Manage unavailable clusters
> --------------------------------
>
>                 Key: HDFS-13119
>                 URL: https://issues.apache.org/jira/browse/HDFS-13119
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Íñigo Goiri
>            Assignee: Yiqun Lin
>            Priority: Major
>         Attachments: HDFS-13119.001.patch
>
>
> When a federated cluster has one of its subclusters down, operations that run
> in every subcluster ({{RouterRpcClient#invokeAll()}}) may take all the RPC
> connections.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)