[ 
https://issues.apache.org/jira/browse/HDFS-8820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647162#comment-14647162
 ] 

Ming Ma commented on HDFS-8820:
-------------------------------

Thanks [~arpitagarwal]. Should we enable this for communication between DN and 
NN? It appears RetriableException is only handled by 
FailoverOnNetworkExceptionRetry, which the client uses in the NN HA scenario; 
the DN doesn't use that retry policy when it communicates with the NN. In our 
clusters, we configure a service port on the NN, so DN RPCs go to the service 
RPC server, and backoff isn't enabled on that server. We could have the DN use 
a retry policy that supports RetriableException, but that will require extra 
work.
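
To illustrate what a retry policy that honors RetriableException might look 
like on the caller side, here is a minimal sketch. It does not use Hadoop's 
retry framework: the class name, {{callWithRetry}}, and its parameters are 
hypothetical, and the nested RetriableException is a local stand-in for 
{{org.apache.hadoop.ipc.RetriableException}} so the snippet compiles on its own.

```java
import java.util.concurrent.Callable;

/** Hypothetical sketch; not the Hadoop RetryPolicy API. */
class RetriableCallSketch {

    /** Local stand-in for org.apache.hadoop.ipc.RetriableException. */
    static class RetriableException extends Exception {
        RetriableException(String msg) {
            super(msg);
        }
    }

    /**
     * Invokes the call and retries only when the server signals
     * RetriableException (for example, because it is backing off under load).
     * Sleeps with exponential backoff between attempts and rethrows once
     * maxRetries retries have been used. Other exceptions propagate at once.
     */
    static <T> T callWithRetry(Callable<T> call, int maxRetries,
                               long baseSleepMillis) throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return call.call();
            } catch (RetriableException e) {
                if (attempt >= maxRetries) {
                    throw e;
                }
                // Cap the shift so the sleep time cannot overflow.
                Thread.sleep(baseSleepMillis << Math.min(attempt, 10));
            }
        }
    }
}
```

A DN-side caller would wrap each NN RPC in something like 
{{callWithRetry(() -> rpc(), 3, 100L)}}; the retry count and base sleep are 
placeholder values, not recommendations.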

For the configuration part, I wonder if we should follow a pattern similar to 
RPC's {{setProtocolEngine}}, or to {{ipc.server.read.threadpool.size}}, where 
the NN or other services can call {{RPC.Builder#setnumReaders}} to override the 
value. That way, the NN doesn't need to know the format of the configuration 
key name. 
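
The builder-override pattern suggested above can be sketched as follows. This 
is a self-contained illustration rather than the real {{RPC.Builder}}: the 
{{Builder}} and {{Server}} classes, the {{setBackoffEnabled}} setter, and the 
per-port key format built inside {{build()}} are all assumptions made for the 
example. The point is that only the builder constructs the key name, and a 
caller such as the NN sets the value through a typed setter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; not the Hadoop RPC.Builder API. */
class RpcBuilderOverrideSketch {

    /** Minimal stand-in for an RPC server with its resolved setting. */
    static final class Server {
        final int port;
        final boolean backoffEnabled;

        Server(int port, boolean backoffEnabled) {
            this.port = port;
            this.backoffEnabled = backoffEnabled;
        }
    }

    static final class Builder {
        private final Map<String, String> conf;
        private int port;
        // null means "no override; fall back to the configuration".
        private Boolean backoffOverride;

        Builder(Map<String, String> conf) {
            this.conf = conf;
        }

        Builder setPort(int port) {
            this.port = port;
            return this;
        }

        /** Hypothetical setter, analogous in spirit to setnumReaders. */
        Builder setBackoffEnabled(boolean enabled) {
            this.backoffOverride = enabled;
            return this;
        }

        Server build() {
            // Assumed key format; only the builder needs to know it.
            String key = "ipc." + port + ".backoff.enable";
            boolean fromConf =
                Boolean.parseBoolean(conf.getOrDefault(key, "false"));
            boolean effective =
                (backoffOverride != null) ? backoffOverride : fromConf;
            return new Server(port, effective);
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // The caller overrides the value without touching any key name.
        Server server = new Builder(conf)
            .setPort(8020)
            .setBackoffEnabled(true)
            .build();
        System.out.println("backoff enabled: " + server.backoffEnabled);
    }
}
```

With this shape, an explicit setter call wins over the configuration, and the 
configuration value still applies when the service does not call the setter.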

> Enable RPC Congestion control by default
> ----------------------------------------
>
>                 Key: HDFS-8820
>                 URL: https://issues.apache.org/jira/browse/HDFS-8820
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Arpit Agarwal
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-8820.01.patch, HDFS-8820.02.patch
>
>
> We propose enabling RPC congestion control introduced by HADOOP-10597 by 
> default.
> We enabled it on a couple of large clusters a few weeks ago and it has helped 
> keep the namenodes responsive under load.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
