[ 
https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358142#comment-14358142
 ] 

Ming Ma commented on HADOOP-10597:
----------------------------------

bq. Did you use the same trigger on the server in production (RPC queue being 
full)?
Yes, that is what we use.

bq. What kind of back-off policy are you using on the client?
This patch doesn't add any new policy on the client side. It tries to use the 
policy passed from the server if one is specified. But given that we don't 
plan to support a server-side policy, the new patch doesn't need to change 
anything on the client side. The client will receive a RetriableException and 
retry accordingly.

Regarding the client-side retry policy we use, we don't configure anything 
specifically. We use the default, with nothing set for 
{{DFS_CLIENT_RETRY_POLICY_SPEC_KEY}}. Thus we end up with 
{{FailoverOnNetworkExceptionRetry}}, which uses exponential backoff. The actual 
parameters used in the backoff are based on 
{{DFS_CLIENT_FAILOVER_MAX_ATTEMPTS_KEY}}, 
{{DFS_CLIENT_FAILOVER_SLEEPTIME_BASE_KEY}} and 
{{DFS_CLIENT_FAILOVER_SLEEPTIME_MAX_KEY}}.
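
To make the role of those three parameters concrete, here is a minimal sketch 
of capped exponential backoff with jitter. It is not the Hadoop 
implementation: the class {{BackoffSketch}}, its method names, the jitter 
range and the numeric values are all assumptions chosen for illustration. The 
base sleep time, the maximum sleep time and the attempt limit stand in for the 
values that the three keys above would supply.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper for illustration only; not part of Hadoop. */
final class BackoffSketch {
    private BackoffSketch() {}

    /** Deterministic part: baseMillis * 2^attempt, capped at maxMillis. */
    static long cappedDelay(long baseMillis, long maxMillis, int attempt) {
        // Clamp the shift so that, for realistic base sleep times,
        // the product stays within the range of a long.
        int shift = Math.min(attempt, 30);
        return Math.min(baseMillis * (1L << shift), maxMillis);
    }

    /** Scales the capped delay by a random factor in [0.5, 1.5). */
    static long jitteredDelay(long baseMillis, long maxMillis, int attempt) {
        // Jitter keeps many clients from retrying in lockstep (an assumption
        // of this sketch, not a statement about the real policy).
        double factor = ThreadLocalRandom.current().nextDouble() + 0.5;
        return (long) (cappedDelay(baseMillis, maxMillis, attempt) * factor);
    }

    public static void main(String[] args) {
        // Illustrative values only; real values come from configuration.
        long baseMillis = 500;
        long maxMillis = 15_000;
        int maxAttempts = 6;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            System.out.println("attempt " + attempt + ": sleep up to about "
                + cappedDelay(baseMillis, maxMillis, attempt) + " ms before jitter");
        }
    }
}
```

With this shape, the sleep time doubles on each retry until it reaches the 
maximum, and the attempt limit bounds how long a client keeps retrying 
against an overloaded server.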




> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597-3.patch, 
> HADOOP-10597-4.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf, 
> RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently if an application hits NN too hard, RPC requests will be in a 
> blocking state, assuming OS connections don't run out. Alternatively, RPC or 
> NN can throw some well-defined exception back to the client based on certain 
> policies when it is under heavy load; the client will understand such an 
> exception and do exponential back off, as another implementation of 
> RetryInvocationHandler.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
