[
https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358142#comment-14358142
]
Ming Ma commented on HADOOP-10597:
----------------------------------
bq. Did you use the same trigger on the server in production (RPC queue being
full)?
Yes, that is what we use.
bq. What kind of back-off policy are you using on the client?
This patch doesn't add any new policy on the client side. It tries to use the
policy passed from the server if it is specified. But given we don't plan to
support server side policy, the new patch doesn't need to change anything on
the client side. The client side will receive RetriableException and retry
accordingly.
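
For readers following along, here is a minimal sketch of the trigger discussed above: a bounded call queue that rejects a call with a retriable error instead of blocking when it is full. It is not the patch itself. {{RetriableSketchException}} and {{BoundedCallQueueServer}} are hypothetical names, defined locally so the example compiles without Hadoop on the classpath.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for a retriable exception; not the Hadoop class. */
class RetriableSketchException extends Exception {
  RetriableSketchException(String message) {
    super(message);
  }
}

/** Hypothetical server front end with a bounded call queue. */
class BoundedCallQueueServer {
  private final BlockingQueue<String> callQueue;

  BoundedCallQueueServer(int capacity) {
    this.callQueue = new ArrayBlockingQueue<>(capacity);
  }

  /**
   * Tries to enqueue a call without blocking. When the queue is full,
   * the caller is told to retry later instead of being left waiting.
   */
  void enqueue(String call) throws RetriableSketchException {
    if (!callQueue.offer(call)) {
      throw new RetriableSketchException("call queue is full, retry later");
    }
  }

  /** Simulates a handler taking one call; returns null if the queue is empty. */
  String handleOne() {
    return callQueue.poll();
  }
}
```

A client that catches the exception can then sleep according to its own retry policy and resubmit the call.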
Regarding the client side retry policy we use, we don't configure anything
specifically; we use the default. Nothing is set for
{{DFS_CLIENT_RETRY_POLICY_SPEC_KEY}}. Thus we end up with
{{FailoverOnNetworkExceptionRetry}}, which uses exponential backoff. The actual
parameters used in the backoff are based on
{{DFS_CLIENT_FAILOVER_MAX_ATTEMPTS_KEY}},
{{DFS_CLIENT_FAILOVER_SLEEPTIME_BASE_KEY}} and
{{DFS_CLIENT_FAILOVER_SLEEPTIME_MAX_KEY}}.
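
To illustrate how a base sleep time, a maximum sleep time and a maximum attempt count typically combine in exponential backoff, here is a rough sketch. It is not copied from {{FailoverOnNetworkExceptionRetry}}. The class {{BackoffSketch}}, its method names, the jitter range and the numeric values are all assumptions chosen for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper showing capped exponential backoff with jitter. */
final class BackoffSketch {
  private BackoffSketch() {}

  /** Deterministic part: base * 2^attempt, capped at maxMillis. */
  static long cappedDelayMillis(long baseMillis, long maxMillis, int attempt) {
    // Bound the shift so the multiplication cannot overflow for small bases.
    int shift = Math.min(attempt, 30);
    return Math.min(baseMillis << shift, maxMillis);
  }

  /** Randomized part: scale the capped delay by a factor in [0.5, 1.5). */
  static long jitteredDelayMillis(long baseMillis, long maxMillis, int attempt) {
    double factor = 0.5 + ThreadLocalRandom.current().nextDouble();
    return (long) (cappedDelayMillis(baseMillis, maxMillis, attempt) * factor);
  }

  public static void main(String[] args) {
    // Illustrative values only; real deployments read these from configuration.
    long baseMillis = 500;
    long maxMillis = 15_000;
    int maxAttempts = 8;
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      System.out.println("attempt " + attempt + ": sleep about "
          + jitteredDelayMillis(baseMillis, maxMillis, attempt) + " ms");
    }
  }
}
```

The jitter spreads out retries from many clients so that they do not all return to the server at the same moment.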
> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
> Key: HADOOP-10597
> URL: https://issues.apache.org/jira/browse/HADOOP-10597
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ming Ma
> Assignee: Ming Ma
> Attachments: HADOOP-10597-2.patch, HADOOP-10597-3.patch,
> HADOOP-10597-4.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf,
> RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will be in a
> blocking state, assuming the OS doesn't run out of connections. Alternatively,
> RPC or the NN can throw some well-defined exception back to the client based
> on certain policies when it is under heavy load; the client will understand
> such an exception and do exponential back off, as another implementation of
> RetryInvocationHandler.