[
https://issues.apache.org/jira/browse/HDFS-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18036569#comment-18036569
]
ASF GitHub Bot commented on HDFS-16646:
---------------------------------------
github-actions[bot] commented on PR #4519:
URL: https://github.com/apache/hadoop/pull/4519#issuecomment-3507233808
We're closing this stale PR because it has been open for 100 days with no
activity. This isn't a judgement on the merit of the PR in any way. It's just a
way of keeping the PR queue manageable.
If you feel like this was a mistake, or you would like to continue working
on it, please feel free to re-open it and ask for a committer to remove the
stale tag and review again.
Thanks all for your contribution.
> RBF: Support an elastic RouterRpcFairnessPolicyController
> ---------------------------------------------------------
>
> Key: HDFS-16646
> URL: https://issues.apache.org/jira/browse/HDFS-16646
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: ZanderXu
> Assignee: ZanderXu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1.5h
> Remaining Estimate: 0h
>
> As we all know, `StaticRouterRpcFairnessPolicyController` is very helpful
> for RBF: it minimizes the impact of clients connecting to healthy vs.
> unhealthy NameNodes.
> But in a prod environment, the client traffic to each NS and the pressure
> on the downstream NameNodes change dynamically. So if we only have one static
> permit configuration, RBF cannot adapt to the changes in traffic to
> achieve optimal results.
> So here I propose an elastic RouterRpcFairnessPolicyController to help RBF
> adapt to traffic changes to achieve an optimal result.
> The overall idea is:
> * Each name service can be configured with exclusive permits, as in
> `StaticRouterRpcFairnessPolicyController`
> * TotalPermits is more than sum(NsExclusivePermit), and TotalPermits -
> sum(NsExclusivePermit) is marked as SharedPermits
> * Each name service can preempt SharedPermits after its own
> exclusive permits are used up.
> * But the maximum number of SharedPermits preempted by each nameservice
> should be limited, for example to 20% of SharedPermits.
> Suppose we have 200 handlers and 5 name services, and each name service is
> configured with a different number of exclusive permits, like:
> | NS1 | NS2 | NS3 | NS4 | NS5 | Concurrent NS |
> |-- | -- | -- | -- | -- | -- |
> | 9 | 11 | 8 | 12 | 10 | 50 |
> The `sum(NsExclusivePermit)` is 100, and the `SharedPermits =
> TotalPermits(200) - Sum(NsExclusivePermit)(100) = 100`
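The arithmetic above can be checked with a minimal Java sketch. The class name `PermitBudget`, its method names, and the map keys are hypothetical and chosen only for illustration; they are not taken from PR #4519 or from the Hadoop code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that reproduces the permit arithmetic from the example. */
class PermitBudget {

    /** Sum of the exclusive permits configured for every name service. */
    static int sumExclusive(Map<String, Integer> exclusive) {
        int sum = 0;
        for (int permits : exclusive.values()) {
            sum += permits;
        }
        return sum;
    }

    /** SharedPermits = TotalPermits - sum(NsExclusivePermit). */
    static int sharedPermits(int totalPermits, Map<String, Integer> exclusive) {
        int shared = totalPermits - sumExclusive(exclusive);
        if (shared < 0) {
            throw new IllegalArgumentException(
                "exclusive permits exceed total permits by " + (-shared));
        }
        return shared;
    }

    /** The exclusive permits from the table; the key names are illustrative. */
    static Map<String, Integer> exampleFromTable() {
        Map<String, Integer> exclusive = new LinkedHashMap<>();
        exclusive.put("ns1", 9);
        exclusive.put("ns2", 11);
        exclusive.put("ns3", 8);
        exclusive.put("ns4", 12);
        exclusive.put("ns5", 10);
        exclusive.put("concurrent", 50);
        return exclusive;
    }

    public static void main(String[] args) {
        Map<String, Integer> exclusive = exampleFromTable();
        System.out.println("sum(NsExclusivePermit) = " + sumExclusive(exclusive));
        System.out.println("SharedPermits = " + sharedPermits(200, exclusive));
    }
}
```

Note that the total of 100 exclusive permits only works out when the 50 permits in the "Concurrent NS" column are counted together with the five per-nameservice values.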
> Suppose we configure that each nameservice can preempt up to 20% of
> SharedPermits, marked as `elasticPercent`.
> Then, from the point of view of a single NS, the permits it may use are as
> follows:
> - Exclusive permits, which cannot be used by other name services.
> - Limited SharedPermits. Whether it can actually use that many shared permits
> depends on the remaining number of SharedPermits, because SharedPermits can
> be preempted by all nameservices.
> If we configure `elasticPercent=100`, one nameservice can use up all
> SharedPermits.
> If we configure `elasticPercent=0`, a nameservice can only use its
> exclusive permits.
> If we configure `elasticPercent=20`, RBF can tolerate 5 unhealthy name
> services at the same time.
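The acquisition order described above (exclusive permits first, then a capped slice of the shared pool) could be sketched roughly as follows. This is an illustrative sketch built on `java.util.concurrent.Semaphore`, not the implementation from PR #4519: the class `ElasticPermitSketch`, the `Grant` enum, and the per-nameservice quota semaphore are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of elastic permit acquisition; not the code from the PR. */
class ElasticPermitSketch {

    /** Which pool a permit came from, so it can be returned to the same pool. */
    enum Grant { EXCLUSIVE, SHARED, NONE }

    private final Map<String, Semaphore> exclusive = new HashMap<>();
    // Per-nameservice cap on how many shared permits it may hold at once.
    private final Map<String, Semaphore> elasticQuota = new HashMap<>();
    private final Semaphore shared;

    ElasticPermitSketch(int totalPermits, Map<String, Integer> nsExclusive,
                        int elasticPercent) {
        if (elasticPercent < 0 || elasticPercent > 100) {
            throw new IllegalArgumentException("elasticPercent must be in [0, 100]");
        }
        int sum = 0;
        for (int permits : nsExclusive.values()) {
            sum += permits;
        }
        int sharedCount = totalPermits - sum;
        if (sharedCount < 0) {
            throw new IllegalArgumentException("exclusive permits exceed total permits");
        }
        this.shared = new Semaphore(sharedCount);
        int cap = sharedCount * elasticPercent / 100; // integer division rounds down
        for (Map.Entry<String, Integer> e : nsExclusive.entrySet()) {
            exclusive.put(e.getKey(), new Semaphore(e.getValue()));
            elasticQuota.put(e.getKey(), new Semaphore(cap));
        }
    }

    /** Non-blocking: try the exclusive permits first, then the capped shared pool. */
    Grant acquire(String ns) {
        Semaphore own = exclusive.get(ns);
        if (own == null) {
            return Grant.NONE; // unknown nameservice
        }
        if (own.tryAcquire()) {
            return Grant.EXCLUSIVE;
        }
        Semaphore quota = elasticQuota.get(ns);
        if (quota.tryAcquire()) {
            if (shared.tryAcquire()) {
                return Grant.SHARED;
            }
            quota.release(); // pool already drained by other nameservices
        }
        return Grant.NONE;
    }

    /** Return a permit to the pool it was taken from. */
    void release(String ns, Grant grant) {
        if (grant == Grant.EXCLUSIVE) {
            exclusive.get(ns).release();
        } else if (grant == Grant.SHARED) {
            shared.release();
            elasticQuota.get(ns).release();
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> ns = new HashMap<>();
        ns.put("ns1", 9);
        ns.put("ns2", 11);
        // 40 total - 20 exclusive = 20 shared; 20% of 20 gives a cap of 4 per nameservice.
        ElasticPermitSketch controller = new ElasticPermitSketch(40, ns, 20);
        for (int i = 0; i < 15; i++) {
            System.out.println(i + ": " + controller.acquire("ns1"));
        }
    }
}
```

In this sketch, `elasticPercent=0` gives every nameservice a quota of zero, so only exclusive permits are ever granted, while `elasticPercent=100` lets a single nameservice drain the whole shared pool. With `elasticPercent=20`, five nameservices at their cap exhaust the pool, yet every nameservice still keeps its own exclusive permits.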
> In our prod environment, we use the following configuration, and it works well:
> - RBF has 3000 handlers
> - Each nameservice has 10 exclusive permits
> - `elasticPercent` is 30%
> Of course, we need to configure reasonable parameters according to the prod
> traffic.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)