ZanderXu opened a new pull request, #4519: URL: https://github.com/apache/hadoop/pull/4519
### Description of PR

As we all know, `StaticRouterRpcFairnessPolicyController` is very helpful for RBF to minimize the impact of clients connecting to healthy vs. unhealthy NameNodes. But in a prod environment, the client traffic to each NS and the pressure on the downstream NameNodes change dynamically. With only one static permit configuration, RBF cannot adapt to these traffic changes to achieve optimal results.

So here I propose an elastic RouterRpcFairnessPolicyController to help RBF adapt to traffic changes and achieve an optimal result.

The overall idea is:
- Each name service can be configured with exclusive permits, as in `StaticRouterRpcFairnessPolicyController`.
- TotalPermits is more than sum(NsExclusivePermit); mark `TotalPermits - sum(NsExclusivePermit)` as SharedPermits.
- Each name service can preempt SharedPermits after its own exclusive permits are used up.
- But the maximum number of SharedPermits preempted by each name service should be limited, for example to 20% of SharedPermits.

Suppose we have 200 handlers and 5 name services, and each name service is configured with different exclusive permits, like:

| NS1 | NS2 | NS3 | NS4 | NS5 | Concurrent NS |
|-- | -- | -- | -- | -- | -- |
| 9 | 11 | 8 | 12 | 10 | 50 |

The `sum(NsExclusivePermit)` is 100, and `SharedPermits = TotalPermits(200) - sum(NsExclusivePermit)(100) = 100`.

Suppose we configure that each name service can preempt up to 20% of SharedPermits, marked as `elasticPercent`. Then, from the point of view of a single NS, the permits it may use are as follows:
- Exclusive permits, which cannot be used by other name services.
- Limited SharedPermits. Whether it can actually use that many shared permits depends on the remaining number of SharedPermits, because the SharedPermits are preempted by all name services.

If we configure `elasticPercent=100`, one name service can use up all SharedPermits.
If we configure `elasticPercent=0`, a name service can only use its exclusive permits.
If we configure `elasticPercent=20`, RBF can tolerate 5 unhealthy name services at the same time.

In our prod environment, we configured it as follows, and it works well:
- RBF has 3000 handlers
- Each name service has 10 exclusive permits
- `elasticPercent` is 30%

Of course, we need to configure reasonable parameters according to the prod traffic.
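
To make the permit accounting above concrete, here is a minimal Java sketch of the idea. It is not the class added by this PR: `ElasticPermitSketch`, its methods, and the `"concurrent"` key are hypothetical names chosen for illustration, and it assumes `elasticPercent` is applied to SharedPermits and that acquisition is non-blocking (a real controller would likely wait with a timeout). It uses the 200-handler table from the description.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the elastic permit idea; not the class added by this PR.
class ElasticPermitSketch {
  /** Which pool a granted permit came from; NONE means the request was rejected. */
  enum Permit { EXCLUSIVE, SHARED, NONE }

  private final Map<String, Semaphore> exclusive = new HashMap<>();
  // Per-NS budget that caps how many shared permits one NS may hold at once.
  private final Map<String, Semaphore> elasticQuota = new HashMap<>();
  private final Semaphore shared;
  private final int maxElasticPerNs;

  ElasticPermitSketch(int totalPermits, Map<String, Integer> exclusivePermits,
      int elasticPercent) {
    int sum = 0;
    for (int p : exclusivePermits.values()) {
      sum += p;
    }
    if (sum > totalPermits || elasticPercent < 0 || elasticPercent > 100) {
      throw new IllegalArgumentException("invalid permit configuration");
    }
    // SharedPermits = TotalPermits - sum(NsExclusivePermit)
    int sharedPermits = totalPermits - sum;
    this.shared = new Semaphore(sharedPermits);
    // Assumption: the cap is a percentage of SharedPermits, rounded down.
    this.maxElasticPerNs = sharedPermits * elasticPercent / 100;
    for (Map.Entry<String, Integer> e : exclusivePermits.entrySet()) {
      exclusive.put(e.getKey(), new Semaphore(e.getValue()));
      elasticQuota.put(e.getKey(), new Semaphore(maxElasticPerNs));
    }
  }

  /** Try the NS's exclusive permits first, then its capped slice of the shared pool. */
  Permit acquire(String ns) {
    Semaphore own = exclusive.get(ns);
    if (own == null) {
      return Permit.NONE;
    }
    if (own.tryAcquire()) {
      return Permit.EXCLUSIVE;
    }
    Semaphore quota = elasticQuota.get(ns);
    if (quota.tryAcquire()) {
      if (shared.tryAcquire()) {
        return Permit.SHARED;
      }
      quota.release(); // shared pool is empty, give the quota slot back
    }
    return Permit.NONE;
  }

  /** Return a permit to the pool it was taken from. */
  void release(String ns, Permit permit) {
    if (permit == Permit.EXCLUSIVE) {
      exclusive.get(ns).release();
    } else if (permit == Permit.SHARED) {
      shared.release();
      elasticQuota.get(ns).release();
    }
  }

  int sharedAvailable() {
    return shared.availablePermits();
  }

  int maxElasticPerNs() {
    return maxElasticPerNs;
  }

  /** The example from the description: 200 handlers, elasticPercent = 20. */
  static ElasticPermitSketch tableExample() {
    Map<String, Integer> ex = new LinkedHashMap<>();
    ex.put("NS1", 9);
    ex.put("NS2", 11);
    ex.put("NS3", 8);
    ex.put("NS4", 12);
    ex.put("NS5", 10);
    ex.put("concurrent", 50); // illustrative key for the "Concurrent NS" column
    return new ElasticPermitSketch(200, ex, 20);
  }

  public static void main(String[] args) {
    ElasticPermitSketch c = tableExample();
    int exclusiveCount = 0;
    int sharedCount = 0;
    int rejected = 0;
    for (int i = 0; i < 30; i++) {
      switch (c.acquire("NS1")) {
        case EXCLUSIVE: exclusiveCount++; break;
        case SHARED: sharedCount++; break;
        default: rejected++;
      }
    }
    System.out.println("exclusive=" + exclusiveCount + " shared=" + sharedCount
        + " rejected=" + rejected + " sharedLeft=" + c.sharedAvailable());
  }
}
```

With these numbers, NS1 gets its 9 exclusive permits, then at most 20 of the 100 shared permits, and further requests are rejected until something is released. The other name services are unaffected: their exclusive permits and the remaining 80 shared permits stay available.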
