ZanderXu opened a new pull request, #4519:
URL: https://github.com/apache/hadoop/pull/4519

   ### Description of PR
   As we all know, `StaticRouterRpcFairnessPolicyController` is very helpful 
for RBF to minimize the impact of clients connecting to healthy vs. unhealthy 
NameNodes. 
   But in a prod environment, the client traffic to each NS and the 
pressure on the downstream NameNodes change dynamically. So if we only have 
one static permit configuration, RBF cannot adapt to changes in traffic to 
achieve optimal results. 
   
   So here I propose an elastic RouterRpcFairnessPolicyController to help RBF 
adapt to traffic changes to achieve an optimal result.
   
   The overall idea is:
   - Each name service can be configured with exclusive permits, as in 
`StaticRouterRpcFairnessPolicyController`
   - TotalPermits is greater than sum(NsExclusivePermit); the difference, 
TotalPermits - sum(NsExclusivePermit), is marked as SharedPermits
   - Each name service can preempt SharedPermits after its own 
exclusive permits are used up.
   - But the maximum number of SharedPermits preempted by each nameservice 
should be limited, for example to 20% of SharedPermits.
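
The four rules above can be sketched with plain semaphores. This is a minimal illustration only, not the controller class added by this PR: the class name `ElasticPermitSketch`, the `Source` enum, and the method names are hypothetical, and it assumes the per-nameservice limit is computed as a percentage of SharedPermits.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of the elastic permit idea; not the PR's actual class. */
final class ElasticPermitSketch {
  /** Pool a granted permit came from, so that release() returns it to the right place. */
  enum Source { EXCLUSIVE, SHARED, NONE }

  private final Map<String, Semaphore> exclusive = new HashMap<>();
  // Per-nameservice limit on how many shared permits it may hold at once.
  private final Map<String, Semaphore> elasticCap = new HashMap<>();
  private final Semaphore shared;

  ElasticPermitSketch(int totalPermits, Map<String, Integer> exclusivePermits,
      int elasticPercent) {
    int sum = 0;
    for (int p : exclusivePermits.values()) {
      sum += p;
    }
    if (sum > totalPermits) {
      throw new IllegalArgumentException("exclusive permits exceed total permits");
    }
    int sharedPermits = totalPermits - sum;
    this.shared = new Semaphore(sharedPermits);
    // Assumption: the cap is a percentage of SharedPermits, rounded down.
    int cap = sharedPermits * elasticPercent / 100;
    for (Map.Entry<String, Integer> e : exclusivePermits.entrySet()) {
      exclusive.put(e.getKey(), new Semaphore(e.getValue()));
      elasticCap.put(e.getKey(), new Semaphore(cap));
    }
  }

  /** Tries the nameservice's exclusive permits first, then the shared pool. */
  Source tryAcquire(String ns) {
    Semaphore own = exclusive.get(ns);
    if (own == null) {
      return Source.NONE; // unknown nameservice
    }
    if (own.tryAcquire()) {
      return Source.EXCLUSIVE;
    }
    Semaphore cap = elasticCap.get(ns);
    if (cap.tryAcquire()) {
      if (shared.tryAcquire()) {
        return Source.SHARED;
      }
      cap.release(); // shared pool is empty, give the cap slot back
    }
    return Source.NONE;
  }

  /** Returns a permit to the pool it was taken from. */
  void release(String ns, Source source) {
    if (source == Source.EXCLUSIVE) {
      exclusive.get(ns).release();
    } else if (source == Source.SHARED) {
      shared.release();
      elasticCap.get(ns).release();
    }
  }
}
```

The caller has to remember which `Source` it was granted, because an exclusive permit and a shared permit are returned to different pools.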
   
   Suppose we have 200 handlers and 5 name services, and each name service 
is configured with a different number of exclusive permits, like:
   | NS1 | NS2 | NS3 | NS4 | NS5 | Concurrent NS |
   |-- | -- | -- | -- | -- | -- |
   | 9 | 11 | 8 | 12 | 10 | 50 |
   
   The `sum(NsExclusivePermit)` is 100, and the `SharedPermits = 
TotalPermits(200) - Sum(NsExclusivePermit)(100) = 100`
   Suppose we configure that each nameservice can preempt up to 20% of 
SharedPermits, marked as `elasticPercent`.
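
The arithmetic for this example can be checked with a few lines of Java. This is a sketch under stated assumptions: the class `PermitMath` and its methods are hypothetical, the key `concurrent` merely stands in for the "Concurrent NS" column, and the limit is taken as a percentage of SharedPermits, rounded down.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that reproduces the numbers of the example table. */
final class PermitMath {
  /** SharedPermits = TotalPermits - sum(NsExclusivePermit). */
  static int sharedPermits(int totalPermits, Map<String, Integer> exclusive) {
    int sum = 0;
    for (int p : exclusive.values()) {
      sum += p;
    }
    return totalPermits - sum;
  }

  /** Most shared permits a single nameservice may hold (assumed rounding: down). */
  static int elasticCap(int sharedPermits, int elasticPercent) {
    return sharedPermits * elasticPercent / 100;
  }

  /** Exclusive permits from the example table; "concurrent" is a placeholder key. */
  static Map<String, Integer> exampleTable() {
    Map<String, Integer> exclusive = new LinkedHashMap<>();
    exclusive.put("NS1", 9);
    exclusive.put("NS2", 11);
    exclusive.put("NS3", 8);
    exclusive.put("NS4", 12);
    exclusive.put("NS5", 10);
    exclusive.put("concurrent", 50);
    return exclusive;
  }

  public static void main(String[] args) {
    Map<String, Integer> exclusive = exampleTable();
    int shared = sharedPermits(200, exclusive); // 200 - 100 = 100
    int cap = elasticCap(shared, 20);           // 20% of 100 = 20
    // In the best case NS1 can hold its 9 exclusive permits plus 20 shared ones.
    System.out.println("shared=" + shared + " cap=" + cap
        + " NS1 max=" + (exclusive.get("NS1") + cap));
  }
}
```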
   
   Then, from the point of view of a single NS, the permits it may use are 
as follows:
   - Exclusive permits, which cannot be used by other name services.
   - A limited number of SharedPermits. Whether it can actually use that many 
depends on the number of SharedPermits remaining, because the SharedPermits 
can be preempted by all nameservices.
   
   If we configure `elasticPercent=100`, one nameservice can use 
up all SharedPermits.
   If we configure `elasticPercent=0`, each nameservice can only use 
its exclusive permits.
   If we configure `elasticPercent=20`, RBF can tolerate 
5 unhealthy name services at the same time.
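
One way to read the three settings is to ask how many nameservices, each holding its full share of SharedPermits, it takes to drain the shared pool. The sketch below computes that number. The class and method names are hypothetical, and it assumes the limit is a percentage of SharedPermits, rounded down.

```java
/** Hypothetical helper relating elasticPercent to shared-pool exhaustion. */
final class SharedPoolDrain {
  /**
   * Smallest number of nameservices that, each holding its full elastic cap,
   * use up all shared permits. Returns -1 when the cap is zero, that is, when
   * no nameservice can take shared permits at all.
   */
  static int saturatedNameservicesToDrain(int sharedPermits, int elasticPercent) {
    int cap = sharedPermits * elasticPercent / 100;
    if (cap <= 0) {
      return -1;
    }
    return (sharedPermits + cap - 1) / cap; // ceiling division
  }

  public static void main(String[] args) {
    for (int percent : new int[] {0, 20, 30, 100}) {
      System.out.println("elasticPercent=" + percent + " -> "
          + saturatedNameservicesToDrain(100, percent));
    }
  }
}
```

With 100 shared permits, `elasticPercent=20` gives 5, which matches the figure quoted above, and `elasticPercent=100` gives 1, meaning a single nameservice can take the whole pool.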
   
   In our prod environment, we configured it as follows, and it works well:
   - RBF has 3000 handlers
   - Each nameservice has 10 exclusive permits
   - `elasticPercent` is 30%
   
   Of course, we need to configure reasonable parameters according to the prod 
traffic.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.