[jira] [Commented] (HADOOP-17356) RPC FairCallQueue for special users

Janus Chow (Jira) Fri, 06 Nov 2020 08:22:19 -0800


    [ 
https://issues.apache.org/jira/browse/HADOOP-17356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17227475#comment-17227475
 ]


Janus Chow commented on HADOOP-17356:
-------------------------------------

We tested the performance of a service account on the default FairCallQueue 
implementation. The results are good, so I think this ticket is no longer 
needed.
h2. Summary

The default DecayRpcScheduler (without a cost-based cost provider) works well. 
Requests from both normal users and the service account are handled well. I 
don't think we need to support service accounts in QoS for now.
h2. Test case

The test case simulates requests from 5 normal users (1_0, 1_1, 1_2, 1_3, 
1_4) and 1 service account (100_0). The request count ratio is 1:100, which 
means that when a normal user sends 1 request in a round, the service account 
sends 100 requests. This is similar to the "Presto" situation.
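
The traffic mix described above can be sketched as follows. This is only an illustration of the 1:100 ratio, not the actual test application; the class name, method name, and round count are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TrafficMixSketch {
    // Hypothetical helper: counts the requests each user would send after
    // the given number of rounds. Per round, each normal user sends 1
    // request and the service account sends 100.
    static Map<String, Long> simulate(int rounds) {
        Map<String, Long> sent = new LinkedHashMap<>();
        String[] normalUsers = {"1_0", "1_1", "1_2", "1_3", "1_4"};
        for (int r = 0; r < rounds; r++) {
            for (String user : normalUsers) {
                sent.merge(user, 1L, Long::sum);
            }
            sent.merge("100_0", 100L, Long::sum);
        }
        return sent;
    }

    public static void main(String[] args) {
        Map<String, Long> sent = simulate(1000);
        sent.forEach((user, count) -> System.out.println(user + " : " + count));
        double ratio = (double) sent.get("1_0") / sent.get("100_0");
        System.out.println("normal/service ratio = " + ratio);
    }
}
```

If the call queue treats both kinds of callers fairly, the handled request counts should stay close to this sent ratio.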

The test result is as follows:
||Username||SchedulingDecisionSummary (priority level)||CallVolumeSummary (requests)||
|1_0|0|12349|
|1_1|0|12055|
|1_2|0|12479|
|1_3|0|12461|
|1_4|0|12476|
|100_0|3|1113446|

||Priority level||ResponseTimeCountInLastWindow (requests)||AverageResponseTime (avg responseTime)||
|0|37059|0.06690177092340112|
|1|0|0.0|
|2|0|0.0|
|3|659618|0.07282078110960011|

 
h2. Some points:
 # The *SchedulingDecisionSummary* result shows that 5 normal users (1_0, 1_1, 
1_2, 1_3, 1_4) are assigned to level 0, and the service user (100_0) is 
assigned to level 3.
 # The *CallVolumeSummary* shows the number of requests handled. The request 
ratio is about 12000/1110000 ~= 0.01, which matches the initial 1:100 traffic 
setting for the normal users and the service user. This means the requests of 
both the normal users and the service user are handled equally, so we don't 
need to do anything special for service accounts.
 # The requests are random getBlockLocations calls on a directory with 100k files.
 # The test ran on the callQueue without a cost provider. The cost-based 
FairCallQueue case needs separate tests.
 # The test reached 130k QPS on the NN, which means QoS (without the 
cost-based scheduler) doesn't have much impact on the NN's performance.
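
The level assignment and the handled-request ratio in the points above can be checked with a short calculation. This is a minimal sketch, not the DecayRpcScheduler code: it assumes 4 priority levels with share-of-total-calls thresholds of 12.5%, 25%, and 50%, and it uses the raw totals from the CallVolumeSummary, whereas the real scheduler works on decayed counts. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PriorityLevelSketch {
    // Assumed thresholds for 4 levels, expressed as a share of total calls.
    static final double[] THRESHOLDS = {0.125, 0.25, 0.5};

    // Returns the highest level whose threshold the caller's share reaches,
    // or 0 if the share is below every threshold.
    static int priorityLevel(long userCalls, long totalCalls) {
        double share = (double) userCalls / totalCalls;
        for (int level = THRESHOLDS.length; level > 0; level--) {
            if (share >= THRESHOLDS[level - 1]) {
                return level;
            }
        }
        return 0;
    }

    // Request counts copied from the CallVolumeSummary in the test result.
    static Map<String, Long> callVolume() {
        Map<String, Long> calls = new LinkedHashMap<>();
        calls.put("1_0", 12349L);
        calls.put("1_1", 12055L);
        calls.put("1_2", 12479L);
        calls.put("1_3", 12461L);
        calls.put("1_4", 12476L);
        calls.put("100_0", 1113446L);
        return calls;
    }

    public static void main(String[] args) {
        Map<String, Long> calls = callVolume();
        long total = calls.values().stream().mapToLong(Long::longValue).sum();
        calls.forEach((user, n) -> System.out.printf(
            "%s : share=%.4f level=%d%n", user, (double) n / total, priorityLevel(n, total)));
        // Handled-request ratio of one normal user to the service account.
        System.out.printf("1_0 / 100_0 = %.4f%n",
            (double) calls.get("1_0") / calls.get("100_0"));
    }
}
```

Under these assumptions each normal user holds roughly 1% of the calls and lands in level 0, while the service account holds well over half of the calls and lands in level 3, which is consistent with the SchedulingDecisionSummary.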

h2. Other things

The test was run on a test cluster with a simple test application. I will 
attach the code and detailed test results for future work.
 Detailed results: 
[https://docs.google.com/spreadsheets/d/1v98ciJqiQ8eQRWoFPuaT7TQs4yH-QQVXHmOpODt2jm0/edit?usp=sharing]

> RPC FairCallQueue for special users
> -----------------------------------
>
>                 Key: HADOOP-17356
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17356
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Janus Chow
>            Priority: Major
>         Attachments: Implement 0.png, Implement 1.png, Implement 2.png, 
> Implement 3.png
>
>
> In HADOOP-15016, the idea was first raised to support special users by 
> assigning each special user an independent queue with a share. The design was 
> intended for the user to better control the RPC schedule, but there is also a 
> risk that users may add a lot of items in the config of special-users, 
> causing a lot of queues in the RPCScheduler.
> This ticket records some ideas to mitigate the risks while solving the 
> special-user problem based on HADOOP-15016.
> 0. The current implementation is as follows, all users will be treated 
> equally, _multiplexer_ will decide the call count in each queue.
> !Implement 0.png!
> 1. The first idea is to amplify the weight of super-users and reuse the 
> initial queues. This idea is easy to implement, but ordinary users and 
> special users would be affected by each other, and it would be difficult for 
> the _multiplexer_ to guarantee the traffic of super-users.
> !Implement 1.png!
> 2. The second idea is to set up one independent queue for all special users 
> with a config controlling the weight of all special-users. One concern for 
> this idea is that the scheduling among super-users' calls may not be fair.
> !Implement 2.png!
> 3. The third idea is to also use priority queues for special-users based on 
> idea 2, ensuring the fair handling of all super-users. Another benefit of 
> this idea is we can use the queues to implement cost-based calculation.
> !Implement 3.png!
> I think Idea 3 should be a good balance of complexity and usability.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
