Janus Chow created HADOOP-17421:
-----------------------------------

             Summary: Specify user's queue via configuration in FairCallQueue 
                 Key: HADOOP-17421
                 URL: https://issues.apache.org/jira/browse/HADOOP-17421
             Project: Hadoop Common
          Issue Type: Improvement
            Reporter: Janus Chow


FairCallQueue helps a lot in keeping service fair in a multi-tenant cluster. 
It does so by assigning users to queues of different priority. In production, 
however, we ran into cases where the automatic assignment does not fit:
 # We have a service account that sends more NN requests than other users. For 
various reasons we want to keep this account and let it sustain that volume of 
operations. Once we deployed FairCallQueue, the account was treated as a bad 
user and assigned to a lower-priority queue, which slowed the service account 
down.
 # We have a growing number of Flink jobs writing checkpoints to our NN. 
Checkpoint operations put a periodic high cost on the NN, at intervals of 
several minutes. FairCallQueue (with cost-based scheduling enabled) does not 
control this kind of operation well: when a burst starts, the user's cost in 
the decay window is still quite low, so the user is assigned to queue 0. A few 
windows later, by the time the high cost has been noticed and the user has been 
moved to a lower-priority queue, the user's operations have already finished.
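
The lag described in problem 2 can be reproduced with a small model. The sketch 
below is a simplification for illustration, not the actual scheduler code: the 
class and method names are hypothetical, the decay factor and thresholds are 
assumed values, and it assumes the queue for a window is fixed from the costs 
decayed at the previous window boundary.

```java
/**
 * Simplified model of cost-based decay scheduling, for illustration only.
 * Names, decay factor and thresholds are assumptions, not Hadoop's code.
 */
class DecayLagSketch {
    // Assumed: each window boundary multiplies accumulated cost by this factor.
    static final double DECAY_FACTOR = 0.5;
    // Assumed: share of total cost at which a user drops to queue 1, 2, 3.
    static final double[] THRESHOLDS = {0.125, 0.25, 0.5};

    /** Maps a user's share of the total decayed cost to a queue index. */
    static int queueForShare(double userCost, double totalCost) {
        if (totalCost <= 0) {
            return 0; // nothing recorded yet: highest-priority queue
        }
        double share = userCost / totalCost;
        for (int i = 0; i < THRESHOLDS.length; i++) {
            if (share < THRESHOLDS[i]) {
                return i;
            }
        }
        return THRESHOLDS.length; // lowest-priority queue
    }

    /**
     * Returns the queue the bursty user is served from in each window.
     * othersCostPerWindow is the combined, steady cost of all other users.
     */
    static int[] queuesSeenByBurstyUser(long[] burstyCostPerWindow,
                                        long othersCostPerWindow) {
        double bursty = 0;
        double others = 0;
        int[] queues = new int[burstyCostPerWindow.length];
        for (int w = 0; w < burstyCostPerWindow.length; w++) {
            // The queue for window w depends only on past (decayed) cost.
            queues[w] = queueForShare(bursty, bursty + others);
            // Add this window's cost, then decay at the window boundary.
            bursty = (bursty + burstyCostPerWindow[w]) * DECAY_FACTOR;
            others = (others + othersCostPerWindow) * DECAY_FACTOR;
        }
        return queues;
    }
}
```

With a burst of cost 1000 in window 1 only and a steady cost of 100 per window 
from everyone else, `queuesSeenByBurstyUser(new long[]{0, 1000, 0, 0, 0, 0}, 100)` 
yields `[0, 0, 3, 3, 3, 2]` in this model: the burst itself is served from 
queue 0, and the demotion only applies in later windows when the user is idle.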

For problem 1, we noticed that there is already an option mentioned in 
HADOOP-17165, but in our case the service account is not important enough to 
always be assigned to queue 0. 

To solve these problems, we propose specifying the queue for certain static 
users via configuration. The basic design is as follows:
 * Specify the static users in config for each queue.
 * Load the mapping from the config while initializing the callqueue.
 * Check the configured queue for each user when assigning the queue.
 * The cost of the static users would not be counted in the decay calculation, 
to limit their impact on the costs of other, normal users.
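
The design points above could look roughly like the following. This is a 
minimal sketch under stated assumptions, not the proposed patch: the class 
`StaticQueueResolver`, its methods, and the per-queue comma-separated user list 
format are all hypothetical, and the automatic queue is passed in as a plain 
integer rather than computed by the real scheduler.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Illustrative sketch only. All names and the config format are hypothetical
 * and do not correspond to existing Hadoop classes or configuration keys.
 */
class StaticQueueResolver {
    // user name -> statically configured queue index
    private final Map<String, Integer> userToQueue = new HashMap<>();
    // cost tracked for the decay calculation (non-static users only)
    private final Map<String, Long> trackedCost = new HashMap<>();

    /**
     * Loads the mapping once at initialization.
     * usersPerQueue maps a queue index to a comma-separated user list.
     */
    StaticQueueResolver(Map<Integer, String> usersPerQueue, int numQueues) {
        for (Map.Entry<Integer, String> entry : usersPerQueue.entrySet()) {
            int queue = entry.getKey();
            if (queue < 0 || queue >= numQueues) {
                throw new IllegalArgumentException("Invalid queue index: " + queue);
            }
            for (String user : entry.getValue().split(",")) {
                String trimmed = user.trim();
                if (!trimmed.isEmpty()) {
                    userToQueue.put(trimmed, queue);
                }
            }
        }
    }

    boolean isStaticUser(String user) {
        return userToQueue.containsKey(user);
    }

    /** Static users get their configured queue; others keep the automatic one. */
    int resolveQueue(String user, int automaticQueue) {
        Integer configured = userToQueue.get(user);
        return configured != null ? configured : automaticQueue;
    }

    /** Records cost for decay purposes, skipping static users entirely. */
    void addCost(String user, long cost) {
        if (isStaticUser(user)) {
            return; // keep static users out of the shared decay totals
        }
        trackedCost.merge(user, cost, Long::sum);
    }

    long getTrackedCost(String user) {
        return trackedCost.getOrDefault(user, 0L);
    }
}
```

Keeping the lookup in a plain map means the per-call check is a single hash 
lookup, and skipping static users in `addCost` keeps their volume from 
inflating the totals against which other users' shares are measured.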

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

