[
https://issues.apache.org/jira/browse/HADOOP-15016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei Yan updated HADOOP-15016:
-----------------------------
Attachment: Adding reservation support to NameNode RPC resource_v2.pdf
[~xyao] [~jnp] I have attached a new design doc that covers both reservation
support and cost-based features for FairCallQueue. Could you provide some
feedback?
> Add reservation support to RPC FairCallQueue
> --------------------------------------------
>
> Key: HADOOP-15016
> URL: https://issues.apache.org/jira/browse/HADOOP-15016
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Wei Yan
> Assignee: Wei Yan
> Attachments: Adding reservation support to NameNode RPC resource.pdf,
> Adding reservation support to NameNode RPC resource_v2.pdf,
> HADOOP-15016_poc.patch
>
>
> FairCallQueue was introduced to provide RPC resource fairness among different
> users. In the current implementation, each user is weighted equally, and the
> processing priority of an RPC call is based on how many requests
> that user has sent before. This works well when the cluster is shared among
> several end-users.
> However, this has some limitations when a cluster is shared among both
> end-users and service jobs, such as ETL jobs that run under a service
> account and need to issue lots of RPC calls. When the NameNode becomes quite
> busy, this set of jobs can easily be backed off and deprioritized. We cannot
> simply treat this type of job as a "bad" user who randomly issues too many
> calls, as their calls are normal calls. Also, it is unfair to weight an
> end-user and a heavy service user equally when allocating RPC resources.
> One idea here is to introduce reservation support for RPC resources. That is,
> for some services, we reserve some RPC resources for their calls. This idea
> is very similar to how YARN manages CPU/memory resources among different
> resource queues. In a little more detail: along with the existing
> FairCallQueue setup (like using 4 queues with different priorities), we would
> add some additional special queues, one for each special service user. For
> each special service user, we provide a guaranteed RPC share (like 10%, which
> can be aligned with its YARN resource share), and this percentage can be
> converted to a weight used in WeightedRoundRobinMultiplexer. A quick example:
> we have 4 default queues with default weights (8, 4, 2, 1), and two special
> service users (user1 with a 10% share, and user2 with a 15% share). So finally
> we'll have 6 queues: 4 default queues (with weights 8, 4, 2, 1) and 2 special
> queues (user1Queue weighted 15*10%/75%=2, and user2Queue weighted
> 15*15%/75%=3). Here 15 is the sum of the default weights and 75% is the share
> left for the default queues after the 25% reserved.
> Incoming RPC calls from special service users will be put
> directly into the corresponding reserved queue; other calls just follow
> the current implementation.
> By default, there is no special user and all RPC requests follow the existing
> FairCallQueue implementation.
> I would like to hear more comments on this approach, and whether there are
> any better solutions. I will post a detailed design once I get some early
> comments.
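
The share-to-weight conversion and the queue routing described in the issue
could be sketched in Java roughly as follows. This is only an illustration
under stated assumptions, not the attached PoC patch and not existing Hadoop
code: the class ReservedQueueSketch, its method names, the round-to-nearest
rule with a minimum weight of 1, and the convention of placing reserved queues
after the default ones are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; none of these names exist in Hadoop.
final class ReservedQueueSketch {

    /**
     * Converts each reserved share (in whole percent) into a multiplexer
     * weight: weight = sum(defaultWeights) * share / (100 - totalReserved).
     */
    static Map<String, Integer> reservedWeights(int[] defaultWeights,
                                                Map<String, Integer> sharePercent) {
        int defaultSum = 0;
        for (int w : defaultWeights) {
            defaultSum += w;
        }
        int reservedPercent = 0;
        for (int p : sharePercent.values()) {
            reservedPercent += p;
        }
        if (reservedPercent < 0 || reservedPercent >= 100) {
            throw new IllegalArgumentException(
                "total reserved share must be in [0, 100): " + reservedPercent);
        }
        int remainingPercent = 100 - reservedPercent;

        // LinkedHashMap keeps the configured user order, which also fixes
        // the order of the reserved queues.
        Map<String, Integer> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : sharePercent.entrySet()) {
            long w = Math.round((double) defaultSum * e.getValue() / remainingPercent);
            // Assumption: a reserved queue never gets weight 0.
            weights.put(e.getKey(), (int) Math.max(1L, w));
        }
        return weights;
    }

    /**
     * Picks the queue for a call. Reserved users go to their own queue
     * (indexed after the default queues); everyone else keeps the priority
     * level computed by the existing fair scheduling logic.
     */
    static int queueIndex(String user, int fairPriority, int numDefaultQueues,
                          Map<String, Integer> reservedWeights) {
        int offset = 0;
        for (String reservedUser : reservedWeights.keySet()) {
            if (reservedUser.equals(user)) {
                return numDefaultQueues + offset;
            }
            offset++;
        }
        return fairPriority;
    }
}
```

With default weights (8, 4, 2, 1) and shares user1=10, user2=15, reservedWeights
yields 2 and 3, matching the example in the description. The resulting total
weight is 20, so user1 gets 2/20 = 10% and user2 gets 3/20 = 15% of the
multiplexer turns.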