[
https://issues.apache.org/jira/browse/HADOOP-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606628#action_12606628
]
Vivek Ratan commented on HADOOP-3479:
-------------------------------------
Doug, when we (mostly Hemanth and I) thought about configuration for queues, we
felt we had a somewhat unique situation: the number of queues is dynamic, and
each queue has different values for a common set of attributes. We felt we had
two fundamental implementation choices:
* use the 'flatness' of the existing Configuration class and config format to
reflect a more hierarchical config structure that queues inherently need (by
'flatness', I mean that we don't have an obvious way today to express
hierarchies or a dynamic number of entries).
* extend the Configuration class and its supporting classes to handle
hierarchies and a dynamic number of properties.
We looked hard at the first option and considered some strategies, including
some of the ones you've mentioned. In fact, I sent out a mail to core-dev on
Tue, 6/3, suggesting an option of having a property that lists comma-separated
queue names or number of queues, and then building the property name string for
each queue attribute and getting its value from the config file. The pros are
obvious: we use the existing framework and do not duplicate code. But we felt
that the configuration framework needed to fundamentally handle hierarchies and
a dynamic number of properties. Arguably, it's only for queues today, but later,
if/when we support hierarchies of Orgs/queues/users and perhaps overwriting of
default values (lower-level entities in the hierarchies can override
higher-level defaults; for example, queues can specify some defaults for users
which some users can override), we will need such functionality.
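
To make the flat option concrete, here is a minimal sketch of the
"comma-separated queue names, then build each property name" idea. It uses
plain java.util.Properties rather than the Hadoop Configuration class so that
it is self-contained. The property names ("queues.names",
"queue.<name>.<attribute>"), the attribute "capacity", and the class name
FlatQueueConfig are all assumptions made for illustration; none of them come
from the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class FlatQueueConfig {
    // Hypothetical property names, chosen only for this sketch.
    static final String QUEUE_NAMES_KEY = "queues.names";
    static final String QUEUE_PREFIX = "queue.";

    private final Properties props;

    public FlatQueueConfig(Properties props) {
        this.props = props;
    }

    // Reads the comma-separated list of queue names, skipping blank entries.
    public List<String> queueNames() {
        List<String> names = new ArrayList<>();
        String raw = props.getProperty(QUEUE_NAMES_KEY, "");
        for (String part : raw.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    // Builds "queue.<name>.<attribute>" and looks it up, falling back to
    // the supplied default when the queue does not set the attribute.
    public String get(String queue, String attribute, String defaultValue) {
        return props.getProperty(QUEUE_PREFIX + queue + "." + attribute, defaultValue);
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("queues.names", "research, production");
        p.setProperty("queue.research.capacity", "30");
        p.setProperty("queue.production.capacity", "70");

        FlatQueueConfig conf = new FlatQueueConfig(p);
        for (String queue : conf.queueNames()) {
            System.out.println(queue + " capacity=" + conf.get(queue, "capacity", "0"));
        }
    }
}
```

The sketch shows both sides of the trade-off: nothing in the existing
framework has to change, but every attribute of every queue becomes its own
property, and the hierarchy exists only as a naming convention.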
Now, if we need this functionality, we felt we had two options:
* we could alter the Configuration class to support the new features, then
perhaps have a QueueConf subclass, just like we have one for JobConf.
* we could build this functionality in the QueueConf class separately to see
how well it works.
There is no doubt that the first approach is a better longer-term solution.
However, we didn't really want to change the Configuration class too much at
this stage. It's a core class, and given that this whole area of queues and
orgs is fairly new and we don't know how much it will change over time, we felt
that we should restrict our modifications to the QueueConf class for now.
This isolates the more stable Configuration class from too many changes. It's
also something we can do faster. Once we feel we have it right, we do want to
do the right thing long-term, which is to build support in the Configuration
class. Furthermore, people haven't responded very much to our proposal, and
given that configuration is usually one of those areas where there are lots of
strong views, we wanted to keep the code impact minimal, pending further
discussions. The flip side is duplication of code, which you have pointed out.
And there is always the danger that once things get into the code base, they're
often not modified according to their original intent, but that's something we
need to be disciplined about. But eventually, we do want to go with the first
option.
I'm personally not against using the current flat config structure to handle
queues, at least until we have more use cases for hierarchical configuration.
But I think that by limiting the new code to a separate class, and the new
configuration to a separate file, we isolate ourselves against too many changes
to code in the future, till we get our use cases right. And the configuration
format that Hemanth has proposed is more compact and easier to understand than
having a separate property for every attribute of every queue.
These are the reasons for our proposal. We'd love to hear more. Hemanth's been
soliciting comments for a while :)
> Implement configuration items useful for Hadoop resource manager (v1)
> ---------------------------------------------------------------------
>
> Key: HADOOP-3479
> URL: https://issues.apache.org/jira/browse/HADOOP-3479
> Project: Hadoop Core
> Issue Type: New Feature
> Reporter: Hemanth Yamijala
> Assignee: Hemanth Yamijala
> Attachments: 3479.1.patch, 3479.patch
>
>
> HADOOP-3421 lists requirements for a new resource manager for Hadoop.
> Implementation for these will require support for new configuration items in
> Hadoop. This JIRA is to define such configuration, and track its
> implementation.