[
https://issues.apache.org/jira/browse/HADOOP-3479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607651#action_12607651
]
Hemanth Yamijala commented on HADOOP-3479:
------------------------------------------
After some internal discussion, and considering Doug's suggestions in detail, I
am summarizing the choices we have, and some recommendations.
There are three things to consider: storage, format, and API.
For storage, there are the following options:
- Store all resource manager configuration in a separate file.
- Store the configuration in different files, one per queue, and maybe one for
the default. This could help us completely reuse the Configuration class.
However, the implementation may require multiple Configuration objects to be
created, and does not scale as well as the other option. Also, managing
multiple files could be a problem.
For format, there are the following options:
- Store in a hierarchical fashion as described above (and in the first patch
uploaded). This is an intuitive format, but requires special parsing and code
duplication with the Configuration class.
- Store in a flat format using the property naming convention suggested by
Doug. For example, it would look something like this:
{code:xml}
<property>
  <name>hadoop.rm.queues</name>
  <value>q1,q2</value>
</property>
<property>
  <name>hadoop.rm.max-capacity</name>
  <value>200</value>
</property>
<property>
  <name>q1.hadoop.rm.max-capacity</name>
  <value>100</value>
</property>
<property>
  <name>q2.hadoop.rm.max-capacity</name>
  <value>100</value>
</property>
{code}
This format has the advantage that the existing Configuration mechanism can be
reused to a large extent. However, it *may* be cumbersome for admins to manage.
For the API, there should be a separate class like ResourceManagerConf that
provides an API like below:
{code}
Set<String> getQueues();
int getMaxCapacity(String queueName);
void setMaxCapacity(String queueName, int capacity);
{code}
Having such an API makes it easy for the client, currently the JobTracker, to
access queue configuration.
We have two options for the visibility of this class:
- Make it package private. This has the advantage that it can be changed
freely until we have sufficient use cases for making it public, and/or the API
becomes stable.
- Make it a new public API. Personally, I don't see a big advantage in doing
this.
As a side note, I made the mistake of adding this class to the core conf
package in my earlier patch. This is not required; it should be in the package
along with the other resource manager classes.
In summary, I would prefer a design that looks as follows:
- Use a separate file to store the configuration
- Store properties using a special naming convention, so as to keep the
hierarchy flat and allow for reuse of the Configuration class.
- Define a new package-private class that exposes the properties using getters
and setters.
The implementation of this class could use the Configuration class underneath
to parse and store the values, and retrieve them using Configuration's API.
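To make the proposal concrete, here is a rough sketch of what such a class might look like. It is only an illustration under stated assumptions: java.util.Properties stands in for the Configuration class so that the example is self-contained, the key names are copied from the XML example above, and throwing an exception for a queue with no configured value is my assumption, not a decided behaviour.

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

// Hypothetical sketch. java.util.Properties is a stand-in for Hadoop's
// Configuration class; the real implementation would delegate to that instead.
class ResourceManagerConf {
    // Property names taken from the flat-format example in this comment.
    private static final String QUEUES_KEY = "hadoop.rm.queues";
    private static final String MAX_CAPACITY_KEY = "hadoop.rm.max-capacity";

    private final Properties props;

    ResourceManagerConf(Properties props) {
        this.props = props;
    }

    // Queue names are stored as a single comma-separated value.
    Set<String> getQueues() {
        Set<String> queues = new LinkedHashSet<>();
        String raw = props.getProperty(QUEUES_KEY, "");
        for (String name : raw.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                queues.add(trimmed);
            }
        }
        return queues;
    }

    // Per-queue keys are formed by prefixing the queue name to the base key.
    int getMaxCapacity(String queueName) {
        String value = props.getProperty(queueKey(queueName, MAX_CAPACITY_KEY));
        if (value == null) {
            // Assumption: a missing per-queue value is treated as an error.
            throw new IllegalArgumentException(
                "No max-capacity configured for queue " + queueName);
        }
        return Integer.parseInt(value.trim());
    }

    void setMaxCapacity(String queueName, int capacity) {
        props.setProperty(queueKey(queueName, MAX_CAPACITY_KEY),
                          Integer.toString(capacity));
    }

    private static String queueKey(String queueName, String baseKey) {
        return queueName + "." + baseKey;
    }
}
```

The naming convention is confined to the private queueKey helper, so the property layout could change later without affecting the JobTracker.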
Comments?
> Implement configuration items useful for Hadoop resource manager (v1)
> ---------------------------------------------------------------------
>
> Key: HADOOP-3479
> URL: https://issues.apache.org/jira/browse/HADOOP-3479
> Project: Hadoop Core
> Issue Type: New Feature
> Reporter: Hemanth Yamijala
> Assignee: Hemanth Yamijala
> Attachments: 3479.1.patch, 3479.patch
>
>
> HADOOP-3421 lists requirements for a new resource manager for Hadoop.
> Implementation for these will require support for new configuration items in
> Hadoop. This JIRA is to define such configuration, and track its
> implementation.