[ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520051 ]

Sameer Paranjpye commented on HADOOP-785:
-----------------------------------------

{quote}
Essentially we need 3 config files:
a) Read-only defaults (existing hadoop-defaults.xml).
b) A file where the admin specifies config values which can be overridden 
(existing mapred-defaults.xml).
c) A file where the admin specifies a set of hard, sane limits for some config 
values which cannot be overridden (existing hadoop-site.xml).
{quote}

I don't think we need 3 config files or a hierarchy of configs. The above 3 
categories of configuration need to exist, but can be expressed in many 
different ways. What if we had the following files:

- _hadoop-defaults.xml_, the read-only default config file
- _hadoop-client.xml_, specifies client behavior; resides on a client machine 
and is processed by clients
- _hadoop-server.xml_, specifies server behavior and is processed by servers

The one place where the client and server configs interact is when tasks are 
localized and clients are running in a server-controlled context. Here, some of 
the client's configuration can be overridden by values in the server's config. 
The variables to be overridden can be hard-coded. If this means we're 
overprotecting users, then the list of variables to override can itself be 
placed in the server config, say in the hadoop.client.overrides config variable.
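As a rough sketch of the mechanism described above, the server could walk the keys named in hadoop.client.overrides and force its own values into the localized client config. The class and method names below are illustrative only (plain maps standing in for Configuration objects), not actual Hadoop APIs:

```java
import java.util.*;

// Hypothetical sketch: the server forces its own values, for the keys
// listed in hadoop.client.overrides, onto a localized task's client
// config. Plain maps stand in for Configuration; names are illustrative.
public class OverrideSketch {

    // Returns the client's config with the server's values substituted
    // for every key named in the server's hadoop.client.overrides list.
    static Map<String, String> applyOverrides(Map<String, String> client,
                                              Map<String, String> server) {
        Map<String, String> merged = new HashMap<>(client);
        String list = server.getOrDefault("hadoop.client.overrides", "");
        for (String key : list.split(",")) {
            key = key.trim();
            if (!key.isEmpty() && server.containsKey(key)) {
                merged.put(key, server.get(key)); // server wins for listed keys
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> client = new HashMap<>();
        client.put("mapred.reduce.tasks", "1");   // client's (bad) setting
        client.put("io.sort.mb", "100");

        Map<String, String> server = new HashMap<>();
        server.put("hadoop.client.overrides", "mapred.reduce.tasks");
        server.put("mapred.reduce.tasks", "10");

        // mapred.reduce.tasks is overridden; io.sort.mb is untouched
        System.out.println(applyOverrides(client, server));
    }
}
```

Keys not named in the override list keep their client-supplied values, so the client stays free to configure everything outside the admin's hard-limit set.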

The treatment of the 3 categories of config values would be as follows:

- Read-only defaults - _hadoop-defaults.xml_
- Admin-specified config values which can be overridden - This set of values no 
longer exists; everything can be overridden by clients, with a few exceptions. 
All client configuration appears in _hadoop-client.xml_
- Admin-specified set of hard, sane limits for some config values which cannot 
be overridden - These are the exceptions listed in _hadoop-server.xml_, 
named by the config value _hadoop.client.overrides_
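For concreteness, the server-side piece might look like the following fragment of _hadoop-server.xml_, using Hadoop's standard configuration XML format (the hadoop.client.overrides property name is the one proposed in this comment, not an existing Hadoop property):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Hypothetical: keys whose client-supplied values the server rejects -->
  <property>
    <name>hadoop.client.overrides</name>
    <value>mapred.reduce.tasks,dfs.replication</value>
  </property>
  <!-- Server-enforced values for the keys listed above -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```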

> Divide the server and client configurations
> -------------------------------------------
>
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure and I think we need to 
> strongly divide the server from client configs. 
> An example of the problem was a configuration where the task tracker had a 
> hadoop-site.xml that set mapred.reduce.tasks to 1. Therefore, the job tracker 
> had the right number of reduces, but the map task thought there was a single 
> reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which 
> overrides both client and server configs. Furthermore, the properties from 
> the *-default.xml files should never be saved into the job.xml.
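The class hierarchy proposed in the quoted description could be sketched as follows, assuming later-loaded resources override earlier ones, so each subclass appends its own file after the list inherited from `$super` (a toy model with resource-name lists, not real Hadoop classes):

```java
import java.util.*;

// Hypothetical sketch of the proposed hierarchy: each constructor appends
// its own config file after the inherited ones, so more specific files
// override defaults. Not actual Hadoop code.
class Configuration {
    final List<String> resources = new ArrayList<>(
        List.of("site-default.xml", "hadoop-default.xml"));
}
class ServerConf extends Configuration {
    { resources.add("hadoop-server.xml"); }
}
class DfsServerConf extends ServerConf {
    { resources.add("dfs-server.xml"); }
}
class MapRedServerConf extends ServerConf {
    { resources.add("mapred-server.xml"); }
}
class ClientConf extends Configuration {
    { resources.add("hadoop-client.xml"); }
}
class JobConf extends ClientConf {
    { resources.add("job.xml"); }
}

public class ConfHierarchySketch {
    public static void main(String[] args) {
        // Each config type sees exactly its own chain of files;
        // note that no class ever reads hadoop-site.xml.
        System.out.println(new JobConf().resources);
        System.out.println(new DfsServerConf().resources);
    }
}
```

The key property is the one the description calls out: there is no shared hadoop-site.xml that both chains read, so a server-side setting can never silently leak into a client config or vice versa.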

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
