[jira] Commented: (HADOOP-785) Divide the server and client configurations

Doug Cutting (JIRA) Wed, 15 Aug 2007 15:17:51 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520113
 ]


Doug Cutting commented on HADOOP-785:
-------------------------------------

It sounds like we're mostly in agreement.  We agree that there should be a file 
with values that jobs can override, and that's where most things should go.  
There also needs to be a way for mapreduce daemons to list parameters that may 
not be overridden by jobs.  Where we differ is what the files should be named 
and how the non-overrideable parameters should be named.  These seem like 
mostly cosmetic differences that should be easily resolved by reasonable folks.

I'd prefer it if:

 - The override mechanism is not specific to mapreduce, since other daemons may 
wish to use it in the future.  We should also avoid the terms 'client' and 
'server', since these are relative, not universal.

 - The override specification format is merge-friendly, since, e.g., both 
mapreduce and hdfs may have values that jobs should not override, and changes 
to the file should be easy to see with, e.g., 'diff'.  In other words, 
different parameters and values should be on different lines.

Finally, it makes sense to me that the value which cannot be overridden could 
be set in the same place where it is declared to be not overrideable.  So, 
unless we want to invent a new file format, this sounds a lot like a special 
config file.
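
To make the idea concrete, here is a minimal sketch of a lookup in which a daemon's special config file both sets a value and declares it non-overrideable, while everything else remains open to jobs. It is illustrative only: the class LayeredConf, its methods, and the property example.daemon.port are hypothetical names, not Hadoop's API; mapred.reduce.tasks is the parameter from the issue description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Hadoop's Configuration class.
class LayeredConf {
    // Values that jobs are allowed to override.
    private final Map<String, String> defaults = new HashMap<>();
    // Values set in the special file; declaring them here also locks them.
    private final Map<String, String> locked = new HashMap<>();
    // Values supplied by a job.
    private final Map<String, String> jobValues = new HashMap<>();

    void setDefault(String key, String value) {
        defaults.put(key, value);
    }

    void lock(String key, String value) {
        locked.put(key, value);
    }

    // Returns false (and ignores the value) if the key is locked.
    boolean setFromJob(String key, String value) {
        if (locked.containsKey(key)) {
            return false;
        }
        jobValues.put(key, value);
        return true;
    }

    // Locked values win, then job values, then defaults.
    String get(String key) {
        if (locked.containsKey(key)) {
            return locked.get(key);
        }
        if (jobValues.containsKey(key)) {
            return jobValues.get(key);
        }
        return defaults.get(key);
    }

    public static void main(String[] args) {
        LayeredConf conf = new LayeredConf();
        conf.setDefault("mapred.reduce.tasks", "1");
        conf.lock("example.daemon.port", "9000"); // hypothetical property
        conf.setFromJob("mapred.reduce.tasks", "8");
        conf.setFromJob("example.daemon.port", "1234");
        System.out.println(conf.get("mapred.reduce.tasks"));
        System.out.println(conf.get("example.daemon.port"));
    }
}
```

Because the locked value lives in the same map that marks the key as locked, there is no separate list of names to keep in sync with the values, which is the point made above about setting and declaring in one place.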

> Divide the server and client configurations
> -------------------------------------------
>
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure and I think we need to 
> strongly divide the server from client configs. 
> An example of the problem was a configuration where the task tracker has a 
> hadoop-site.xml that set mapred.reduce.tasks to 1. Therefore, the job tracker 
> had the right number of reduces, but the map task thought there was a single 
> reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which 
> overrides both client and server configs. Furthermore, the properties from 
> the *-default.xml files should never be saved into the job.xml.
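
The hierarchy quoted above can be sketched in plain Java. This is a rough illustration under stated assumptions, not the eventual implementation: it assumes "$super" means a class's own file is consulted before the files inherited from its superclass, it only records file names rather than parsing XML, and the resources() and addFirst() methods are hypothetical helpers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the proposed class hierarchy; file names come from the proposal.
class Configuration {
    private final List<String> resources = new ArrayList<>();

    Configuration() {
        resources.add("site-default.xml");
        resources.add("hadoop-default.xml");
    }

    // Assumption: a subclass's own file takes precedence over inherited files.
    protected final void addFirst(String name) {
        resources.add(0, name);
    }

    // Hypothetical accessor: file names in lookup order, highest precedence first.
    List<String> resources() {
        return Collections.unmodifiableList(resources);
    }
}

class ServerConf extends Configuration {
    ServerConf() { addFirst("hadoop-server.xml"); }
}

class DfsServerConf extends ServerConf {
    DfsServerConf() { addFirst("dfs-server.xml"); }
}

class MapRedServerConf extends ServerConf {
    MapRedServerConf() { addFirst("mapred-server.xml"); }
}

class ClientConf extends Configuration {
    ClientConf() { addFirst("hadoop-client.xml"); }
}

class JobConf extends ClientConf {
    JobConf() { addFirst("job.xml"); }
}

class ConfHierarchyDemo {
    public static void main(String[] args) {
        System.out.println(new MapRedServerConf().resources());
        System.out.println(new JobConf().resources());
    }
}
```

In this arrangement no class reads hadoop-site.xml, and a method that takes a ServerConf cannot be handed a JobConf, so the compiler enforces the server/client split.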

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
