[jira] Commented: (HADOOP-785) Divide the server and client configurations

Sameer Paranjpye (JIRA) Wed, 15 Aug 2007 13:43:54 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520088 ]


Sameer Paranjpye commented on HADOOP-785:
-----------------------------------------

{quote}
It does make sense to have overrideable values on the server too, e.g., to 
determine the default block size for client programs which don't override it. 
Under Arun's proposal this would be in hadoop-initial.xml on the servers. Where 
would it be in your proposal? As items in hadoop-server.xml that are not named 
in hadoop.client.override? Is this really less confusing?
{quote}

The default block size for client programs would be in _hadoop-client.xml_; 
settings in this file would override those in _hadoop-defaults.xml_. 
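
As a rough illustration of that override order, here is a minimal sketch in plain Java. It does not use the actual Configuration class: the class name LayeredConf, the merge helper and the property values are all hypothetical, and each resource is modelled as a simple map standing in for a parsed XML file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: resources are modelled as maps, not parsed XML files.
public class LayeredConf {

    // Merge resources in load order; a later resource overrides an earlier one.
    static Map<String, String> merge(List<Map<String, String>> resources) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> resource : resources) {
            merged.putAll(resource);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Stand-in for hadoop-defaults.xml (values are illustrative only).
        Map<String, String> defaults = Map.of(
                "dfs.block.size", "67108864",
                "dfs.replication", "3");
        // Stand-in for hadoop-client.xml, which overrides the block size only.
        Map<String, String> client = Map.of("dfs.block.size", "134217728");

        Map<String, String> conf = merge(List.of(defaults, client));
        System.out.println("dfs.block.size = " + conf.get("dfs.block.size"));
        System.out.println("dfs.replication = " + conf.get("dfs.replication"));
    }
}
```

A client that does not set the block size would simply see the value from the defaults resource.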

{quote}
Another issue with your proposal is that it requires different Configuration 
construction code on clients and servers. Do we always know, everywhere that a 
Configuration is created, whether we are running as a client or a server?
{quote}

I proposed the client-server nomenclature because I feel it makes the system 
more comprehensible. Admittedly, the distinction between clients and servers 
isn't always clear, but the proposed filenames are intended to map elements of 
configuration to system components and the people who configure them. The file 
_hadoop-client.xml_ is supplied by "users" (people who run map/reduce jobs) 
and is read by "clients", i.e. jobs, tasks and the shell. The file 
_hadoop-server.xml_ is supplied by "admins" (people who keep Hadoop clusters 
up and running) and is read by servers. Depending on the context, either 
_hadoop-client.xml_ or _hadoop-server.xml_ would be the "final resource" read 
by a Configuration object. There is no technical reason for these files to be 
named differently; indeed, currently they are not: _hadoop-site.xml_ is the 
final resource read by both clients and servers. We could even have 3 files, 
_hadoop-client.xml_, _hadoop-mapred.xml_ and _hadoop-dfs.xml_, read by clients, 
map/reduce servers and HDFS servers respectively.

This would require some differences in Configuration construction code, but 
these don't appear to be too convoluted. The name of the final resource 
consumed could be set by clients and servers upon start-up and then used by 
all Configuration objects constructed by the servers. The final resource could 
also be overridden by values supplied on the command line.
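
One way to picture setting the final resource name at start-up is the sketch below. It is only an illustration under stated assumptions: FinalResourceSketch, setFinalResource and resourcesToLoad are hypothetical names, not part of the existing Configuration class, and the sketch only computes which resource names would be read rather than loading any files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a process-wide "final resource" name.
public class FinalResourceSketch {

    // Defaults to the file that is the final resource today.
    private static volatile String finalResource = "hadoop-site.xml";

    // Called once at start-up by a client or a server.
    static void setFinalResource(String name) {
        finalResource = name;
    }

    // Resource names a newly constructed configuration object would read,
    // in load order; the last one has the final say.
    static List<String> resourcesToLoad() {
        List<String> resources = new ArrayList<>();
        resources.add("hadoop-defaults.xml");
        resources.add(finalResource);
        return resources;
    }

    public static void main(String[] args) {
        // A server process would pick its own final resource at start-up.
        setFinalResource("hadoop-server.xml");
        System.out.println(resourcesToLoad());
    }
}
```

With this shape, the code that constructs a configuration object does not need to know whether it runs in a client or a server; only the start-up code does.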


> Divide the server and client configurations
> -------------------------------------------
>
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure and I think we need to 
> strongly divide the server from client configs. 
> An example of the problem was a configuration where the task tracker has a 
> hadoop-site.xml that set mapred.reduce.tasks to 1. Therefore, the job tracker 
> had the right number of reduces, but the map task thought there was a single 
> reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
> // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
> // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
> // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
> // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
> // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
> // reads job.xml, $super
> Note in particular, that nothing corresponds to hadoop-site.xml, which 
> overrides both client and server configs. Furthermore, the properties from 
> the *-default.xml files should never be saved into the job.xml.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
