[ https://issues.apache.org/jira/browse/HADOOP-785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520092 ]
Doug Cutting commented on HADOOP-785:
-------------------------------------

bq. The default block size for client programs would be in hadoop-client.xml [...]

Where would the default block size for server programs be set? In hadoop-server.xml? It sounds like you want to break what Arun's calling hadoop-initial.xml into two files: a client and a server version, and replace hadoop-final.xml with a parameter that names those values which may not be overridden, but that parameter is only used on "servers"? Is that a fair comparison?

My belief is that the primary reason we've seen misconfiguration is that folks don't understand that hadoop-site.xml is not overridable on servers by jobs. We've encouraged folks to put most things in that file (hadoop-site.xml), when in fact it should only be used for very limited purposes, mostly for host-specific paths. This has caused many serious problems. But we shouldn't overreact. We should fix this issue by making it clearer where most things belong, and which particular things should not be overridable.

The root of the problem might be:

http://lucene.apache.org/hadoop/api/overview-summary.html#overview_description

This is where we first encouraged all users of Hadoop to edit the wrong file.

I don't think that, long-term, client and server are fundamental distinctions in Hadoop: we run clients on servers and will probably do the converse someday, so I am hesitant to hardwire them in as fundamental concepts in the configuration system. I think the notion of host-specific settings which cannot be overridden is a universal concept, and I would rather focus on making that distinction clear to users.

> Divide the server and client configurations
> -------------------------------------------
>
>                 Key: HADOOP-785
>                 URL: https://issues.apache.org/jira/browse/HADOOP-785
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 0.9.0
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>             Fix For: 0.15.0
>
>
> The configuration system is easy to misconfigure, and I think we need to strongly divide the server configs from the client configs.
> An example of the problem was a configuration where the task tracker had a hadoop-site.xml that set mapred.reduce.tasks to 1. Therefore, the job tracker had the right number of reduces, but the map task thought there was a single reduce. This led to a hard-to-diagnose failure.
> Therefore, I propose separating out the configuration types as:
> class Configuration;
>   // reads site-default.xml, hadoop-default.xml
> class ServerConf extends Configuration;
>   // reads hadoop-server.xml, $super
> class DfsServerConf extends ServerConf;
>   // reads dfs-server.xml, $super
> class MapRedServerConf extends ServerConf;
>   // reads mapred-server.xml, $super
> class ClientConf extends Configuration;
>   // reads hadoop-client.xml, $super
> class JobConf extends ClientConf;
>   // reads job.xml, $super
> Note, in particular, that nothing corresponds to hadoop-site.xml, which overrides both client and server configs. Furthermore, the properties from the *-default.xml files should never be saved into the job.xml.
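To make the "values which may not be overridden" idea concrete, here is a minimal sketch of a layered configuration in which each resource can mark some of its keys as final, so that later resources (e.g. a submitted job.xml) cannot override them. The LayeredConf class, its methods, and the sample property values are hypothetical and purely illustrative; this is not Hadoop's actual Configuration API.

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch only -- not Hadoop's Configuration class.
public class LayeredConf {
  private final Map<String, String> props = new HashMap<>();
  private final Set<String> finalKeys = new HashSet<>();

  // Apply one resource layer. Keys already marked final by an earlier layer
  // are left untouched; keys listed in finalInThisLayer become final from now on.
  public void addResource(Map<String, String> resource, Set<String> finalInThisLayer) {
    for (Map.Entry<String, String> e : resource.entrySet()) {
      if (!finalKeys.contains(e.getKey())) {
        props.put(e.getKey(), e.getValue());
      }
    }
    finalKeys.addAll(finalInThisLayer);
  }

  public String get(String key, String defaultValue) {
    return props.getOrDefault(key, defaultValue);
  }

  public static void main(String[] args) {
    LayeredConf conf = new LayeredConf();

    // Shipped defaults (hadoop-default.xml equivalent): nothing is final.
    conf.addResource(Map.of("dfs.block.size", "67108864",
                            "mapred.reduce.tasks", "1"),
                     Set.of());

    // Host-specific site file on a tasktracker: the local path is marked final,
    // so no later layer may redirect it.
    conf.addResource(Map.of("mapred.local.dir", "/grid/0/mapred/local"),
                     Set.of("mapred.local.dir"));

    // Submitted job.xml: may tune the reduce count, but its attempt to change
    // the final key is ignored.
    conf.addResource(Map.of("mapred.reduce.tasks", "20",
                            "mapred.local.dir", "/tmp/other"),
                     Set.of());

    System.out.println(conf.get("mapred.reduce.tasks", "?")); // prints 20
    System.out.println(conf.get("mapred.local.dir", "?"));    // prints /grid/0/mapred/local
  }
}
{code}

Under such a scheme, a job can still tune per-job knobs like mapred.reduce.tasks, while host-specific settings fixed in the server's site file stay in force no matter what the submitted job.xml contains, which is the distinction the discussion above is trying to make clear to users.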