On Jun 20, 2006, at 5:10 PM, Paul Sutter wrote:
Thanks very much for the explanation, and to confirm I will repeat it:
The first occurrence of a parameter is used, and the search order is:
hadoop-site.xml, then
job.xml, then
mapred-default.xml, then
hadoop-default.xml
That's great, and it explains behavior that had been confusing before.
Exactly correct. One other thing that can cause confusion is that all
of the files are found via the Java class path, and they are present
both in the conf directory in the distribution and in the hadoop-*.jar
file.
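The search order confirmed above can be sketched roughly as follows. This is only an illustration in Python, not Hadoop's own configuration code: the names `SEARCH_ORDER`, `parse_properties` and `lookup` are made up for the example, and the files are passed in as XML text rather than located on the class path.

```python
import xml.etree.ElementTree as ET

# Search order as described in this thread; earlier files win.
SEARCH_ORDER = [
    "hadoop-site.xml",
    "job.xml",
    "mapred-default.xml",
    "hadoop-default.xml",
]


def parse_properties(xml_text):
    """Return {name: value} for the <property> entries of one config file."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        # Keep the first occurrence if a name is repeated within a file.
        if name is not None and name not in props:
            props[name] = prop.findtext("value")
    return props


def lookup(name, files):
    """Find a parameter in `files`, a dict of file name -> XML text.

    Returns (value, file name) for the first file in SEARCH_ORDER that
    defines the parameter, or (None, None) if no file defines it.
    """
    for filename in SEARCH_ORDER:
        text = files.get(filename)
        if text is None:
            continue  # this file is absent; try the next one
        props = parse_properties(text)
        if name in props:
            return props[name], filename
    return None, None
```

Because the result includes the file name, a sketch like this can also help answer "which file did this value come from?" when the same parameter appears in more than one place.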
Because of this, I recommend never keeping a copy of
hadoop-default.xml in your config directory. It is the one
configuration file that you always want updated automatically when you
update your distribution.
For the record, I like setting up my hadoop directories like:
$hadoop_prefix/hadoop-0.4-dev # distribution directory
$hadoop_prefix/conf # local configuration
$hadoop_prefix/current # symlink to the distribution directory
$hadoop_prefix/run/log # log directory
$hadoop_prefix/run/pid # pid directory
$hadoop_prefix/run/mapred # map-reduce server directory
$hadoop_prefix/run/dfs/{data,name} # dfs server directories
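If it helps, a layout like the one above could be created with a small script. This is a minimal sketch in Python under a few assumptions: `create_layout` is a hypothetical helper, the distribution directory is created empty here (in practice it would be the unpacked release), and the platform supports symbolic links.

```python
from pathlib import Path


def create_layout(prefix, dist_name="hadoop-0.4-dev"):
    """Create the directory layout listed above under `prefix`."""
    prefix = Path(prefix)
    subdirs = [
        dist_name,        # distribution directory (normally the unpacked release)
        "conf",           # local configuration
        "run/log",        # log directory
        "run/pid",        # pid directory
        "run/mapred",     # map-reduce server directory
        "run/dfs/data",   # dfs server directories
        "run/dfs/name",
    ]
    for sub in subdirs:
        (prefix / sub).mkdir(parents=True, exist_ok=True)

    # Point "current" at the distribution directory with a relative link.
    # An existing symlink is replaced; anything else at that path makes
    # symlink_to raise FileExistsError.
    current = prefix / "current"
    if current.is_symlink():
        current.unlink()
    current.symlink_to(dist_name, target_is_directory=True)
    return prefix
```

Calling it again with a different `dist_name` after an upgrade repoints `current` while leaving `conf` and `run` untouched.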
-- Owen