On Jun 20, 2006, at 5:10 PM, Paul Sutter wrote:

Thanks very much for the explanation, and to confirm I will repeat it:

The first occurrence of a parameter is used, and the search order is:

hadoop-site.xml, then
job.xml, then
mapred-default.xml, then
hadoop-default.xml

That's great, and it explains behavior that had been confusing before.

Exactly correct. One other piece that can cause confusion is that all of the files are found via the Java class path, and they are present both in the conf directory in the distribution AND in the hadoop-*.jar file.

Because of this, I recommend never keeping a copy of hadoop-default.xml in your config directory. That is the one configuration file that you always want updated automatically when you update your distribution.
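
To make the "first occurrence wins" rule concrete, here is a minimal sketch in Python. It is not Hadoop's own code: it only assumes the usual <configuration>/<property>/<name>/<value> file layout, takes the file contents as strings rather than finding them on the class path, and the names lookup, parse_properties, and example.param are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Search order as confirmed in this thread: the first file that
# defines a parameter supplies its value.
SEARCH_ORDER = [
    "hadoop-site.xml",
    "job.xml",
    "mapred-default.xml",
    "hadoop-default.xml",
]


def parse_properties(xml_text):
    """Return a dict of name -> value from one configuration file's text."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


def lookup(name, files):
    """Hypothetical helper: files maps a file name to its XML text.

    Returns (value, source_file) for the first file in SEARCH_ORDER
    that defines the parameter, or (None, None) if none does.
    """
    for file_name in SEARCH_ORDER:
        if file_name not in files:
            continue  # a missing file is simply skipped in this sketch
        props = parse_properties(files[file_name])
        if name in props:
            return props[name], file_name
    return None, None
```

With this sketch, a value set in job.xml shadows the same parameter in hadoop-default.xml, and a value in hadoop-site.xml shadows both.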

For the record, I like setting up my hadoop directories like:

$hadoop_prefix/hadoop-0.4-dev       # distribution directory
$hadoop_prefix/conf                 # local configuration
$hadoop_prefix/current              # symlink to the distribution directory
$hadoop_prefix/run/log              # log directory
$hadoop_prefix/run/pid              # pid directory
$hadoop_prefix/run/mapred           # map-reduce server directory
$hadoop_prefix/run/dfs/{data,name}  # dfs server directories
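
The layout above could be created with something like the following Python sketch. The function name make_layout is hypothetical, and the sketch creates an empty distribution directory as a stand-in; in practice you would unpack the release there instead.

```python
from pathlib import Path


def make_layout(prefix, dist_name="hadoop-0.4-dev"):
    """Create the directory layout described above under prefix."""
    prefix = Path(prefix)
    subdirs = [
        dist_name,       # distribution directory (placeholder; normally unpacked)
        "conf",          # local configuration
        "run/log",       # log directory
        "run/pid",       # pid directory
        "run/mapred",    # map-reduce server directory
        "run/dfs/data",  # dfs server directories
        "run/dfs/name",
    ]
    for sub in subdirs:
        (prefix / sub).mkdir(parents=True, exist_ok=True)

    # "current" is a relative symlink, so repointing it to a newer
    # distribution directory is the only change needed on upgrade.
    link = prefix / "current"
    if link.is_symlink():
        link.unlink()
    link.symlink_to(dist_name)
    return prefix
```

Keeping conf and run outside the distribution directory means an upgrade only adds a new distribution directory and repoints the current link.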

-- Owen
