On Jun 20, 2006, at 9:29 AM, Paul Sutter wrote:
Speaking of configuration, is there any clear definition for the purpose of
mapred-default.xml? My understanding is that it's an alternate, misnamed,
site-local configuration, but we're not sure what to do with it.
Right now, we make all of our changes to hadoop-site.xml, then copy that
file to mapred-default.xml because we've heard that sometimes that file
gets checked instead of hadoop-site.xml.
Any help appreciated.
My general approach is that only things that the user/application
should never change go in hadoop-site.xml. Largely, this is limited to the
namenode/jobtracker addresses, ports, and directories. Everything else
goes into mapred-default.xml. This includes things like:
dfs.block.size
io.sort.factor
io.sort.mb
etc.
This split works because of the load order of the config files:
hadoop-default.xml, mapred-default.xml, job.xml, hadoop-site.xml.
Files loaded later override files loaded earlier, so job.xml will override
the default files, but NOT hadoop-site.xml. I think that mapred-default
would be better named site-default or something.
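To make the override behaviour concrete, here is a rough sketch in Python (not Hadoop's actual code) that merges per-file settings in the load order given above, with later files winning. The resolve() helper and all property values are made up for illustration; only the file names and the ordering come from this message.

```python
# Load order as described above; files later in the list win.
LOAD_ORDER = [
    "hadoop-default.xml",
    "mapred-default.xml",
    "job.xml",
    "hadoop-site.xml",
]


def resolve(files):
    """Merge {file name: {property: value}} in load order (hypothetical helper)."""
    merged = {}
    for name in LOAD_ORDER:
        # dict.update overwrites keys set by earlier files.
        merged.update(files.get(name, {}))
    return merged


# Illustrative values only; these are not real Hadoop defaults.
files = {
    "hadoop-default.xml": {"io.sort.mb": "A", "fs.default.name": "built-in"},
    "mapred-default.xml": {"io.sort.mb": "B"},
    "job.xml": {"io.sort.mb": "C", "fs.default.name": "from-job"},
    "hadoop-site.xml": {"fs.default.name": "from-site"},
}

effective = resolve(files)
print(effective["io.sort.mb"])       # job.xml overrides both default files
print(effective["fs.default.name"])  # hadoop-site.xml overrides job.xml
```

In this sketch the job gets to change io.sort.mb because it is set only in the default files, while its attempt to change fs.default.name is ignored because hadoop-site.xml is loaded last.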
-- Owen