[ http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12363998 ]
Doug Cutting commented on NUTCH-186:
------------------------------------

The config rules at present are:

1. All user-settable values should be in nutch-default.xml, as documentation that they exist. Any other config will override this. This file should not be altered by users.

2. nutch-site.xml is always loaded last, overriding all other options. This is empty by default.

mapred-default.xml was added specifically to permit the specification of things that a job can override.

I think the fix that's needed here is documentation. The documentation for these parameters should perhaps caution against putting them in nutch-site.xml, and point folks towards mapred-default.xml.

We might eventually move to a more complex configuration, where we break things into modules, each with three parts: base, default, final. So there could be a mapred-base.xml that listed all of the settable mapred parameters. Then the overridable default value could be set in mapred-default.xml. And non-overridable values (e.g., the jobtracker host) could be specified in mapred-final.

> mapred-default.xml is over ridden by nutch-site.xml
> ---------------------------------------------------
>
>          Key: NUTCH-186
>          URL: http://issues.apache.org/jira/browse/NUTCH-186
>      Project: Nutch
>         Type: Bug
>     Versions: 0.8-dev
>  Environment: All
>     Reporter: Gal Nitzan
>     Priority: Minor
>  Attachments: myBeautifulPatch.patch
>
> If mapred.map.tasks and mapred.reduce.tasks are defined in nutch-site.xml and
> also in mapred-default.xml the definitions from nutch-site.xml are those that
> will take effect.
> So if a user mistakenly copies those entries into nutch-site.xml from the
> nutch-default.xml she will not understand what happens.
> I would like to propose removing these setting completely from the
> nutch-default.xml and put it only in mapred-default.xml where it belongs.
> I will be happy to supply a patch for that if the proposition accepted.

--
This message is automatically generated by JIRA.
- If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
- For more information on JIRA, see:
   http://www.atlassian.com/software/jira

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
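
For readers following the thread, the load order described in the comment above (nutch-default.xml first, nutch-site.xml always last, mapred-default.xml somewhere in between) can be sketched roughly as below. This is only an illustrative model, not Nutch's actual configuration code: the `resolve` helper and the numeric values are hypothetical, and the file contents are invented to reproduce the situation in the report.

```python
# Illustrative model only: this is NOT Nutch's configuration loader.
# The resolve() helper and all values below are hypothetical.

def resolve(layers):
    """Merge (file_name, settings) pairs in load order; later files win."""
    merged = {}   # final value for each property
    origin = {}   # which file supplied the final value
    for file_name, settings in layers:
        for key, value in settings.items():
            merged[key] = value
            origin[key] = file_name
    return merged, origin


# Load order as described in the comment: nutch-default.xml is overridden
# by any other config, and nutch-site.xml is always loaded last.
layers = [
    ("nutch-default.xml", {"mapred.map.tasks": "2", "mapred.reduce.tasks": "1"}),
    ("mapred-default.xml", {"mapred.map.tasks": "20", "mapred.reduce.tasks": "4"}),
    # Entry mistakenly copied from nutch-default.xml, as in the report:
    ("nutch-site.xml", {"mapred.map.tasks": "2"}),
]

merged, origin = resolve(layers)

for key in ("mapred.map.tasks", "mapred.reduce.tasks"):
    print(f"{key} = {merged[key]} (from {origin[key]})")
```

In this model the copied `mapred.map.tasks` entry in nutch-site.xml silently wins over mapred-default.xml, while `mapred.reduce.tasks` still comes from mapred-default.xml, which is the behaviour the issue describes.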
