[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml
[ https://issues.apache.org/jira/browse/NUTCH-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12560073#action_12560073 ]

Andrzej Bialecki commented on NUTCH-186:
----------------------------------------

Not applicable after the code has been moved to Hadoop.

mapred-default.xml is over ridden by nutch-site.xml
---------------------------------------------------

Key: NUTCH-186
URL: https://issues.apache.org/jira/browse/NUTCH-186
Project: Nutch
Issue Type: Bug
Affects Versions: 0.8
Environment: All
Reporter: Gal Nitzan
Assignee: Andrzej Bialecki
Priority: Minor
Fix For: 0.8
Attachments: myBeautifulPatch.patch, myBeautifulPatch.patch

If mapred.map.tasks and mapred.reduce.tasks are defined in nutch-site.xml and also in mapred-default.xml, the definitions from nutch-site.xml are the ones that take effect. So if a user mistakenly copies those entries into nutch-site.xml from nutch-default.xml, she will not understand what happens.

I would like to propose removing these settings completely from nutch-default.xml and putting them only in mapred-default.xml, where they belong.

I will be happy to supply a patch for that if the proposal is accepted.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml
[ http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12364010 ]

Gal Nitzan commented on NUTCH-186:
----------------------------------

After reading the code, I think I have figured it out... :)

The issue with mapred-default.xml is totally misleading. Actually, the mapred.map.tasks and mapred.reduce.tasks properties do not have any effect when placed in mapred-default.xml (unless JobConf needs them, which I didn't check), because this file is loaded only when JobConf is constructed. But the tasktracker looks for these properties in nutch-site.xml and not in mapred-default.xml. If these properties do not exist in nutch-site.xml with the correct values for your system, the values will be picked up from nutch-default.xml.

Further, I am not sure that nutch-site.xml overriding everything is the correct behavior. Most users know that nutch-site.xml overrides nutch-default.xml, but I think we should leave them the option to override nutch-site.xml. It would be a good start toward breaking the configuration into parts (ndfs and mapred are going to be separated from nutch)...

Gal
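The visibility problem described in the comment above can be illustrated with a small sketch. This is not the real NutchConf or JobConf code: the class names `ConfVisibility`, `BaseConf` and `JobLikeConf`, and the property values, are hypothetical stand-ins. It only assumes what the comment states, namely that mapred-default.xml is registered when the JobConf-like object is constructed, so a plain configuration object never consults it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConfVisibility {
    // Hypothetical file contents, keyed by resource name (values are made up).
    // nutch-site.xml is deliberately left empty.
    static final Map<String, Map<String, String>> FILES = new HashMap<>();
    static {
        FILES.put("nutch-default.xml", Map.of("mapred.map.tasks", "2"));
        FILES.put("mapred-default.xml", Map.of("mapred.map.tasks", "20"));
    }

    /** Stand-in for a plain configuration: knows only the two nutch files. */
    static class BaseConf {
        protected final List<String> resources =
            new ArrayList<>(List.of("nutch-default.xml", "nutch-site.xml"));

        /** Walk the resources in order; the last definition found wins. */
        String get(String key) {
            String value = null;
            for (String resource : resources) {
                String candidate = FILES.getOrDefault(resource, Map.of()).get(key);
                if (candidate != null) {
                    value = candidate;
                }
            }
            return value;
        }
    }

    /** Stand-in for JobConf: registers mapred-default.xml on construction. */
    static class JobLikeConf extends BaseConf {
        JobLikeConf() {
            // Inserted second to last, i.e. just before nutch-site.xml.
            resources.add(resources.size() - 1, "mapred-default.xml");
        }
    }

    public static void main(String[] args) {
        // The plain conf never sees mapred-default.xml and falls back
        // to the value from nutch-default.xml.
        System.out.println("plain conf:    " + new BaseConf().get("mapred.map.tasks"));
        // The job-like conf does see mapred-default.xml.
        System.out.println("job-like conf: " + new JobLikeConf().get("mapred.map.tasks"));
    }
}
```

In this toy model, code that reads the property through the plain object gets the nutch-default.xml value, no matter what mapred-default.xml says.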
[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml
[ http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12363890 ]

Andrzej Bialecki commented on NUTCH-186:
----------------------------------------

I agree. A patch would be welcome. I wonder whether it's a good idea to follow the pattern of nutch-default/nutch-site and use a pair of mapred-default/mapred-site.xml ... It would be more understandable for users.
[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml
[ http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12363903 ]

Gal Nitzan commented on NUTCH-186:
----------------------------------

OK, JobConf extends NutchConf, and in its constructor JobConf adds the mapred-default.xml resource. The call to add a resource in NutchConf actually inserts any resource file before nutch-site.xml, so there is no way to override it. Look at the code at the bottom.

The only thing required is to change line 85 in NutchConf to be:

    resourceNames.add(name); // add resource name

instead of

    resourceNames.add(resourceNames.size()-1, name); // add second to last

and add one more line to the JobConf constructor:

    addConfResource("mapred-site.xml");

This way nutch-site.xml overrides nutch-default.xml, but other added resources can override nutch-site.xml, which in my opinion is reasonable.

If acceptable, I will create the patch.

- current code in NutchConf.java -

    public synchronized void addConfResource(File file) {
      addConfResourceInternal(file);
    }

    private synchronized void addConfResourceInternal(Object name) {
      resourceNames.add(resourceNames.size()-1, name); // add second to last
      properties = null; // trigger reload
    }
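To make the effect of the one-line change concrete, here is a minimal, self-contained sketch of the two insertion strategies. It is a toy model, not the real NutchConf: the class `LayeredConf`, its methods, and the property values are all hypothetical, and it assumes only that resources are consulted in list order with later entries overriding earlier ones.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of an ordered resource list; not the real NutchConf. */
class LayeredConf {
    // Resources in load order; a later resource overrides an earlier one.
    private final List<String> resourceNames = new ArrayList<>();
    // Hypothetical stand-in for the XML files: resource name -> properties.
    private final Map<String, Map<String, String>> files = new HashMap<>();

    LayeredConf() {
        resourceNames.add("nutch-default.xml");
        resourceNames.add("nutch-site.xml");
    }

    /** Pretend that the given resource file defines key=value. */
    void define(String resource, String key, String value) {
        files.computeIfAbsent(resource, r -> new HashMap<>()).put(key, value);
    }

    /** Behavior quoted in the comment: insert just before the last entry. */
    void addSecondToLast(String name) {
        resourceNames.add(resourceNames.size() - 1, name);
    }

    /** Proposed behavior: append, so the new resource is consulted last. */
    void addLast(String name) {
        resourceNames.add(name);
    }

    /** Walk the resources in order; the last definition found wins. */
    String get(String key) {
        String value = null;
        for (String name : resourceNames) {
            Map<String, String> props = files.get(name);
            if (props != null && props.containsKey(key)) {
                value = props.get(key);
            }
        }
        return value;
    }

    List<String> order() {
        return new ArrayList<>(resourceNames);
    }

    /** Build a conf with made-up values for the two competing files. */
    static LayeredConf sample() {
        LayeredConf conf = new LayeredConf();
        conf.define("nutch-site.xml", "mapred.map.tasks", "2");
        conf.define("mapred-site.xml", "mapred.map.tasks", "10");
        return conf;
    }

    public static void main(String[] args) {
        // Current behavior: mapred-site.xml lands before nutch-site.xml,
        // so the nutch-site.xml value ("2") still wins.
        LayeredConf current = sample();
        current.addSecondToLast("mapred-site.xml");
        System.out.println(current.order() + " -> " + current.get("mapred.map.tasks"));

        // Proposed behavior: mapred-site.xml is appended and its value ("10") wins.
        LayeredConf proposed = sample();
        proposed.addLast("mapred-site.xml");
        System.out.println(proposed.order() + " -> " + proposed.get("mapred.map.tasks"));
    }
}
```

With the second-to-last insertion, nothing added later can ever take precedence over nutch-site.xml. Appending instead lets a resource such as the proposed mapred-site.xml override it.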