[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml

2008-01-17 Thread Andrzej Bialecki (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560073#action_12560073
 ] 

Andrzej Bialecki  commented on NUTCH-186:
-

Not applicable after the code has been moved to Hadoop.

 mapred-default.xml is over ridden by nutch-site.xml
 ---

 Key: NUTCH-186
 URL: https://issues.apache.org/jira/browse/NUTCH-186
 Project: Nutch
  Issue Type: Bug
Affects Versions: 0.8
 Environment: All
Reporter: Gal Nitzan
Assignee: Andrzej Bialecki 
Priority: Minor
 Fix For: 0.8

 Attachments: myBeautifulPatch.patch, myBeautifulPatch.patch


 If mapred.map.tasks and mapred.reduce.tasks are defined in nutch-site.xml and 
 also in mapred-default.xml, the definitions from nutch-site.xml are the ones 
 that take effect.
 So if a user mistakenly copies those entries into nutch-site.xml from 
 nutch-default.xml, she will not understand what happens.
 I would like to propose removing these settings completely from 
 nutch-default.xml and putting them only in mapred-default.xml, where they 
 belong.
 I will be happy to supply a patch for that if the proposal is accepted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml

2006-01-25 Thread Gal Nitzan (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12364010 ] 

Gal Nitzan commented on NUTCH-186:
--

After reading the code, I think I have figured it out... :)

The issue with mapred-default.xml is totally misleading.

Actually, the mapred.map.tasks and mapred.reduce.tasks properties do not have 
any effect when placed in mapred-default.xml (unless JobConf needs them, which 
I didn't check) because this file is loaded only when a JobConf is constructed.
But the tasktracker looks for these properties in nutch-site and not in 
mapred-default.

If these properties do not exist in nutch-site.xml with the correct values 
for your system, the values will be picked up from nutch-default.xml.

Further, I am not sure that nutch-site.xml overriding everything is the 
correct behavior. Most users know that nutch-site.xml overrides nutch-default, 
but I think we should leave them the option to override nutch-site, and it 
will be a good start toward breaking the configuration into parts (ndfs and 
mapred are going to be separated from nutch)...

Gal
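
The lookup behavior described in this comment can be sketched in plain Java. 
This is only an illustration, not Nutch code: the class ConfLookupSketch, its 
helpers resource() and resolve(), and the numeric values are hypothetical; only 
the file names and the mapred.map.tasks property come from this thread. It 
assumes that later-loaded resources override earlier ones and that 
mapred-default.xml is added only when a JobConf is constructed.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; these names are NOT part of Nutch.
class ConfLookupSketch {

    // Build one in-memory "resource file" from key/value pairs.
    public static Map<String, String> resource(String... keyValues) {
        Map<String, String> props = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            props.put(keyValues[i], keyValues[i + 1]);
        }
        return props;
    }

    // Apply resources in load order; a later resource overrides an earlier one.
    public static String resolve(List<Map<String, String>> loadOrder, String key) {
        String value = null;
        for (Map<String, String> res : loadOrder) {
            if (res.containsKey(key)) {
                value = res.get(key);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        String key = "mapred.map.tasks";
        // Values below are made up for illustration.
        Map<String, String> nutchDefault = resource(key, "2");
        Map<String, String> mapredDefault = resource(key, "20");
        Map<String, String> emptySite = resource();
        Map<String, String> tunedSite = resource(key, "4");

        // Plain configuration (assumed tasktracker view): no mapred-default.xml.
        System.out.println(resolve(Arrays.asList(nutchDefault, emptySite), key));
        // JobConf view: mapred-default.xml is inserted before nutch-site.xml.
        System.out.println(resolve(Arrays.asList(nutchDefault, mapredDefault, emptySite), key));
        // Once nutch-site.xml defines the key, it wins in both views.
        System.out.println(resolve(Arrays.asList(nutchDefault, mapredDefault, tunedSite), key));
    }
}
```

With nutch-site.xml left empty, the plain view falls back to the nutch-default 
value while the JobConf view sees the mapred-default value, which is the 
mismatch the comment describes.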

 mapred-default.xml is over ridden by nutch-site.xml
 ---

  Key: NUTCH-186
  URL: http://issues.apache.org/jira/browse/NUTCH-186
  Project: Nutch
 Type: Bug
 Versions: 0.8-dev
  Environment: All
 Reporter: Gal Nitzan
 Priority: Minor
  Attachments: myBeautifulPatch.patch, myBeautifulPatch.patch


-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira



[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml

2006-01-24 Thread Andrzej Bialecki (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12363890 ] 

Andrzej Bialecki  commented on NUTCH-186:
-

I agree. A patch would be welcome.

I wonder whether it's a good idea to follow the pattern of 
nutch-default/nutch-site and use a pair of mapred-default/mapred-site.xml ... 
It would be more understandable for users.




[jira] Commented: (NUTCH-186) mapred-default.xml is over ridden by nutch-site.xml

2006-01-24 Thread Gal Nitzan (JIRA)
[ 
http://issues.apache.org/jira/browse/NUTCH-186?page=comments#action_12363903 ] 

Gal Nitzan commented on NUTCH-186:
--

OK, JobConf extends NutchConf, and in its constructor it adds the 
mapred-default.xml resource.

The call to add a resource in NutchConf actually inserts any resource file 
before nutch-site.xml, so there is no way to override nutch-site.xml. Look at 
the code at the bottom.

The only thing required is to change line 85 in NutchConf to be:

resourceNames.add(name); // add resource name

instead of

resourceNames.add(resourceNames.size()-1, name); // add second to last

and add one more line to the JobConf constructor:

addConfResource("mapred-site.xml");


This way nutch-site.xml overrides nutch-default.xml, but other added resources 
can override nutch-site.xml, which in my opinion is reasonable.

If acceptable, I will create the patch.


- current code in NutchConf.java 
-
  public synchronized void addConfResource(File file) {
    addConfResourceInternal(file);
  }
  private synchronized void addConfResourceInternal(Object name) {
    resourceNames.add(resourceNames.size()-1, name); // add second to last
    properties = null;                               // trigger reload
  }
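
The effect of the two add() variants on load order can be checked in isolation. 
The following is a minimal sketch, not the NutchConf implementation: the class 
ResourceOrderSketch and its two helper methods are hypothetical, and it assumes 
the resource list initially holds nutch-default.xml followed by nutch-site.xml.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; only the two add() calls mirror the code quoted above.
class ResourceOrderSketch {

    // Current behavior: insert the new resource before the last entry.
    public static List<String> addSecondToLast(List<String> names, String name) {
        List<String> copy = new ArrayList<>(names);
        copy.add(copy.size() - 1, name);
        return copy;
    }

    // Proposed behavior: append the new resource after all existing entries.
    public static List<String> addLast(List<String> names, String name) {
        List<String> copy = new ArrayList<>(names);
        copy.add(name);
        return copy;
    }

    public static void main(String[] args) {
        // Assumed initial order of the resource list.
        List<String> base = Arrays.asList("nutch-default.xml", "nutch-site.xml");

        // prints [nutch-default.xml, mapred-default.xml, nutch-site.xml]
        System.out.println(addSecondToLast(base, "mapred-default.xml"));
        // prints [nutch-default.xml, nutch-site.xml, mapred-default.xml]
        System.out.println(addLast(base, "mapred-default.xml"));
    }
}
```

If later entries override earlier ones, the first ordering lets nutch-site.xml 
win over every added resource, while the second lets an added resource win over 
nutch-site.xml.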

