[ 
https://issues.apache.org/jira/browse/NUTCH-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555927#comment-13555927
 ] 

lufeng commented on NUTCH-1519:
-------------------------------

Hi Lewis, do you mean that these properties should be defined in nutch-default.xml 
rather than loaded dynamically from command-line input?
                
> Configuration Overrides not in sync between WebTableReader and 
> nutch-default.xml 
> ---------------------------------------------------------------------------------
>
>                 Key: NUTCH-1519
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1519
>             Project: Nutch
>          Issue Type: Bug
>          Components: crawldb, storage
>    Affects Versions: 2.1
>            Reporter: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 2.2
>
>
> In 2.x HEAD the WebTableReader class [0] provides Overrides for properties 
> such as 
> {code}
> currentJob.getConfiguration().setBoolean("mapreduce.fileoutputcommitter.marksuccessfuljobs",
>  false);
> currentJob.getConfiguration().setBoolean("db.reader.stats.sort", sort);
> {code}
> as well as
> {code}
> Configuration cfg = job.getConfiguration();
> cfg.set(WebTableRegexMapper.regexParamName, regex);
> cfg.setBoolean(WebTableRegexMapper.contentParamName, content);
> cfg.setBoolean(WebTableRegexMapper.headersParamName, headers);
> cfg.setBoolean(WebTableRegexMapper.linksParamName, links);
> cfg.setBoolean(WebTableRegexMapper.textParamName, text);
> {code}
> None of these are actually present in nutch-default.xml and are therefore not 
> configurable or able to be overridden.
> This should be sorted out.
> [0] 
> http://svn.apache.org/repos/asf/nutch/branches/2.x/src/java/org/apache/nutch/crawl/WebTableReader.java
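
The two approaches raised in this thread (declaring a default up front versus setting a value only from command input) can be combined. The following is a minimal sketch of that idea, not the Nutch implementation: it uses java.util.Properties as a stand-in for Hadoop's Configuration so that it runs without extra dependencies. The class name ConfigOverrideSketch, the key=value argument format, and the default value "false" for db.reader.stats.sort are assumptions made for illustration only.

```java
import java.util.Properties;

public class ConfigOverrideSketch {

    // Hypothetical stand-in for defaults that would be declared in nutch-default.xml.
    // The value "false" is an assumption, not taken from the Nutch source.
    static Properties declaredDefaults() {
        Properties defaults = new Properties();
        defaults.setProperty("db.reader.stats.sort", "false");
        return defaults;
    }

    // Layers command-line overrides (assumed format: key=value) over the defaults.
    // Keys not overridden fall back to the declared default.
    static Properties resolve(Properties defaults, String[] args) {
        Properties effective = new Properties(defaults);
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq > 0) {
                effective.setProperty(arg.substring(0, eq), arg.substring(eq + 1));
            }
        }
        return effective;
    }

    // Reads a boolean property, using the fallback only if the key is declared nowhere.
    static boolean getBoolean(Properties props, String key, boolean fallback) {
        String value = props.getProperty(key);
        return value == null ? fallback : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Properties effective = resolve(declaredDefaults(), args);
        System.out.println("db.reader.stats.sort = "
                + getBoolean(effective, "db.reader.stats.sort", false));
    }
}
```

With this layering, a property stays visible and documented in the defaults while a command-line value still takes precedence when one is supplied.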

