[ 
https://issues.apache.org/jira/browse/NUTCH-1503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13529497#comment-13529497
 ] 

Sebastian Nagel commented on NUTCH-1503:
----------------------------------------

Hi Lewis,
both time limit properties are necessary:
* fetcher.timelimit.mins for the user to configure the limit (max. duration in 
minutes)
* fetcher.timelimit (internal use only) to set the time the fetcher has to 
finish (system time in milliseconds, same time for all distributed jobs)
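
To illustrate how the two properties relate, here is a minimal sketch, not the actual Nutch code. It assumes a plain Map as a stand-in for the Hadoop Configuration, and the class and method names (TimeLimitSketch, resolveDeadline, timeLimitReached) are hypothetical. The relative limit is converted to an absolute deadline once at job setup, so that every task compares against the same value:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for Hadoop's Configuration.
final class TimeLimitSketch {

    /** Job setup: turn the user's relative limit (minutes) into an absolute deadline (ms). */
    static void resolveDeadline(Map<String, String> conf, long nowMillis) {
        long mins = Long.parseLong(conf.getOrDefault("fetcher.timelimit.mins", "-1"));
        if (mins != -1) {
            long deadline = nowMillis + mins * 60L * 1000L;
            // Internal property: written by the job, not by the user.
            conf.put("fetcher.timelimit", Long.toString(deadline));
        }
    }

    /** Task side: all tasks read the same absolute deadline; -1 means "no limit". */
    static boolean timeLimitReached(Map<String, String> conf, long nowMillis) {
        long deadline = Long.parseLong(conf.getOrDefault("fetcher.timelimit", "-1"));
        return deadline != -1 && nowMillis >= deadline;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("fetcher.timelimit.mins", "180");
        resolveDeadline(conf, System.currentTimeMillis());
        System.out.println("deadline (ms) = " + conf.get("fetcher.timelimit"));
        System.out.println("reached now?  = " + timeLimitReached(conf, System.currentTimeMillis()));
    }
}
```

If each task computed "now + minutes" on its own, tasks starting late would get a later deadline, which is why the absolute value has to be computed once and shared through the job configuration.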

Regarding fetcher.threads.per.host.by.ip: maybe we should not add properties 
that are already deprecated and will be removed later anyway (cf. NUTCH-1409).
+1 for adding fetcher.queue.use.host.settings to nutch-default.xml
Btw., your efforts to clean up properties reminded me that some time ago I 
promised on 
[user@nutch|http://lucene.472066.n3.nabble.com/Javadoc-incorrect-or-missing-code-in-1-5-1-Generator-td3997883.html]
 to prepare a list of all Nutch properties, flagging whether each one is 
"defined" and documented in nutch-default.xml: [it's in the wiki 
now|http://wiki.apache.org/nutch/NutchPropertiesCompleteList].

                
> Configuration properties not in sync between FetcherReducer and 
> nutch-default.xml
> ---------------------------------------------------------------------------------
>
>                 Key: NUTCH-1503
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1503
>             Project: Nutch
>          Issue Type: Bug
>          Components: fetcher
>    Affects Versions: 2.1
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: 2.2
>
>         Attachments: NUTCH-1503.patch
>
>
> FetcherReducer.java
> Bug: Following properties appear in FetcherReducer but not in 
> nutch-default.xml
> {code}
> useHostSettings = conf.getBoolean("fetcher.queue.use.host.settings", false);  // line 290
> this.timelimit = conf.getLong("fetcher.timelimit", -1);  // line 300
> this.byIP = conf.getBoolean("fetcher.threads.per.host.by.ip", true);  // line 450
> timelimit = context.getConfiguration().getLong("fetcher.timelimit", -1);  // line 698
> {code}
> Therefore they cannot be used properly in code execution and must be updated, 
> removed and/or added to nutch-default.xml.
> Patch coming up just now.
