[
https://issues.apache.org/jira/browse/NUTCH-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17924533#comment-17924533
]
ASF GitHub Bot commented on NUTCH-3103:
---------------------------------------
martin-djukanovic opened a new pull request, #848:
URL: https://github.com/apache/nutch/pull/848
1) The loop in setHostSpecificIntervals is cleaned up, and a max interval
that is set to "default" in the config is now handled correctly.
2) The functions getMinInterval and getMaxInterval are renamed to
getCustomMinInterval and getCustomMaxInterval, respectively, and now return
null if no custom interval has been set for the given URL's hostname. When
either of them returns null, the corresponding default value is used to bound
the calculated interval.
3) The custom interval values in the config are now allowed to equal the
default values. For example, if the default min interval is 7200, then "0",
"default" and "7200" are all valid values for the custom min interval in the
config file, and all three have the same effect.
4) The config file template is updated to reflect these changes.
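
The behaviour described in points 1) to 3) can be sketched roughly as follows.
This is an illustration only, not the code from the pull request: the class
name, the helper names parseInterval, addLine and bound, the default values,
the hostname-based method signatures and the validity rule are all assumptions
made for the example. Only the names getCustomMinInterval and
getCustomMaxInterval, the "0"/"default" convention and the null fallback come
from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the actual AdaptiveFetchSchedule code. */
class HostIntervalSketch {
  // Hypothetical defaults in seconds; 7200 matches the example in point 3.
  static final float DEFAULT_MIN = 7200f;
  static final float DEFAULT_MAX = 31536000f;

  private final Map<String, Float> customMin = new HashMap<>();
  private final Map<String, Float> customMax = new HashMap<>();

  /** "0" and "default" both resolve to the given default value. */
  static float parseInterval(String token, float defaultValue) {
    if (token.equals("default") || token.equals("0")) {
      return defaultValue;
    }
    return Float.parseFloat(token);
  }

  /** Parses one "<hostname> <min> <max>" line; returns false if rejected. */
  boolean addLine(String line) {
    String[] parts = line.trim().split("\\s+");
    if (parts.length != 3) {
      return false;
    }
    float min;
    float max;
    try {
      min = parseInterval(parts[1], DEFAULT_MIN);
      max = parseInterval(parts[2], DEFAULT_MAX);
    } catch (NumberFormatException e) {
      return false;
    }
    // Assumed validity rule: custom values stay within the default bounds
    // (equality allowed, as in point 3) and min does not exceed max.
    if (min < DEFAULT_MIN || max > DEFAULT_MAX || min > max) {
      return false;
    }
    customMin.put(parts[0], min);
    customMax.put(parts[0], max);
    return true;
  }

  /** Returns the custom min interval for the host, or null if none is set. */
  Float getCustomMinInterval(String host) {
    return customMin.get(host);
  }

  /** Returns the custom max interval for the host, or null if none is set. */
  Float getCustomMaxInterval(String host) {
    return customMax.get(host);
  }

  /** Bounds a calculated interval, falling back to the defaults on null. */
  float bound(String host, float interval) {
    Float min = getCustomMinInterval(host);
    Float max = getCustomMaxInterval(host);
    float lo = (min != null) ? min : DEFAULT_MIN;
    float hi = (max != null) ? max : DEFAULT_MAX;
    return Math.max(lo, Math.min(hi, interval));
  }

  public static void main(String[] args) {
    HostIntervalSketch sketch = new HostIntervalSketch();
    // The line from the reported error message, with "0" as max interval.
    System.out.println(sketch.addLine("www.example.org 1296000 0"));
    System.out.println(sketch.bound("www.example.org", 60f));
    System.out.println(sketch.bound("unlisted.example.org", 60f));
  }
}
```

Under these assumptions, the line "www.example.org 1296000 0" from the
reported error would be accepted, with the max interval resolving to the
default, and a host without an entry would simply be bounded by the defaults.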
> Improper fetch interval given as example
> ----------------------------------------
>
> Key: NUTCH-3103
> URL: https://issues.apache.org/jira/browse/NUTCH-3103
> Project: Nutch
> Issue Type: Bug
> Components: documentation, fetcher
> Affects Versions: 1.20
> Reporter: Isabelle Giguere
> Assignee: Markus Jelsma
> Priority: Trivial
> Fix For: 1.21
>
>
> This error is logged:
> org.apache.nutch.crawl.AdaptiveFetchSchedule - Improper fetch intervals given
> on line 13 in the config. file: www.example.org 1296000 0
> See conf/adaptive-host-specific-intervals.txt
> Note that 'default' as max interval also produces an error.
> Trivial fix: remove the example.
> But the question remains: are '0' and 'default' accepted for max interval?