[ https://issues.apache.org/jira/browse/NUTCH-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186212#comment-13186212 ]
Andrzej Bialecki commented on NUTCH-1247:
------------------------------------------
Indeed, line 264 increases the retry counter, but once it reaches retryMax
the page status is set to DB_GONE, so the page won't be generated again until
it expires, and its retry counter won't increase. Once it expires, Generator
should invoke FetchSchedule.forceRefetch on this page, and the default
implementation resets the retry counter. So either there's a bug somewhere in
this cycle, or your retryMax is greater than 127.
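
To illustrate why 127 is the threshold: a Java byte is signed and holds
-128..127, so incrementing a counter stored in a byte past 127 wraps to negative
values like those in the CrawlDbReader output quoted below. A minimal sketch
follows; the class and method names are hypothetical and this is not the actual
CrawlDatum code, only a counter that is cast back to byte on each increment.

```java
public class RetryCounterSketch {

    // Hypothetical stand-in for a retry counter kept in one signed byte.
    // The narrowing cast discards the high bits, so 127 + 1 becomes -128.
    static byte incrementAsByte(byte retries) {
        return (byte) (retries + 1);
    }

    // Starts at zero and applies the byte increment `times` times.
    static byte countAsByte(int times) {
        byte retries = 0;
        for (int i = 0; i < times; i++) {
            retries = incrementAsByte(retries);
        }
        return retries;
    }

    public static void main(String[] args) {
        System.out.println(countAsByte(127)); // 127, still in range
        System.out.println(countAsByte(128)); // -128, wrapped around
        System.out.println(countAsByte(129)); // -127
    }
}
```

If the retry/DB_GONE/forceRefetch cycle works as described and retryMax stays
at or below 127, the counter should never reach the wrapping point.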
> CrawlDatum.retries should be int
> --------------------------------
>
> Key: NUTCH-1247
> URL: https://issues.apache.org/jira/browse/NUTCH-1247
> Project: Nutch
> Issue Type: Bug
> Affects Versions: 1.4
> Reporter: Markus Jelsma
> Fix For: 1.5
>
>
> CrawlDatum.retries is a byte and goes bad with larger values.
> 12/01/12 18:35:22 INFO crawl.CrawlDbReader: retry -127: 1
> 12/01/12 18:35:22 INFO crawl.CrawlDbReader: retry -128: 1