[ 
https://issues.apache.org/jira/browse/NUTCH-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13186212#comment-13186212
 ] 

Andrzej Bialecki  commented on NUTCH-1247:
------------------------------------------

Indeed, line 264 increments the retry counter, but once the counter reaches 
retryMax the page status is set to DB_GONE, so the page won't be generated 
again until it expires, and its retry counter won't increase any further. Once 
it expires, Generator should invoke FetchSchedule.forceRefetch on this page, 
and the default implementation resets the retry counter. So either there is a 
bug somewhere in this cycle, or your retryMax is greater than 127 (the largest 
value a signed byte can hold).
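
To illustrate the second possibility, here is a minimal, self-contained sketch 
of a byte-sized retry counter. It is not the Nutch code: the class 
RetryCounterDemo and the methods incrementAsByte and retriesAfter are 
hypothetical names made up for this example, and the "stop at retryMax" check 
is only a simplified stand-in for the DB_GONE transition described above.

```java
public class RetryCounterDemo {

    // Hypothetical stand-in for a retry counter stored in a single byte.
    static byte incrementAsByte(byte retries) {
        // byte + int is promoted to int; narrowing back to byte
        // silently wraps from 127 to -128.
        return (byte) (retries + 1);
    }

    // Simulates `failures` consecutive failed fetches. The counter stops
    // growing once it equals retryMax (simplified stand-in for DB_GONE).
    static byte retriesAfter(int failures, int retryMax) {
        byte retries = 0;
        for (int i = 0; i < failures; i++) {
            if (retries == retryMax) {
                break; // page would no longer be generated
            }
            retries = incrementAsByte(retries);
        }
        return retries;
    }

    public static void main(String[] args) {
        // retryMax fits in a byte: the counter stops at the limit.
        System.out.println(retriesAfter(10, 3));    // prints 3
        // retryMax above 127: the byte can never equal it, so it wraps.
        System.out.println(retriesAfter(128, 200)); // prints -128
        System.out.println(retriesAfter(129, 200)); // prints -127
    }
}
```

With a limit above 127 the comparison never succeeds, so the counter keeps 
cycling through negative values, which would be consistent with the 
"retry -127" and "retry -128" entries in the reported CrawlDbReader output.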
                
> CrawlDatum.retries should be int
> --------------------------------
>
>                 Key: NUTCH-1247
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1247
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.4
>            Reporter: Markus Jelsma
>             Fix For: 1.5
>
>
> CrawlDatum.retries is a byte and goes bad with larger values.
> 12/01/12 18:35:22 INFO crawl.CrawlDbReader: retry -127: 1
> 12/01/12 18:35:22 INFO crawl.CrawlDbReader: retry -128: 1

