[
https://issues.apache.org/jira/browse/NUTCH-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12578705#action_12578705
]
Andrzej Bialecki commented on NUTCH-616:
-
I'm considering a different approach to this patch. There are already 2 Fetcher
implementations, and in the future we may want to go even more modular, so
patching this issue in every fetching tool doesn't seem appropriate. IMHO this
should be handled in the CrawlDb maintenance tools (i.e. CrawlDbReducer). Patch
is forthcoming.
Reset Fetch Retry counter when fetch is successful
--
Key: NUTCH-616
URL: https://issues.apache.org/jira/browse/NUTCH-616
Project: Nutch
Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Emmanuel Joke
Fix For: 1.0.0
Attachments: NUTCH-616.patch
We maintain a counter that tracks how many consecutive times a URL has been in
the Retry state after a problem fetching the page.
Here is a sample of the code:
    case ProtocolStatus.RETRY: // retry
      fit.datum.setRetriesSinceFetch(fit.datum.getRetriesSinceFetch()+1);
However, I noticed that we don't reset this counter to 0 after a successful
fetch.
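
To illustrate the expected behavior, here is a minimal, self-contained sketch. It does not use the real Nutch classes: RetryCounterSketch, Datum, Outcome and update() are hypothetical stand-ins, and only the getRetriesSinceFetch/setRetriesSinceFetch accessor names are taken from the sample above. It is not the attached patch, just one way the counter could behave: incremented on each retry and reset to 0 on success.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Nutch classes.
final class RetryCounterSketch {

    /** Simplified fetch outcomes (assumed names, loosely mirroring ProtocolStatus). */
    enum Outcome { SUCCESS, RETRY }

    /** Minimal stand-in for the per-URL record that carries the retry counter. */
    static final class Datum {
        private int retriesSinceFetch;

        int getRetriesSinceFetch() { return retriesSinceFetch; }
        void setRetriesSinceFetch(int retries) { this.retriesSinceFetch = retries; }
    }

    /** Updates the counter: increment on RETRY, reset to 0 on SUCCESS. */
    static void update(Datum datum, Outcome outcome) {
        switch (outcome) {
            case RETRY:
                datum.setRetriesSinceFetch(datum.getRetriesSinceFetch() + 1);
                break;
            case SUCCESS:
                // The step the issue reports as missing.
                datum.setRetriesSinceFetch(0);
                break;
        }
    }

    public static void main(String[] args) {
        Datum datum = new Datum();
        update(datum, Outcome.RETRY);
        update(datum, Outcome.RETRY);
        update(datum, Outcome.SUCCESS);
        System.out.println(datum.getRetriesSinceFetch());
    }
}
```

Without the SUCCESS branch, a URL that failed twice and then fetched correctly would keep a count of 2, so later unrelated failures would be added on top of the stale value.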