Generator should not generate filter and not found and denied and gone and
permanently moved pages
--------------------------------------------------------------------------------------------------
Key: NUTCH-1288
URL: https://issues.apache.org/jira/browse/NUTCH-1288
Project: Nutch
Issue Type: Bug
Components: fetcher, generator
Affects Versions: 1.4
Reporter: behnam nikbakht
The Generator should not generate filtered, not found, denied, gone, or
permanently moved pages.
In the shouldFetch method of AbstractFetchSchedule, the CrawlDatum must be
checked against special fetch states such as not found, so that such URLs are
not generated again. To do this, we can add a status to CrawlDatum that marks
a URL as invalid, and set this status during fetch.
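
A minimal sketch of the proposed check, not the actual Nutch code. The
FetchState enum, the TERMINAL set, and the simplified shouldFetch signature
are hypothetical stand-ins for illustration only; the real method works on a
CrawlDatum, and the status the issue proposes does not exist yet.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical fetch states; NOT the real CrawlDatum status constants. */
enum FetchState { UNFETCHED, FETCHED, NOT_FOUND, DENIED, GONE, MOVED_PERMANENTLY, FILTERED }

final class ShouldFetchSketch {
    // Assumption: these are the states the issue wants excluded from generation.
    static final Set<FetchState> TERMINAL = EnumSet.of(
        FetchState.NOT_FOUND, FetchState.DENIED, FetchState.GONE,
        FetchState.MOVED_PERMANENTLY, FetchState.FILTERED);

    /**
     * Simplified analogue of shouldFetch: reject terminal states first,
     * then fall back to a plain "is the URL due yet" time comparison.
     */
    static boolean shouldFetch(FetchState state, long nextFetchTime, long curTime) {
        if (TERMINAL.contains(state)) {
            return false; // never generate these URLs again
        }
        return nextFetchTime <= curTime;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        // A due, unfetched URL is eligible; a due, not-found URL is not.
        System.out.println(shouldFetch(FetchState.UNFETCHED, now - 1, now));
        System.out.println(shouldFetch(FetchState.NOT_FOUND, now - 1, now));
    }
}
```

Checking the state before the time comparison means a URL marked invalid is
skipped no matter how long ago it was last fetched. Whether such URLs should
ever be retried is left open by the issue.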