Generator should not generate filter and not found and denied and gone and 
permanently moved pages
--------------------------------------------------------------------------------------------------

                 Key: NUTCH-1288
                 URL: https://issues.apache.org/jira/browse/NUTCH-1288
             Project: Nutch
          Issue Type: Bug
          Components: fetcher, generator
    Affects Versions: 1.4
            Reporter: behnam nikbakht


The Generator should not generate filtered, not found, denied, gone, or 
permanently moved pages.
In the shouldFetch method of AbstractFetchSchedule, the CrawlDatum must be 
checked against special fetch states such as not found, so that these URLs are 
not generated again.
To do this, we can add a status to CrawlDatum that indicates invalid URLs, and 
set this status during the fetch.
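
A minimal sketch of the idea, not a patch against Nutch: the FetchState enum, 
the Datum class, and the INVALID state are hypothetical stand-ins for 
CrawlDatum and its status constants. The fetch-time check is an assumption 
about what the existing schedule does.

```java
import java.util.EnumSet;
import java.util.Set;

public class InvalidUrlSketch {

    // Hypothetical fetch outcomes; these are NOT Nutch's CrawlDatum constants.
    enum FetchState {
        UNFETCHED, FETCHED, FILTERED, NOT_FOUND, DENIED, GONE, MOVED_PERMANENTLY, INVALID
    }

    // Outcomes the issue proposes to treat as "do not generate again".
    static final Set<FetchState> TERMINAL = EnumSet.of(
            FetchState.FILTERED, FetchState.NOT_FOUND, FetchState.DENIED,
            FetchState.GONE, FetchState.MOVED_PERMANENTLY);

    // Hypothetical stand-in for a CrawlDatum: a status plus the next due time.
    static final class Datum {
        final FetchState state;
        final long fetchTime;

        Datum(FetchState state, long fetchTime) {
            this.state = state;
            this.fetchTime = fetchTime;
        }
    }

    // Fetch side: collapse every terminal outcome into the single INVALID status.
    static FetchState statusAfterFetch(FetchState outcome) {
        return TERMINAL.contains(outcome) ? FetchState.INVALID : outcome;
    }

    // Schedule side: never select an INVALID entry; otherwise fetch when due.
    static boolean shouldFetch(Datum datum, long curTime) {
        if (datum.state == FetchState.INVALID) {
            return false;
        }
        return datum.fetchTime <= curTime; // assumed "is it due yet" check
    }

    public static void main(String[] args) {
        long now = 1000L;
        Datum missing = new Datum(statusAfterFetch(FetchState.NOT_FOUND), 0L);
        Datum normal = new Datum(statusAfterFetch(FetchState.FETCHED), 0L);
        System.out.println("not found page selected: " + shouldFetch(missing, now));
        System.out.println("fetched page selected:   " + shouldFetch(normal, now));
    }
}
```

Collapsing the outcomes into one status keeps the check in shouldFetch to a 
single comparison, at the cost of losing the reason a URL was marked invalid.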

