Indexer not to reindex unmodified docs
--------------------------------------

                 Key: NUTCH-1322
                 URL: https://issues.apache.org/jira/browse/NUTCH-1322
             Project: Nutch
          Issue Type: Improvement
          Components: indexer
    Affects Versions: 1.4
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma


IndexerMapReduce already tries to skip pages whose fetch status is set to
unmodified. This does not always work, however: some documents do not carry
that fetch status even though their content has not changed at all.

The indexer should offer an option to skip reindexing these pages as well.
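
A minimal sketch of what such an optional check could look like. This is not
the actual Nutch code or a proposed patch: the class UnmodifiedFilter, the
FetchStatus enum, the skipUnmodified flag and the idea of comparing content
signatures are all assumptions made for illustration. The issue itself does
not say how an unmodified document would be detected.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Nutch. */
final class UnmodifiedFilter {

    /** Hypothetical stand-in for a page's fetch status. */
    enum FetchStatus { SUCCESS, NOT_MODIFIED }

    /**
     * Decides whether a page can be left out of the index update.
     *
     * @param status            fetch status reported for the page
     * @param previousSignature content signature from the previous fetch, or null
     * @param currentSignature  content signature from the current fetch, or null
     * @param skipUnmodified    the optional behaviour: also skip pages whose
     *                          signature did not change (assumed option)
     */
    static boolean shouldSkip(FetchStatus status,
                              byte[] previousSignature,
                              byte[] currentSignature,
                              boolean skipUnmodified) {
        // Existing behaviour described above: the fetch status says unmodified.
        if (status == FetchStatus.NOT_MODIFIED) {
            return true;
        }
        // The extra check is optional, so do nothing more when it is disabled.
        if (!skipUnmodified) {
            return false;
        }
        // Assumption: equal signatures mean the content did not change.
        // A missing signature is treated as "unknown", so the page is indexed.
        return previousSignature != null
                && currentSignature != null
                && Arrays.equals(previousSignature, currentSignature);
    }
}
```

With the option disabled, the sketch behaves like the status-only check, so
existing setups would be unaffected.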

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
