Indexer not to reindex unmodified docs
--------------------------------------
Key: NUTCH-1322
URL: https://issues.apache.org/jira/browse/NUTCH-1322
Project: Nutch
Issue Type: Improvement
Components: indexer
Affects Versions: 1.4
Reporter: Markus Jelsma
Assignee: Markus Jelsma
IndexerMapReduce already attempts to skip pages whose fetch status is set to
unmodified. This does not always work, however: some documents lack that fetch
status even though they have not actually changed.
The indexer should optionally be able to skip reindexing these pages as well.
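The issue does not say how such pages would be detected. As a rough illustration only, the sketch below assumes that a content digest from the previous crawl is available and compares it with the digest of the newly fetched content. Every name in it (ReindexFilterSketch, FetchStatus, signature, shouldIndex, the skipUnmodified flag) is hypothetical and is not part of Nutch's API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch: none of these names come from Nutch.
class ReindexFilterSketch {

    // Stand-in for the fetch status recorded for a page (assumed, simplified).
    enum FetchStatus { FETCHED, NOT_MODIFIED }

    // Computes a SHA-256 digest of the page content to use as its signature.
    static byte[] signature(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return digest.digest(content.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Decides whether a page should be sent to the index.
    // skipUnmodified models the optional behaviour requested in the issue.
    static boolean shouldIndex(boolean skipUnmodified,
                               FetchStatus status,
                               byte[] previousSignature,
                               byte[] currentSignature) {
        // Existing behaviour: pages flagged as unmodified are not indexed.
        if (status == FetchStatus.NOT_MODIFIED) {
            return false;
        }
        // Optional behaviour: also skip pages whose content digest is unchanged,
        // even though their fetch status does not say so.
        if (skipUnmodified
                && previousSignature != null
                && Arrays.equals(previousSignature, currentSignature)) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] before = signature("same page body");
        byte[] after = signature("same page body");
        System.out.println(shouldIndex(true, FetchStatus.FETCHED, before, after));
        System.out.println(shouldIndex(false, FetchStatus.FETCHED, before, after));
    }
}
```

With the flag off, the sketch falls back to the status check alone, so the described current behaviour is unchanged unless the option is enabled.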