Markus Jelsma created NUTCH-1617:
------------------------------------
Summary: IndexerMapReduce to consider latest fetchDatum
Key: NUTCH-1617
URL: https://issues.apache.org/jira/browse/NUTCH-1617
Project: Nutch
Issue Type: Bug
Affects Versions: 1.7
Reporter: Markus Jelsma
Assignee: Markus Jelsma
Fix For: 1.8
IndexerMapReduce can skip not_modified records, or delete redirects and gone
records, but it only considers the first incoming fetchDatum. Instead, it should
consider only the latest fetchDatum, as determined by CrawlDatum.fetchTime.
This affects only the indexing of multiple segments.
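
To illustrate the intended behaviour, here is a minimal, self-contained sketch of picking the latest datum by fetch time instead of the first one seen. It is not the actual patch: the Datum class is a hypothetical stand-in for CrawlDatum (reduced to a status label and a fetch time), the status strings are illustrative only, and latest() is an assumed helper name.

```java
import java.util.Arrays;
import java.util.List;

public class LatestFetchDatum {

    // Hypothetical stand-in for CrawlDatum; only the fields needed here.
    static final class Datum {
        final String status;   // illustrative label, e.g. "fetch_gone"
        final long fetchTime;  // fetch time in epoch milliseconds

        Datum(String status, long fetchTime) {
            this.status = status;
            this.fetchTime = fetchTime;
        }
    }

    // Returns the datum with the greatest fetchTime, or null if there is none.
    // On equal fetch times the first one encountered is kept.
    static Datum latest(Iterable<Datum> values) {
        Datum latest = null;
        for (Datum d : values) {
            if (latest == null || d.fetchTime > latest.fetchTime) {
                // Note: in a real Hadoop reducer the value object may be reused
                // between iterations, so a copy would be needed at this point.
                latest = d;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        // Same URL seen in three segments, arriving in arbitrary order.
        List<Datum> values = Arrays.asList(
            new Datum("fetch_success", 1000L),
            new Datum("fetch_gone", 3000L),
            new Datum("fetch_notmodified", 2000L));

        Datum d = latest(values);
        // The skip/delete decision would then be based on this datum only.
        System.out.println(d.status + " @ " + d.fetchTime);
    }
}
```

With a single segment there is only one fetchDatum per URL, so the first and the latest coincide; the difference only shows up when several segments are indexed together.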