Markus Jelsma created NUTCH-1617:
------------------------------------

             Summary: IndexerMapReduce to consider latest fetchDatum
                 Key: NUTCH-1617
                 URL: https://issues.apache.org/jira/browse/NUTCH-1617
             Project: Nutch
          Issue Type: Bug
    Affects Versions: 1.7
            Reporter: Markus Jelsma
            Assignee: Markus Jelsma
             Fix For: 1.8


IndexerMapReduce can skip not_modified records and delete redirects and gone 
records, but it only considers the first incoming fetchDatum. Instead, it should 
consider only the latest fetchDatum, as determined by CrawlDatum.fetchTime.

This affects indexing of multiple segments only.
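
A minimal sketch of the intended selection, for illustration only. It does not 
use the Nutch API: FetchDatum is a hypothetical stand-in for CrawlDatum, and the 
status strings and the latest() helper are assumptions made for this example. 
It shows keeping the datum with the greatest fetch time rather than the first 
one encountered.

```java
import java.util.Arrays;
import java.util.List;

public class LatestFetchDatumSketch {

    /** Hypothetical stand-in for Nutch's CrawlDatum; only the fields needed here. */
    static final class FetchDatum {
        final String status;   // illustrative status label, not a Nutch constant
        final long fetchTime;  // epoch millis, analogous to CrawlDatum.fetchTime

        FetchDatum(String status, long fetchTime) {
            this.status = status;
            this.fetchTime = fetchTime;
        }
    }

    /**
     * Returns the datum with the greatest fetchTime, or null if there are none.
     * On equal fetch times the earlier-seen datum is kept.
     */
    static FetchDatum latest(Iterable<FetchDatum> values) {
        FetchDatum latest = null;
        for (FetchDatum d : values) {
            if (latest == null || d.fetchTime > latest.fetchTime) {
                latest = d;
            }
        }
        return latest;
    }

    public static void main(String[] args) {
        // Three fetch records for the same URL, as if read from three segments.
        List<FetchDatum> values = Arrays.asList(
            new FetchDatum("fetch_success", 1000L),
            new FetchDatum("fetch_gone", 3000L),
            new FetchDatum("fetch_notmodified", 2000L));

        // Taking the first record would select "fetch_success";
        // selecting by fetch time yields the most recent state instead.
        System.out.println(latest(values).status); // prints "fetch_gone"
    }
}
```

In an actual Hadoop reducer the value objects are typically reused between 
iterations, so a real change would probably need to copy the datum it keeps 
rather than hold a reference to it.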

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
