[
https://issues.apache.org/jira/browse/NUTCH-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15279942#comment-15279942
]
Sebastian Nagel commented on NUTCH-2242:
----------------------------------------
[~markus17]: Sorry, I didn't upload a final patch, simply because the solution
on GitHub (see
[diff|https://github.com/apache/nutch/compare/master...sebastian-nagel:NUTCH-2164])
has not been fully tested yet. I'll prepare a final patch / pull request.
[~jurian]: Setting the modified time in CrawlDb is already done by
AdaptiveFetchSchedule and (now) by DefaultFetchSchedule, so it does not really
make sense to do it a second time. Also, if done at this place, it would
overwrite a modified time that was determined elsewhere, e.g., by a signature
comparison.
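
To illustrate the point about setting the value twice, here is a minimal, self-contained sketch. It is not the actual Nutch code: the class ModifiedTimeSketch, the Datum and Change types, and both methods are hypothetical stand-ins, and the rule "fall back to the fetch time if no modified time was detected" is an assumption made only for the example.

```java
// Hypothetical, simplified model; these are NOT the real Nutch classes.
public class ModifiedTimeSketch {

    /** Stand-in for the time fields of a CrawlDb entry (assumption). */
    static final class Datum {
        long fetchTime;
        long modifiedTime; // 0 means "never set"
    }

    /** Outcome of the protocol- or signature-based change detection. */
    enum Change { UNKNOWN, MODIFIED, NOT_MODIFIED }

    /** The single place where the modified time is decided (the schedule). */
    static void applySchedule(Datum datum, long fetchTime,
                              long detectedModifiedTime, Change change) {
        datum.fetchTime = fetchTime;
        if (change == Change.NOT_MODIFIED && datum.modifiedTime > 0) {
            return; // unchanged document: keep the recorded modified time
        }
        // Prefer a detected modified time; otherwise fall back to fetch time.
        datum.modifiedTime =
            detectedModifiedTime > 0 ? detectedModifiedTime : fetchTime;
    }

    /** A second, unconditional write made somewhere else in the update. */
    static void naiveSecondWrite(Datum datum, long fetchTime) {
        datum.modifiedTime = fetchTime; // discards whatever was chosen before
    }

    public static void main(String[] args) {
        Datum d = new Datum();

        // First successful fetch, nothing detected: fetch time is used.
        applySchedule(d, 1000L, 0L, Change.UNKNOWN);
        System.out.println("after first fetch:  " + d.modifiedTime);

        // Modification detected with its own timestamp: that one is kept.
        applySchedule(d, 2000L, 1500L, Change.MODIFIED);
        System.out.println("after modification: " + d.modifiedTime);

        // The redundant write replaces the detected time with the fetch time.
        naiveSecondWrite(d, 2000L);
        System.out.println("after second write: " + d.modifiedTime);
    }
}
```

In this sketch the schedule alone covers both reported cases (first fetch and detected modification), while the extra write only loses information.
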
> lastModified not always set
> ---------------------------
>
> Key: NUTCH-2242
> URL: https://issues.apache.org/jira/browse/NUTCH-2242
> Project: Nutch
> Issue Type: Bug
> Components: crawldb
> Affects Versions: 1.11
> Reporter: Jurian Broertjes
> Priority: Minor
> Fix For: 1.12
>
> Attachments: NUTCH-2242.patch
>
>
> I observed two issues:
> - When using the DefaultFetchSchedule, CrawlDatum's modifiedTime field is not
> updated on the first successful fetch.
> - When a document modification is detected (protocol- or signature-wise), the
> modifiedTime isn't updated.
> I can provide a patch later today.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)