[ https://issues.apache.org/jira/browse/NUTCH-2242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16239112#comment-16239112 ]
ASF GitHub Bot commented on NUTCH-2242:
---------------------------------------
Omkar20895 commented on issue #238: NUTCH-2242 Injector to stop if job fails to
avoid loss of CrawlDb
URL: https://github.com/apache/nutch/pull/238#issuecomment-341913968
Closing this PR: a typo in the commit message assigned it to NUTCH-2242
instead of NUTCH-2442. Apologies.
> lastModified not always set
> ---------------------------
>
> Key: NUTCH-2242
> URL: https://issues.apache.org/jira/browse/NUTCH-2242
> Project: Nutch
> Issue Type: Bug
> Components: crawldb
> Affects Versions: 1.11
> Reporter: Jurian Broertjes
> Priority: Minor
> Fix For: 1.13
>
> Attachments: NUTCH-2242.patch
>
>
> I observed two issues:
> - When using the DefaultFetchSchedule, CrawlDatum's modifiedTime field is not
> updated on the first successful fetch.
> - When a document modification is detected (by protocol or by signature), the
> modifiedTime field isn't updated either.
> I can provide a patch later today.
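The two cases in the report can be illustrated with a minimal, self-contained
sketch of the behaviour the reporter appears to expect. This is not the attached
NUTCH-2242.patch and it does not use Nutch's classes: the class name
ModifiedTimeSketch, the method nextModifiedTime and all of its parameters are
hypothetical, and falling back to the fetch time when no modification time is
reported is an assumption made only for illustration.

```java
// Hypothetical sketch; not the NUTCH-2242 patch and not Nutch's API.
final class ModifiedTimeSketch {

    private ModifiedTimeSketch() {}

    /**
     * Picks the modified time to store after a successful fetch.
     *
     * @param prevModifiedTime     value stored so far, 0 if never set
     * @param fetchTime            time of the current fetch, in milliseconds
     * @param reportedModifiedTime modification time reported for the document,
     *                             0 if unknown (assumed convention)
     * @param modified             true if a change was detected, whether by
     *                             protocol or by signature
     */
    static long nextModifiedTime(long prevModifiedTime, long fetchTime,
                                 long reportedModifiedTime, boolean modified) {
        boolean firstFetch = prevModifiedTime <= 0;
        if (firstFetch || modified) {
            // Case 1 (first successful fetch) and case 2 (change detected):
            // both should update the stored value.
            return reportedModifiedTime > 0 ? reportedModifiedTime : fetchTime;
        }
        // Unchanged document that already has a modified time: keep it.
        return prevModifiedTime;
    }

    public static void main(String[] args) {
        System.out.println(nextModifiedTime(0L, 1_000L, 0L, false));     // 1000
        System.out.println(nextModifiedTime(1_000L, 2_000L, 1_500L, true)); // 1500
        System.out.println(nextModifiedTime(1_000L, 2_000L, 0L, false)); // 1000
    }
}
```

In this sketch the stored value changes only on a first fetch or a detected
modification, which are exactly the two situations the report says are missed.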