[
https://issues.apache.org/jira/browse/NUTCH-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432468#comment-15432468
]
Hudson commented on NUTCH-2164:
-------------------------------
FAILURE: Integrated in Jenkins build Nutch-trunk #3393 (See
[https://builds.apache.org/job/Nutch-trunk/3393/])
NUTCH-2164 NUTCH-2242 Inconsistent 'Modified Time' in crawl db / (snagel: rev
70622c3e18cee879f5a38d895f68dd0be69461e1)
* (edit) src/java/org/apache/nutch/crawl/DefaultFetchSchedule.java
* (edit) src/java/org/apache/nutch/protocol/ProtocolOutput.java
* (edit) src/java/org/apache/nutch/crawl/AdaptiveFetchSchedule.java
* (edit) src/test/org/apache/nutch/crawl/TestCrawlDbStates.java
> Inconsistent 'Modified Time' in crawl db
> ----------------------------------------
>
> Key: NUTCH-2164
> URL: https://issues.apache.org/jira/browse/NUTCH-2164
> Project: Nutch
> Issue Type: Improvement
> Components: crawldb, fetcher
> Affects Versions: 1.11
> Reporter: Thamme Gowda
> Priority: Minor
> Fix For: 1.13
>
>
> The 'Modified time' in the crawldb is invalid: it is set to
> (0 - timezone difference).
> *How to verify/reproduce:*
> Run 'nutch readdb /path/to/crawldb -dump yy' and then inspect the contents
> of 'yy'.
> The following improvements could be made:
> 1. Set modified time by DefaultFetchSchedule
> 2. Set ProtocolStatus.lastModified if modified time is available in protocol
> response headers
> This issue is also discussed on the dev mailing list:
> http://www.mail-archive.com/[email protected]/msg19803.html#
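For illustration, here is a minimal Java sketch of the two points in the description above. It assumes that "(0 - timezone difference)" means an unset modified time of 0 ms since the epoch being printed in the local time zone, which is an interpretation and not something confirmed in the issue. The class ModifiedTimeSketch and the helpers render and parseLastModified are hypothetical names that use only java.time; they are not part of the Nutch API and do not reflect the committed fix.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class ModifiedTimeSketch {

    // Hypothetical helper: render a millisecond timestamp in the given zone.
    // An unset value of 0 shows up as the epoch shifted by the zone offset.
    static String render(long epochMillis, ZoneId zone) {
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME
                .format(Instant.ofEpochMilli(epochMillis).atZone(zone));
    }

    // Hypothetical helper: convert an HTTP Last-Modified header value
    // (RFC 1123 date) to epoch milliseconds. Returns 0 when the header is
    // missing or unparseable, i.e. "modified time unknown".
    static long parseLastModified(String headerValue) {
        if (headerValue == null) {
            return 0L;
        }
        try {
            return ZonedDateTime
                    .parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant()
                    .toEpochMilli();
        } catch (DateTimeParseException e) {
            return 0L;
        }
    }

    public static void main(String[] args) {
        // The same unset value (0) rendered in UTC and in a UTC-8 zone.
        System.out.println(render(0L, ZoneOffset.UTC));
        System.out.println(render(0L, ZoneOffset.ofHours(-8)));

        // A modified time taken from a response header instead of left at 0.
        long modified = parseLastModified("Wed, 24 Aug 2016 10:00:00 GMT");
        System.out.println(render(modified, ZoneOffset.UTC));
    }
}
```

Under that assumption, a dump produced on a machine west of UTC would show a date on 1969-12-31 for every record whose modified time was never set, while a value parsed from the response header would be independent of the local zone.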
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)