[ https://issues.apache.org/jira/browse/NUTCH-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sebastian Nagel updated NUTCH-2164:
-----------------------------------
    Fix Version/s: 1.13

> Inconsistent 'Modified Time' in crawl db
> ----------------------------------------
>
>                 Key: NUTCH-2164
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2164
>             Project: Nutch
>          Issue Type: Improvement
>          Components: crawldb, fetcher
>    Affects Versions: 1.11
>            Reporter: Thamme Gowda N
>            Priority: Minor
>             Fix For: 1.13
>
>
> The 'Modified time' in crawldb is invalid. It is set to (0-Timezone Difference)
> *How to verify/reproduce:*
>   Run 'nutch readdb /path/to/crawldb -dump yy' and then inspect content of 'yy'
> The following improvements can be done:
> 1. Set modified time by DefaultFetchSchedule
> 2. Set ProtocolStatus.lastModified if modified time is available in protocol response headers
> This issue is also discussed in dev mailing lists:
> http://www.mail-archive.com/dev@nutch.apache.org/msg19803.html#
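
As a rough illustration of improvement 2 above, the sketch below shows one way an HTTP Last-Modified header value could be turned into epoch milliseconds. It is not taken from the Nutch code base: the class LastModifiedSketch, the method parseLastModified and the UNKNOWN sentinel are hypothetical names, and how the result would be stored in ProtocolStatus.lastModified is deliberately left out. The value is converted through an Instant, so it should not shift with the local timezone of the machine running the crawl.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class LastModifiedSketch {

    // Hypothetical sentinel for "modified time unknown" (an assumption,
    // not a constant defined by Nutch).
    static final long UNKNOWN = 0L;

    // Parses an RFC 1123 date such as "Wed, 21 Oct 2015 07:28:00 GMT"
    // and returns milliseconds since the Unix epoch (UTC).
    // Returns UNKNOWN when the header is missing or cannot be parsed.
    static long parseLastModified(String headerValue) {
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return UNKNOWN;
        }
        try {
            return ZonedDateTime
                    .parse(headerValue.trim(), DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant()      // absolute point in time, independent of local zone
                    .toEpochMilli();
        } catch (DateTimeParseException e) {
            return UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // Header present: prints the parsed epoch milliseconds.
        System.out.println(parseLastModified("Wed, 21 Oct 2015 07:28:00 GMT"));
        // Header absent: prints the UNKNOWN sentinel.
        System.out.println(parseLastModified(null));
    }
}
```

A caller would pass the raw header string from the protocol response and only record a modified time when the result differs from UNKNOWN.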