[ https://issues.apache.org/jira/browse/NUTCH-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Emmanuel Joke updated NUTCH-532:
--------------------------------

    Attachment: NUTCH-532_v4.patch

I updated the code following Andrzej's comments. I've also updated AbstractFetchSchedule to handle the old maxInterval property.

I noticed that you have removed the old property db.max.fetch.interval from nutch-default.xml. However, the old property db.default.fetch.interval is still in nutch-default.xml. I don't see the point of keeping it. Why don't we remove it from the file?

> CrawlDbMerger: wrong computation of last fetch time
> ---------------------------------------------------
>
>                 Key: NUTCH-532
>                 URL: https://issues.apache.org/jira/browse/NUTCH-532
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: Emmanuel Joke
>            Assignee: Emmanuel Joke
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-532.patch, NUTCH-532_v2.patch, NUTCH-532_v3.patch, NUTCH-532_v4.patch
>
>
> CrawlDbMerger.reduce analyses the last fetch time of each record and keeps the more recent record.
> This comparison is based on a FetchInterval in days:
>     resTime = res.getFetchTime() - Math.round(res.getFetchInterval() * 3600 * 24 * 1000);
> The problem was not really noticeable, as Math.round returns Integer.MAX_VALUE milliseconds, i.e. about 25 days.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers
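
The saturation mentioned in the issue description can be reproduced in isolation. The following is a minimal sketch, not Nutch code: the class and method names are hypothetical, and the "wide" variant is only one possible way to avoid the clamp, not necessarily what the attached patch does. It assumes getFetchInterval() returns a float number of days, as the quoted expression suggests.

```java
public class FetchIntervalOverflow {
    static final long MS_PER_DAY = 24L * 3600L * 1000L;

    // Same arithmetic as the expression quoted in the issue: the product is
    // a float, so Math.round(float) returns an int and clamps the result at
    // Integer.MAX_VALUE (2147483647 ms, roughly 24.9 days).
    static long intervalMsAsInIssue(float intervalDays) {
        return Math.round(intervalDays * 3600 * 24 * 1000);
    }

    // Hypothetical alternative: widen to double first, so that
    // Math.round(double) returns a long and nothing is clamped.
    static long intervalMsWide(float intervalDays) {
        return Math.round((double) intervalDays * MS_PER_DAY);
    }

    public static void main(String[] args) {
        float thirtyDays = 30f;
        System.out.println(intervalMsAsInIssue(thirtyDays)); // 2147483647
        System.out.println(intervalMsWide(thirtyDays));      // 2592000000
    }
}
```

For intervals short enough that the product stays below Integer.MAX_VALUE, such as a single day, both methods return the same value, which fits the report's remark that the problem was not easy to notice.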