[ https://issues.apache.org/jira/browse/NUTCH-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516618 ]
Emmanuel Joke commented on NUTCH-532:
-------------------------------------

res.getFetchTime() - Math.round(res.getFetchInterval() * 1000d); always gives the last fetch time, even with AdaptiveFetchSchedule.

** By the way, I checked the AdaptiveFetchSchedule code and it seems to have the same conversion error (Math.round(1000.0f * datum.getFetchInterval())). I will open a new JIRA for this. **

Regarding your suggestion to change the CrawlDatum object, I'm wondering about a few things:
- Why do we keep the fetch interval as a float? Storing it as a long would avoid this kind of conversion.
- Why do we store the fetch interval in seconds? It seems we always use it in milliseconds, so we could read it from the config files in seconds and store it in the CrawlDatum in milliseconds.
- Perhaps we could also add a new method (setFetchTimeBasedOnInterval(time)) instead of repeating datum.setFetchTime(fetchTime + Math.round(datum.getFetchInterval() * 1000.0d)) in different classes. What do you think?

Depending on your feedback, we could create another JIRA to improve the CrawlDatum object.

> CrawlDbMerger: wrong computation of last fetch time
> ---------------------------------------------------
>
>                 Key: NUTCH-532
>                 URL: https://issues.apache.org/jira/browse/NUTCH-532
>             Project: Nutch
>          Issue Type: Bug
>            Reporter: Emmanuel Joke
>            Assignee: Emmanuel Joke
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-532.patch
>
>
> CrawlDbMerger.reduce analyses the last fetch time of each record and keeps the
> more recent record.
> This comparison is based on a FetchInterval in days: resTime =
> res.getFetchTime() - Math.round(res.getFetchInterval() * 3600 * 24 * 1000);
> It was not really noticeable, as the Math.round method returns
> Integer.MAX_VALUE, i.e. about 25 days in milliseconds.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
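The conversion error discussed in the comment can be reproduced outside Nutch. The sketch below is only an illustration, not Nutch code: the Datum class is a hypothetical stand-in for CrawlDatum with just the two fields mentioned in the thread, and setFetchTimeBasedOnInterval is the helper proposed in the comment, written here under the assumption that the interval is a float in seconds. It relies on the fact that Math.round(float) returns an int and saturates at Integer.MAX_VALUE, whereas Math.round(double) returns a long.

```java
final class FetchIntervalRounding {

    // Hypothetical stand-in for CrawlDatum; not the real Nutch class.
    static final class Datum {
        private long fetchTime;            // milliseconds since the epoch
        private final float fetchInterval; // seconds (assumption from the thread)

        Datum(long fetchTime, float fetchInterval) {
            this.fetchTime = fetchTime;
            this.fetchInterval = fetchInterval;
        }

        long getFetchTime() {
            return fetchTime;
        }

        float getFetchInterval() {
            return fetchInterval;
        }

        // Helper proposed in the comment: one place for the conversion.
        void setFetchTimeBasedOnInterval(long time) {
            // float * double is evaluated as double, so Math.round returns a long.
            fetchTime = time + Math.round(fetchInterval * 1000.0d);
        }
    }

    // float * float stays float, so Math.round(float) is chosen and returns an
    // int, which saturates at Integer.MAX_VALUE for large intervals.
    static long intervalMillisFloat(float seconds) {
        return Math.round(1000.0f * seconds);
    }

    // Multiplying by a double literal promotes the product to double,
    // so Math.round(double) is chosen and returns a long.
    static long intervalMillisDouble(float seconds) {
        return Math.round(seconds * 1000.0d);
    }

    public static void main(String[] args) {
        float thirtyDays = 30f * 24 * 3600; // 2,592,000 seconds

        System.out.println("float arithmetic:  " + intervalMillisFloat(thirtyDays));
        System.out.println("double arithmetic: " + intervalMillisDouble(thirtyDays));

        Datum datum = new Datum(0L, thirtyDays);
        datum.setFetchTimeBasedOnInterval(1000L);
        System.out.println("next fetch time:   " + datum.getFetchTime());
    }
}
```

With a 30-day interval the float version is capped at Integer.MAX_VALUE milliseconds (a little under 25 days), while the double version yields the full 2,592,000,000 ms. Storing the interval as a long in milliseconds, as suggested in the comment, would make the conversion unnecessary.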
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers